TABLE LOOKUP USING SIMD INSTRUCTIONS
    1.
    Invention application
    Status: under examination (published)

    Publication number: US20170046156A1

    Publication date: 2017-02-16

    Application number: US14826199

    Filing date: 2015-08-14

    CPC classification number: G06F9/30036 G06F9/30003 G06F9/3004

    Abstract: Systems and methods pertain to looking up entries of a table. A processor receives one or more single instruction multiple data (SIMD) instructions, including a first SIMD instruction which specifies a first subset of indices. A first subset of table entries is looked up, using a crossbar, with the first subset of indices. A first vector output of the first SIMD instruction is based on whether the outputs of the crossbar belong to a desired subset of table entries. Similarly, second, third, and fourth SIMD instructions specify corresponding second, third, and fourth subsets of indices to look up the remaining table entries using the crossbar. The size of the crossbar is based on the number of indices in the subset of indices used to look up table entries.
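
The multi-pass lookup described in this abstract can be modeled in software. The Python sketch below is one possible reading, not the claimed hardware: the names `simd_lookup_pass` and `table_lookup`, the even split of lanes across four passes, and the use of `None` for lanes a pass does not fill are all assumptions made for illustration.

```python
def simd_lookup_pass(table, indices, lane_subset, wanted_entries):
    """Model one SIMD instruction that handles only a subset of the lanes.

    The modeled crossbar only needs one port per lane in `lane_subset`,
    which is why its size depends on the subset size, not the full vector.
    """
    out = [None] * len(indices)
    for lane in lane_subset:
        idx = indices[lane]
        # Keep the crossbar output only if it belongs to the desired entries.
        if idx in wanted_entries:
            out[lane] = table[idx]
    return out


def table_lookup(table, indices, passes=4):
    """Cover the whole index vector with `passes` smaller lookups."""
    n = len(indices)
    if n % passes != 0:
        raise ValueError("index vector length must be divisible by passes")
    lanes_per_pass = n // passes
    wanted = range(len(table))  # assumption: every table entry is desired
    result = [None] * n
    for p in range(passes):
        lanes = range(p * lanes_per_pass, (p + 1) * lanes_per_pass)
        partial = simd_lookup_pass(table, indices, lanes, wanted)
        # Merge this pass's lanes into the final vector.
        for lane in lanes:
            result[lane] = partial[lane]
    return result


squares = [x * x for x in range(16)]
looked_up = table_lookup(squares, [3, 0, 15, 7, 1, 1, 2, 14])
```

In this model each pass touches a quarter of the lanes, so the modeled crossbar is a quarter of the width a single full-vector lookup would need, at the cost of issuing four instructions.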


    SCATTER TO GATHER OPERATION
    2.
    Invention application

    Publication number: US20170371657A1

    Publication date: 2017-12-28

    Application number: US15192992

    Filing date: 2016-06-24

    Abstract: Systems and methods relate to efficient memory operations. A single instruction multiple data (SIMD) gather operation is implemented with a gather result buffer located within or in close proximity to memory, to receive or gather multiple data elements from multiple orthogonal locations in a memory, and once the gather result buffer is complete, the gathered data is transferred to a processor register. A SIMD copy operation is performed by executing two or more instructions for copying multiple data elements from multiple orthogonal source addresses to corresponding multiple destination addresses within the memory, without an intermediate copy to a processor register. Thus, the memory operations are performed in a background mode without direction by the processor.
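
The gather and copy operations in this abstract can be illustrated with a small software model. This is a minimal sketch under stated assumptions: the class name `NearMemoryGather`, its methods, and the flat Python list standing in for memory are hypothetical and are not taken from the patent.

```python
class NearMemoryGather:
    """Toy model of a gather result buffer that sits next to memory."""

    def __init__(self, memory):
        self.memory = memory  # flat list standing in for the memory array

    def gather(self, addresses):
        """Collect scattered elements, then hand them over in one transfer."""
        buffer = [None] * len(addresses)  # the gather result buffer
        for slot, addr in enumerate(addresses):
            buffer[slot] = self.memory[addr]
        # Only a complete buffer is transferred to the processor register.
        if any(value is None for value in buffer):
            raise RuntimeError("gather result buffer is incomplete")
        return list(buffer)

    def copy(self, sources, destinations):
        """Memory-to-memory copy with no intermediate processor register."""
        if len(sources) != len(destinations):
            raise ValueError("source and destination counts differ")
        for src, dst in zip(sources, destinations):
            self.memory[dst] = self.memory[src]


memory = list(range(100, 116))           # 16 words holding 100..115
unit = NearMemoryGather(memory)
register = unit.gather([0, 5, 10, 15])   # non-contiguous addresses
unit.copy([0, 1], [14, 15])
```

The point of the model is that the processor sees a single transfer per gather, and sees nothing at all during the copy, which matches the abstract's description of operations running without direction by the processor.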

    COPROCESSOR FOR OUT-OF-ORDER LOADS
    3.
    Invention application
    Status: granted

    Publication number: US20160092238A1

    Publication date: 2016-03-31

    Application number: US14499044

    Filing date: 2014-09-26

    Abstract: Systems and methods for implementing certain load instructions, such as vector load instructions, by cooperation of a main processor and a coprocessor. The load instructions which are identified by the main processor for offloading to the coprocessor are committed in the main processor without receiving corresponding load data. Post-commit, the load instructions are processed in the coprocessor, such that latencies incurred in fetching the load data are hidden from the main processor. By implementing an out-of-order load data buffer associated with an in-order instruction buffer, the coprocessor is also configured to avoid stalls due to long latencies which may be involved in fetching the load data from levels of memory hierarchy, such as L2, L3, L4 caches, main memory, etc.
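
The pairing of an in-order instruction buffer with an out-of-order load data buffer can be sketched as follows. This is an illustrative model only: the class `Coprocessor`, the integer tags that match data to instructions, and the `completed` list are assumptions introduced here, not details from the patent.

```python
from collections import deque


class Coprocessor:
    """Toy model: instructions stay in order, load data may arrive in any order."""

    def __init__(self):
        self.instr_buffer = deque()  # in-order instruction buffer (tags)
        self.load_data = {}          # out-of-order load data buffer, keyed by tag
        self.completed = []          # (tag, data) pairs in program order

    def accept(self, tag):
        """Receive a load the main processor has already committed."""
        self.instr_buffer.append(tag)

    def data_arrives(self, tag, data):
        """Store returning data regardless of order, then retire what is ready."""
        self.load_data[tag] = data
        self._drain()

    def _drain(self):
        # Retire from the head only, so completion order matches program order.
        while self.instr_buffer and self.instr_buffer[0] in self.load_data:
            tag = self.instr_buffer.popleft()
            self.completed.append((tag, self.load_data.pop(tag)))


cop = Coprocessor()
for tag in (0, 1, 2):
    cop.accept(tag)
cop.data_arrives(2, "c")        # fast hit returns first; nothing retires yet
after_first = list(cop.completed)
cop.data_arrives(0, "a")        # head is now ready and retires
after_second = list(cop.completed)
cop.data_arrives(1, "b")        # unblocks tags 1 and 2 together
```

In this model a slow load at the head of the queue does not prevent later loads from receiving their data; it only delays when they retire.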

