Memory address collision detection of ordered parallel threads with bloom filters
    Granted invention patent (in force)

    Publication number: US09542193B2

    Publication date: 2017-01-10

    Application number: US13730704

    Filing date: 2012-12-28

    Abstract: A semiconductor chip is described having a load collision detection circuit comprising a first bloom filter circuit. The semiconductor chip has a store collision detection circuit comprising a second bloom filter circuit. The semiconductor chip has one or more processing units capable of executing ordered parallel threads coupled to the load collision detection circuit and the store collision detection circuit. The load collision detection circuit and the store collision detection circuit are to detect younger stores for load operations of said threads and younger loads for store operations of said threads.
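
The detection scheme in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the `BloomFilter` and `CollisionDetector` classes, the one-filter-per-thread layout, the filter size, and the SHA-256 based hashing are all hypothetical choices made for illustration. Thread index 0 is taken to be the oldest thread, so "younger" means a higher index.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter over memory addresses (sizes are illustrative)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit vector stored in a Python int

    def _positions(self, address):
        # Derive num_hashes bit positions from the address.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{address}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, address):
        for pos in self._positions(address):
            self.bits |= 1 << pos

    def might_contain(self, address):
        # True may be a false positive; False is always correct.
        return all((self.bits >> pos) & 1 for pos in self._positions(address))


class CollisionDetector:
    """Assumed layout: one load filter and one store filter per ordered thread."""

    def __init__(self, num_threads):
        self.load_filters = [BloomFilter() for _ in range(num_threads)]
        self.store_filters = [BloomFilter() for _ in range(num_threads)]

    @staticmethod
    def _younger_hit(filters, order, address):
        # Check only the filters of threads younger than `order`.
        return any(f.might_contain(address) for f in filters[order + 1:])

    def load(self, order, address):
        """Record a load; report whether a younger store may have hit the address."""
        collision = self._younger_hit(self.store_filters, order, address)
        self.load_filters[order].add(address)
        return collision

    def store(self, order, address):
        """Record a store; report whether a younger load may have hit the address."""
        collision = self._younger_hit(self.load_filters, order, address)
        self.store_filters[order].add(address)
        return collision
```

In this model a reported collision may be a false positive, since Bloom filters never miss an inserted address but can report one that was never inserted. A real design would therefore treat a hit as a reason to take a conservative recovery path rather than as proof of a conflict.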

Apparatus and method for a hybrid latency-throughput processor
    Granted invention patent (in force)

    Publication number: US09417873B2

    Publication date: 2016-08-16

    Application number: US13730055

    Filing date: 2012-12-28

    Abstract: An apparatus and method are described for executing both latency-optimized execution logic and throughput-optimized execution logic on a processing device. For example, a processor according to one embodiment comprises: latency-optimized execution logic to execute a first type of program code; throughput-optimized execution logic to execute a second type of program code, wherein the first type of program code and the second type of program code are designed for the same instruction set architecture; logic to identify the first type of program code and the second type of program code within a process and to distribute the first type of program code for execution on the latency-optimized execution logic and the second type of program code for execution on the throughput-optimized execution logic.

AN APPARATUS TO REDUCE IDLE LINK POWER IN A PLATFORM
    Invention application (under examination, published)

    Publication number: US20160109925A1

    Publication date: 2016-04-21

    Application number: US14978340

    Filing date: 2015-12-22

    Abstract: A system on a chip (SoC) is provided including processing cores and a root complex. The transaction requests are communicated between a root port of the root complex and a device, the root port including electrical idle (EI) exit detect circuitry and a reference clock source. The root port supports a first link state, in which the reference clock source and EI exit detect circuitry of the root port are disabled but a common mode voltage is maintained, and a second link state, in which the reference clock source and EI exit detect circuitry are disabled and the common mode voltage is not maintained. The root port transitions to the first link state based on a service latency requirement of the device being less than a threshold and to the second link state based on the service latency requirement being greater than or equal to the threshold.
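
The threshold rule in the last sentence of this abstract can be written out as a short decision function. This is a minimal sketch: the `LinkState` enum, the function name, and the microsecond units are assumptions for illustration, and the abstract does not give a threshold value.

```python
from enum import Enum


class LinkState(Enum):
    # Both states disable the reference clock source and EI exit detect circuitry.
    FIRST = "common mode voltage maintained"
    SECOND = "common mode voltage not maintained"


def select_link_state(service_latency_us, threshold_us):
    """Pick the idle link state from the device's service latency requirement.

    Units (microseconds) are an assumption; the abstract only states the comparison.
    """
    if service_latency_us < threshold_us:
        return LinkState.FIRST
    return LinkState.SECOND  # requirement is greater than or equal to the threshold
```

A device with a tight latency requirement lands in the first state, and a device whose requirement meets or exceeds the threshold lands in the second state.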

    Apparatus and method for a hybrid latency-throughput processor

    Publication number: US10664284B2

    Publication date: 2020-05-26

    Application number: US16289075

    Filing date: 2019-02-28

    Abstract: An apparatus and method are described for executing both latency-optimized execution logic and throughput-optimized execution logic on a processing device. For example, a processor according to one embodiment comprises: latency-optimized execution logic to execute a first type of program code; throughput-optimized execution logic to execute a second type of program code, wherein the first type of program code and the second type of program code are designed for the same instruction set architecture; logic to identify the first type of program code and the second type of program code within a process and to distribute the first type of program code for execution on the latency-optimized execution logic and the second type of program code for execution on the throughput-optimized execution logic.

    Instruction and logic for detecting numeric accumulation error

    Publication number: US10146533B2

    Publication date: 2018-12-04

    Application number: US15280564

    Filing date: 2016-09-29

    Abstract: A processor includes circuitry to decode at least one instruction and an execution unit. The decoded instruction may compute a floating point result. The execution unit includes circuitry to execute the instruction to determine the floating point result, compute the amount of precision lost in a mantissa of the floating point result, compare the amount of precision lost to a numeric accumulation error precision threshold, determine whether a numeric accumulation error occurred based on the comparison, and write a value to a flag. The amount of precision lost corresponds to a plurality of bits lost in the mantissa of the floating point result. The value to be written to the flag may be based on the determination that the numeric accumulation error occurred. The flag may be for notification that the numeric accumulation error occurred.
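
The flag logic described here can be approximated in software for a single floating point addition. This is a rough sketch and not the claimed instruction: it assumes IEEE 754 double precision, uses the exponent gap between the two operands as an upper-bound estimate of the mantissa bits lost from the smaller operand, and assumes the flag is set when that estimate exceeds the threshold. The function name and return shape are hypothetical.

```python
import math

MANTISSA_BITS = 53  # significand width of an IEEE 754 double


def add_with_error_flag(a, b, threshold_bits):
    """Add two floats and estimate how many mantissa bits of the smaller one are lost.

    Returns (result, bits_lost, flag). bits_lost is an upper-bound estimate
    taken from the alignment shift between the operands, not an exact count.
    """
    result = a + b
    small, large = sorted((abs(a), abs(b)))
    if small == 0.0:
        return result, 0, False  # adding zero loses nothing
    # frexp returns (mantissa, exponent); the exponent gap is the alignment shift.
    shift = math.frexp(large)[1] - math.frexp(small)[1]
    bits_lost = min(MANTISSA_BITS, shift)
    return result, bits_lost, bits_lost > threshold_bits
```

For example, adding `1.0` to `1e16` shifts the smaller operand entirely out of the mantissa, so the estimate saturates at 53 bits and the flag is raised for any threshold below that.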

    Memory address collision detection of ordered parallel threads with bloom filters

    Publication number: US10101999B2

    Publication date: 2018-10-16

    Application number: US15403101

    Filing date: 2017-01-10

    Abstract: A semiconductor chip is described having a load collision detection circuit comprising a first bloom filter circuit. The semiconductor chip has a store collision detection circuit comprising a second bloom filter circuit. The semiconductor chip has one or more processing units capable of executing ordered parallel threads coupled to the load collision detection circuit and the store collision detection circuit. The load collision detection circuit and the store collision detection circuit are to detect younger stores for load operations of said threads and younger loads for store operations of said threads.

    Apparatus and method for enforcement of reserved bits

    Publication number: US09934090B2

    Publication date: 2018-04-03

    Application number: US14979316

    Filing date: 2015-12-22

    Abstract: An apparatus and method are described for enforcement of reserved bits. For example, one embodiment of a processor comprises: a memory management unit to store a set of bits including a set of reserved bits to a system memory; reserved bit enforcement logic to generate a pseudo-random pattern in the reserved bits and an error correction code over the pseudo-random pattern prior to storing the reserved bits; the memory management unit to load the reserved bits including the pseudo-random pattern and the error correction code; the reserved bit enforcement logic to use the error correction code to determine whether the reserved bits have been modified by software; and if the reserved bits have been modified, then the processor to generate an error condition and if not modified, then the processor to continue normal execution.
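
The store-then-verify flow in this abstract can be modeled in a few lines. This is a sketch under stated assumptions: the 8-bit pattern and 8-bit code widths are hypothetical, and a truncated CRC-32 stands in for the error correction code, which the abstract does not specify. The function names are illustrative only.

```python
import secrets
import zlib

PATTERN_BITS = 8  # assumed width of the pseudo-random pattern
CODE_BITS = 8     # assumed width of the check code
CODE_MASK = (1 << CODE_BITS) - 1
PATTERN_MASK = (1 << PATTERN_BITS) - 1


def _check_code(pattern):
    # Truncated CRC-32 used here as a stand-in for the error correction code.
    return zlib.crc32(pattern.to_bytes(1, "big")) & CODE_MASK


def fill_reserved_bits():
    """Generate the reserved field: random pattern in the high bits, code in the low bits."""
    pattern = secrets.randbits(PATTERN_BITS)
    return (pattern << CODE_BITS) | _check_code(pattern)


def reserved_bits_intact(reserved):
    """Return True if the code still matches the pattern, i.e. no modification detected."""
    pattern = (reserved >> CODE_BITS) & PATTERN_MASK
    code = reserved & CODE_MASK
    return _check_code(pattern) == code
```

In this model, software that overwrites the reserved field without knowing the pattern and code is likely to fail the check on the next load, at which point the processor described in the abstract would generate an error condition instead of continuing normal execution.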

    Instruction and Logic for Detecting Numeric Accumulation Error

    Publication number: US20180088941A1

    Publication date: 2018-03-29

    Application number: US15280564

    Filing date: 2016-09-29

    Abstract: A processor includes circuitry to decode at least one instruction and an execution unit. The decoded instruction may compute a floating point result. The execution unit includes circuitry to execute the instruction to determine the floating point result, compute the amount of precision lost in a mantissa of the floating point result, compare the amount of precision lost to a numeric accumulation error precision threshold, determine whether a numeric accumulation error occurred based on the comparison, and write a value to a flag. The amount of precision lost corresponds to a plurality of bits lost in the mantissa of the floating point result. The value to be written to the flag may be based on the determination that the numeric accumulation error occurred. The flag may be for notification that the numeric accumulation error occurred.

    APPARATUS AND METHOD FOR ENFORCEMENT OF RESERVED BITS

    Publication number: US20170177439A1

    Publication date: 2017-06-22

    Application number: US14979316

    Filing date: 2015-12-22

    Abstract: An apparatus and method are described for enforcement of reserved bits. For example, one embodiment of a processor comprises: a memory management unit to store a set of bits including a set of reserved bits to a system memory; reserved bit enforcement logic to generate a pseudo-random pattern in the reserved bits and an error correction code over the pseudo-random pattern prior to storing the reserved bits; the memory management unit to load the reserved bits including the pseudo-random pattern and the error correction code; the reserved bit enforcement logic to use the error correction code to determine whether the reserved bits have been modified by software; and if the reserved bits have been modified, then the processor to generate an error condition and if not modified, then the processor to continue normal execution.
