INSERTING A PROXY READ INSTRUCTION IN AN INSTRUCTION PIPELINE IN A PROCESSOR

    Publication No.: US20220365780A1

    Publication date: 2022-11-17

    Application No.: US16983445

    Filing date: 2020-08-03

    Abstract: Inserting a proxy read instruction in an instruction pipeline in a processor is disclosed. A scheduler circuit is configured to recognize when a produced value generated by execution of a producer instruction in the instruction pipeline will not be available through a data forwarding path to be consumed for processing of a subsequent consumer instruction. In this case, the scheduler circuit is configured to insert a proxy read instruction in the instruction pipeline to cause execution of an operation that generates the same produced value as was generated by the previous execution of the producer instruction. Thus, the produced value remains in the instruction pipeline and is again available through a data forwarding path to an earlier stage of the instruction pipeline for consumption by a consumer instruction, which may avoid a pipeline stall.
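
The scheduling decision described in this abstract can be illustrated with a toy model. This is a minimal sketch and not the patented circuit: the `FORWARD_WINDOW` constant, the `Instr` record, and the choice to place the proxy read one cycle before the consumer are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical: number of cycles a result stays reachable on the forwarding path.
FORWARD_WINDOW = 2


@dataclass
class Instr:
    name: str
    dest: Optional[str]     # register written, if any
    srcs: Tuple[str, ...]   # registers read
    issue_cycle: int


def insert_proxy_reads(program):
    """Return a schedule with proxy reads added where forwarding would miss."""
    last_write = {}  # register -> cycle in which its value was last produced
    out = []
    for ins in program:
        for reg in ins.srcs:
            produced = last_write.get(reg)
            if produced is None:
                continue  # value predates the modeled window; not tracked here
            if ins.issue_cycle - produced > FORWARD_WINDOW:
                # Re-produce the same value just before the consumer issues,
                # so it is on the forwarding path again (assumed placement).
                proxy_cycle = ins.issue_cycle - 1
                out.append(Instr(f"proxy_read {reg}", reg, (reg,), proxy_cycle))
                last_write[reg] = proxy_cycle
        out.append(ins)
        if ins.dest is not None:
            last_write[ins.dest] = ins.issue_cycle
    return out


program = [
    Instr("add", "r1", ("r2", "r3"), 0),  # producer of r1
    Instr("mul", "r4", ("r1",), 1),       # consumer inside the window
    Instr("sub", "r5", ("r1",), 6),       # consumer outside the window
]
schedule = insert_proxy_reads(program)
names = [ins.name for ins in schedule]
```

In this model only the late consumer (`sub`) triggers a proxy read; the early consumer (`mul`) is served by the original producer.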

    POWER CONTROL BASED ON PERFORMANCE MODIFICATION THROUGH PULSE MODULATION

    Publication No.: US20210096635A1

    Publication date: 2021-04-01

    Application No.: US16669898

    Filing date: 2019-10-31

    Abstract: Systems and methods for power control based on performance modification through pulse modulation include an integrated circuit (IC) that may evaluate certain limit conditions within a computing device and compare the limit conditions to corresponding predefined thresholds. When a given predefined threshold is exceeded, an overage signal may be sent to a limits management circuit within the initial IC or another IC. The limits management circuit may generate a single-bit throttle signal through a pulse modulation circuit. The single-bit throttle signal may modify internal processing of an associated processor, which in turn changes power consumption.
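
The flow in this abstract (threshold comparison, overage signal, single-bit pulse-modulated throttle) can be sketched in software. This is only an illustration under stated assumptions: the limit names, the fixed duty-cycle step, and the use of plain pulse-width modulation are hypothetical choices, not details taken from the patent.

```python
def overage_signals(readings, thresholds):
    """Compare each limit condition with its predefined threshold."""
    return {name: readings[name] > thresholds[name] for name in thresholds}


def next_duty(duty, any_over, step=0.25):
    """Raise the throttle duty cycle on overage, relax it otherwise (assumed policy)."""
    if any_over:
        return min(1.0, duty + step)
    return max(0.0, duty - step)


def throttle_bits(duty, period):
    """One period of a single-bit throttle signal; 1 means 'throttle this cycle'."""
    high = int(duty * period)
    return [1 if i < high else 0 for i in range(period)]


# Hypothetical limit conditions and thresholds.
readings = {"current_ma": 1200, "temp_c": 70}
thresholds = {"current_ma": 1000, "temp_c": 85}

over = overage_signals(readings, thresholds)
duty = next_duty(0.0, any(over.values()))
bits = throttle_bits(duty, period=8)
```

A processor consuming `bits` would skip or slow work in the cycles marked 1, which lowers its activity and therefore its power draw.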

    Power control based on performance modification through pulse modulation

    Publication No.: US11586272B2

    Publication date: 2023-02-21

    Application No.: US16669898

    Filing date: 2019-10-31

    Abstract: Systems and methods for power control based on performance modification through pulse modulation include an integrated circuit (IC) that may evaluate certain limit conditions within a computing device and compare the limit conditions to corresponding predefined thresholds. When a given predefined threshold is exceeded, an overage signal may be sent to a limits management circuit within the initial IC or another IC. The limits management circuit may generate a single-bit throttle signal through a pulse modulation circuit. The single-bit throttle signal may modify internal processing of an associated processor, which in turn changes power consumption.

    SCALABLE SINGLE-INSTRUCTION-MULTIPLE-DATA INSTRUCTIONS

    Type: Invention application

    Status: Under examination (published)

    Publication No.: US20170046168A1

    Publication date: 2017-02-16

    Application No.: US14827170

    Filing date: 2015-08-14

    Abstract: A method for executing scalable single-instruction-multiple-data (SIMD) instructions includes performing a query to determine a hardware vector length of a SIMD processor. The method also includes scaling a first instruction of the scalable SIMD instructions to a first scaled vector length to generate a first scaled instruction. The first scaled vector length is based on the hardware vector length, and the first instruction is a compiled instruction having an adaptable vector length. The method also includes adjusting a first number of iterations to be used by the SIMD processor to perform first operations associated with the first instruction based on the first scaled vector length.
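
The three steps in this abstract (query the hardware vector length, scale the instruction, adjust the iteration count) can be sketched as follows. This is a simplified model: the function names are hypothetical, and it assumes the scaled vector length simply equals the hardware vector length, which the abstract does not require.

```python
def scale_instruction(total_elements, hardware_vl):
    """Return (scaled vector length, iterations) for a length-agnostic operation."""
    scaled_vl = hardware_vl                       # assumption: scale to the full width
    iterations = -(-total_elements // scaled_vl)  # ceiling division
    return scaled_vl, iterations


def vector_add(a, b, hardware_vl):
    """Element-wise add, processed in chunks of the scaled vector length."""
    scaled_vl, iterations = scale_instruction(len(a), hardware_vl)
    out = []
    for i in range(iterations):
        lo = i * scaled_vl
        hi = min(lo + scaled_vl, len(a))  # the last chunk may be partial
        out.extend(x + y for x, y in zip(a[lo:hi], b[lo:hi]))
    return out


a = list(range(10))
b = [10] * 10
narrow = scale_instruction(len(a), hardware_vl=4)   # 4-lane hardware
wide = scale_instruction(len(a), hardware_vl=16)    # 16-lane hardware
result = vector_add(a, b, hardware_vl=4)
```

The same compiled operation runs in three iterations on the narrower hardware and in one on the wider hardware, with identical results.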


    Memory access management

    Type: Granted invention patent

    Publication No.: US11669273B2

    Publication date: 2023-06-06

    Application No.: US17166263

    Filing date: 2021-02-03

    CPC classification number: G06F3/0659 G06F3/0622 G06F3/0673

    Abstract: A device includes a scoreboard and a processor. The scoreboard includes scoreboard entries configured to store information regarding one or more uncompleted memory access operations. The scoreboard also includes a dependency matrix configured to store dependency information corresponding to the scoreboard entries. The processor is configured to retrieve a first memory access instruction that indicates a first address range of a first memory access operation, and to add an indication of the first memory access instruction to a first scoreboard entry. The processor is further configured to, based on determining that the first address range at least partially overlaps a second address range associated with a second scoreboard entry that corresponds to a second memory access instruction, set an element of the dependency matrix to have a has-dependency value indicating a dependency of the first scoreboard entry on the second scoreboard entry.
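
The scoreboard and dependency matrix described here can be modeled in a few lines. This is a minimal sketch, assuming half-open byte ranges and a fixed number of entries; the class and method names are hypothetical and the real device is hardware, not software.

```python
class Scoreboard:
    def __init__(self, size):
        # Each entry holds the (start, end) address range of an uncompleted access.
        self.entries = [None] * size
        # dep[i][j] is True when entry i depends on entry j.
        self.dep = [[False] * size for _ in range(size)]

    def add(self, start, length):
        """Record a new access and mark dependencies on overlapping older ones."""
        idx = self.entries.index(None)  # raises ValueError when the scoreboard is full
        new = (start, start + length)
        for j, other in enumerate(self.entries):
            if other is not None and new[0] < other[1] and other[0] < new[1]:
                self.dep[idx][j] = True  # the "has-dependency" value
        self.entries[idx] = new
        return idx

    def ready(self, idx):
        """An access may proceed once it depends on no outstanding entry."""
        return not any(self.dep[idx])

    def complete(self, idx):
        """Retire an access and clear every dependency on it."""
        self.entries[idx] = None
        for row in self.dep:
            row[idx] = False
        self.dep[idx] = [False] * len(self.entries)


sb = Scoreboard(4)
first = sb.add(0x100, 64)    # covers 0x100..0x13F
second = sb.add(0x120, 64)   # overlaps the first range
third = sb.add(0x200, 16)    # disjoint from both
blocked_before = sb.ready(second)
sb.complete(first)
blocked_after = sb.ready(second)
```

Storing dependencies as a matrix makes both checks cheap: readiness is a scan of one row, and retirement clears one column.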

    Inserting a proxy read instruction in an instruction pipeline in a processor

    Publication No.: US11609764B2

    Publication date: 2023-03-21

    Application No.: US16983445

    Filing date: 2020-08-03

    Abstract: Inserting a proxy read instruction in an instruction pipeline in a processor is disclosed. A scheduler circuit is configured to recognize when a produced value generated by execution of a producer instruction in the instruction pipeline will not be available through a data forwarding path to be consumed for processing of a subsequent consumer instruction. In this case, the scheduler circuit is configured to insert a proxy read instruction in the instruction pipeline to cause execution of an operation that generates the same produced value as was generated by the previous execution of the producer instruction. Thus, the produced value remains in the instruction pipeline and is again available through a data forwarding path to an earlier stage of the instruction pipeline for consumption by a consumer instruction, which may avoid a pipeline stall.

    Multi-thread power limiting via shared limit

    Publication No.: US11287872B2

    Publication date: 2022-03-29

    Application No.: US16829942

    Filing date: 2020-03-25

    Abstract: Systems and methods for multi-thread power limiting via a shared limit estimate power consumed in a processing core on a thread-by-thread basis by counting how many power events occur in each thread. Power consumed by each thread is approximated based on the number of power events that have occurred. Power consumed by individual threads is compared to a shared power limit derived from a sum of the power consumed by all threads. Threads that are above the shared power limit are stalled while threads below the shared power limit are allowed to continue without throttling. In this fashion, the most power-intensive threads are throttled to stay below the shared power limit while still maintaining performance.
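
The per-thread comparison can be sketched as below. The abstract does not say how the shared limit is derived from the summed power, so this sketch makes two loud assumptions: throttling only starts once total power exceeds a hypothetical `core_budget`, and the shared limit is the mean per-thread power. `ENERGY_PER_EVENT` is likewise a made-up scaling constant.

```python
ENERGY_PER_EVENT = 1.0  # hypothetical power units attributed to one counted event


def threads_to_stall(event_counts, core_budget):
    """Return the set of thread ids that should be stalled this interval."""
    # Approximate each thread's power from its power-event count.
    power = {tid: n * ENERGY_PER_EVENT for tid, n in event_counts.items()}
    total = sum(power.values())
    if total <= core_budget:
        return set()  # core is within budget; nobody is throttled
    shared_limit = total / len(power)  # assumption: equal share of the summed power
    return {tid for tid, p in power.items() if p > shared_limit}


counts = {"t0": 90, "t1": 20, "t2": 10}
stalled = threads_to_stall(counts, core_budget=100)
unthrottled = threads_to_stall(counts, core_budget=200)
```

Only the heaviest thread is stalled in this example, while the lighter threads keep running at full speed.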

    Controlling voltage deviations in processing systems

    Publication No.: US10152101B2

    Publication date: 2018-12-11

    Application No.: US14860715

    Filing date: 2015-09-22

    Abstract: Systems and methods relate to controlling voltage deviations in processing systems. A scheduler receives transactions to be issued for execution in a pipeline. A voltage deviation that will occur if a particular transaction is executed in the pipeline is estimated before the transaction is issued. Threshold comparators are used to determine if the estimated voltage deviation will exceed specified thresholds to cause voltage overshoots or undershoots. The scheduler is configured to implement one or more corrective measures, such as increasing or decreasing energy in the pipeline, to mitigate possible voltage overshoots or undershoots, before the transaction is issued to be executed in the pipeline.
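
The estimate-then-compare step can be illustrated with a toy model. The abstract does not give the estimation formula, so this sketch assumes the deviation is proportional to the step in current draw (a sudden rise causes a droop, a sudden fall causes an overshoot); the constant `k`, the threshold values, and the action names are all hypothetical.

```python
def estimate_deviation(prev_current, next_current, k=0.01):
    """Assumed model: voltage deviation opposes the change in current draw."""
    return -k * (next_current - prev_current)


def corrective_measure(prev_current, next_current,
                       undershoot=-0.05, overshoot=0.05):
    """Decide what the scheduler does before issuing the next transaction."""
    dv = estimate_deviation(prev_current, next_current)
    if dv < undershoot:
        return "decrease_energy"  # e.g. delay or throttle the transaction
    if dv > overshoot:
        return "increase_energy"  # e.g. keep extra activity in the pipeline
    return "issue"                # within thresholds; issue unchanged


ramp_up = corrective_measure(1.0, 10.0)    # large step up in current
ramp_down = corrective_measure(10.0, 1.0)  # large step down in current
steady = corrective_measure(5.0, 6.0)      # small change
```

The two comparisons play the role of the threshold comparators: each one guards against one direction of deviation.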

    EXECUTION HARDWARE FOR LOAD AND STORE OPERATION ALIGNMENT

    Type: Invention application

    Status: Under examination (published)

    Publication No.: US20160364147A1

    Publication date: 2016-12-15

    Application No.: US14738631

    Filing date: 2015-06-12

    Abstract: An apparatus includes an execution unit configured to modify register aligned data having a first portion of a vector of data and a second portion of the vector of data to generate modified data. The vector of data is stored in a register file prior to modification. The execution unit is further configured to generate first data and second data based on the modified data. The first data includes the first portion of the vector of data, and the second data includes the second portion of the vector of data. A memory unit is operable to store the first data at a first portion of the memory unit and to store the second data at a second portion of the memory unit. The register aligned data is unaligned with respect to the first portion of the memory unit and unaligned with respect to the second portion of the memory unit.
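
One way to picture the store path in this abstract is an unaligned vector store that spans two aligned memory lines. This is a loose software analogy under assumptions: the 8-byte `LINE` width, the rotate step standing in for the "modify" operation, and the function name are all hypothetical.

```python
LINE = 8  # hypothetical width, in bytes, of one aligned memory line and one register


def unaligned_store(memory, addr, vec):
    """Store a register-wide vector at an address that may straddle two lines."""
    assert len(vec) == LINE
    offset = addr % LINE
    base = addr - offset
    # Modify the register-aligned data: rotate so each byte sits at the
    # position it will occupy inside its destination line.
    rotated = vec[LINE - offset:] + vec[:LINE - offset]
    first_data = rotated[offset:]    # first portion of the vector
    second_data = rotated[:offset]   # second portion of the vector
    memory[base + offset:base + LINE] = first_data
    memory[base + LINE:base + LINE + offset] = second_data


mem = bytearray(24)
vector = bytes(range(1, 9))
unaligned_store(mem, 3, vector)  # lands in bytes 3..10, crossing the line at 8
```

After the rotate, each portion can be written to its own line without any further shifting, which is the point of doing the alignment work in the execution unit.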

