Mechanisms to save user/kernel copy for cross device communications
    1.
    Granted patent
    Status: in force

    Publication No.: US09436395B2

    Publication date: 2016-09-06

    Application No.: US14213640

    Filing date: 2014-03-14

    Abstract: Central processing units (CPUs) in computing systems manage graphics processing units (GPUs), network processors, security co-processors, and other data-heavy devices as buffered peripherals using device drivers. Unfortunately, as a result of large and latency-sensitive data transfers between CPUs and these external devices, and memory partitioned into kernel-access and user-access spaces, these schemes to manage peripherals may introduce latency and memory-use inefficiencies. Proposed are schemes to reduce latency and redundant memory copies using virtual-to-physical page remapping while maintaining user/kernel level access abstractions.
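
The contrast between copying a buffer across the user/kernel boundary and remapping its page can be illustrated with a toy model. The sketch below is not the patented mechanism: `ToyMMU`, its two address spaces, and the `bytes_copied` counter are hypothetical names introduced only to show how moving a virtual-to-physical mapping might hand data over without duplicating it.

```python
PAGE_SIZE = 4096  # assumed page size for the toy model


class ToyMMU:
    """Toy model: per-address-space page tables mapping virtual pages to frames."""

    def __init__(self):
        self.frames = {}                           # frame number -> bytearray
        self.tables = {"user": {}, "kernel": {}}   # space -> {vpn: frame}
        self.bytes_copied = 0                      # payload bytes duplicated so far
        self._next_frame = 0

    def alloc(self, space, vpn):
        """Back a virtual page with a fresh physical frame."""
        frame = self._next_frame
        self._next_frame += 1
        self.frames[frame] = bytearray(PAGE_SIZE)
        self.tables[space][vpn] = frame
        return frame

    def write(self, space, vpn, data):
        frame = self.tables[space][vpn]
        self.frames[frame][:len(data)] = data

    def read(self, space, vpn, n):
        frame = self.tables[space][vpn]
        return bytes(self.frames[frame][:n])

    def copy_page(self, src_space, src_vpn, dst_space, dst_vpn):
        """Conventional path: duplicate the payload into a second frame."""
        src = self.frames[self.tables[src_space][src_vpn]]
        dst_frame = self.alloc(dst_space, dst_vpn)
        self.frames[dst_frame][:] = src
        self.bytes_copied += PAGE_SIZE

    def remap_page(self, src_space, src_vpn, dst_space, dst_vpn):
        """Remap path: move the mapping to the other space; no payload copy."""
        frame = self.tables[src_space].pop(src_vpn)
        self.tables[dst_space][dst_vpn] = frame
```

In this model, `remap_page` leaves a single physical frame in use and the copy counter at zero, whereas `copy_page` allocates a second frame and copies a full page. A real implementation would also have to deal with TLB invalidation, permissions, and device visibility, none of which is modeled here.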

    Method for memory consistency among heterogeneous computer components
    2.
    Granted patent
    Status: in force

    Publication No.: US09361118B2

    Publication date: 2016-06-07

    Application No.: US14275271

    Filing date: 2014-05-12

    IPC classification: G06F12/00 G06F9/44 G06F9/52

    Abstract: A method, computer program product, and system are described that determine the correctness of using memory operations in a computing device with heterogeneous computer components. Embodiments include an optimizer based on the characteristics of a Sequential Consistency for Heterogeneous-Race-Free (SC for HRF) model that analyzes a program and determines the correctness of the ordering of events in the program. HRF models include combinations of the properties: scope order, scope inclusion, and scope transitivity. The optimizer can determine when a program is heterogeneous-race-free in accordance with an SC for HRF memory consistency model. For example, the optimizer can analyze a portion of program code, respect the properties of the SC for HRF model, and determine whether a value produced by a store memory event will be a candidate for a value observed by a load memory event. In addition, the optimizer can determine whether reordering of events is possible.
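
The abstract names scope properties without defining them, so the following is only one plausible reading, offered as a sketch. It assumes a four-level scope hierarchy (`work_item`, `work_group`, `device`, `system`) and treats a synchronization scope as adequate for two threads when both fall inside the same instance of that scope. The hierarchy, the `Thread` fields, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed scope hierarchy, narrowest to widest (hypothetical names).
SCOPES = ("work_item", "work_group", "device", "system")


@dataclass(frozen=True)
class Thread:
    device: int
    work_group: int
    work_item: int


def scope_instance(thread, scope):
    """Identify which instance of `scope` the thread belongs to."""
    if scope == "system":
        return ("system",)
    if scope == "device":
        return ("device", thread.device)
    if scope == "work_group":
        return ("work_group", thread.device, thread.work_group)
    if scope == "work_item":
        return ("work_item", thread.device, thread.work_group, thread.work_item)
    raise ValueError(f"unknown scope: {scope}")


def scope_covers(sync_scope, a, b):
    """True if both threads fall within the same instance of sync_scope."""
    return scope_instance(a, sync_scope) == scope_instance(b, sync_scope)


def narrowest_common_scope(a, b):
    """Smallest scope in the assumed hierarchy that contains both threads."""
    for scope in SCOPES:
        if scope_covers(scope, a, b):
            return scope
    return "system"  # not reached: "system" always covers both threads
```

Under this reading, two threads in different work-groups of the same device would need at least device-scope synchronization; a checker along these lines could flag a work-group-scoped release paired with an acquire in another work-group as a potential heterogeneous race.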

    METHOD FOR MEMORY CONSISTENCY AMONG HETEROGENEOUS COMPUTER COMPONENTS
    3.
    Patent application
    Status: in force

    Publication No.: US20140337587A1

    Publication date: 2014-11-13

    Application No.: US14275271

    Filing date: 2014-05-12

    IPC classification: G06F12/02

    Abstract: A method, computer program product, and system are described that determine the correctness of using memory operations in a computing device with heterogeneous computer components. Embodiments include an optimizer based on the characteristics of a Sequential Consistency for Heterogeneous-Race-Free (SC for HRF) model that analyzes a program and determines the correctness of the ordering of events in the program. HRF models include combinations of the properties: scope order, scope inclusion, and scope transitivity. The optimizer can determine when a program is heterogeneous-race-free in accordance with an SC for HRF memory consistency model. For example, the optimizer can analyze a portion of program code, respect the properties of the SC for HRF model, and determine whether a value produced by a store memory event will be a candidate for a value observed by a load memory event. In addition, the optimizer can determine whether reordering of events is possible.

    Runtime for automatically load-balancing and synchronizing heterogeneous computer systems with scoped synchronization
    4.
    Granted patent
    Status: in force

    Publication No.: US09411652B2

    Publication date: 2016-08-09

    Application No.: US14466594

    Filing date: 2014-08-22

    Abstract: Sharing tasks among compute units in a processor can increase the efficiency of the processor. When a compute unit does not have a task in its task memory to perform, donating tasks from other compute units can prevent the compute unit from being idle while there are tasks in other parts of the processor. It is desirable to share tasks among compute units that are within defined scopes of the processor. Compute units may share tasks by allowing other compute units to access their private memory, or by donating tasks to a shared memory.
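
The two sharing routes mentioned in the abstract (peers reading a unit's private task memory, and donation to a shared memory) can be sketched with plain queues. This is a minimal single-threaded model, not the runtime the patent describes: `ComputeUnit`, `Runtime`, the string scope labels, and the lookup order in `next_task` are assumptions chosen for illustration.

```python
from collections import deque


class ComputeUnit:
    def __init__(self, name, scope):
        self.name = name
        self.scope = scope      # hypothetical scope label, e.g. a cluster id
        self.tasks = deque()    # private task memory


class Runtime:
    def __init__(self, units):
        self.units = units
        self.shared = {}        # scope label -> deque of donated tasks
        for unit in units:
            self.shared.setdefault(unit.scope, deque())

    def donate(self, unit, count=1):
        """Move up to `count` tasks from a unit's private queue to its scope's pool."""
        pool = self.shared[unit.scope]
        moved = 0
        while unit.tasks and moved < count:
            pool.append(unit.tasks.pop())
            moved += 1
        return moved

    def next_task(self, unit):
        """Own queue first, then the scope's shared pool, then a peer in the same scope."""
        if unit.tasks:
            return unit.tasks.popleft()
        pool = self.shared[unit.scope]
        if pool:
            return pool.popleft()
        for peer in self.units:
            if peer is not unit and peer.scope == unit.scope and peer.tasks:
                return peer.tasks.pop()   # take from the far end of the peer's queue
        return None                        # nothing available within this scope
```

An idle unit never looks outside its own scope here, which mirrors the abstract's point that sharing is confined to defined scopes. A concurrent version would need atomic queue operations and scoped synchronization, which this sketch leaves out.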

    HIERARCHICAL WRITE-COMBINING CACHE COHERENCE
    5.
    Patent application
    Status: in force

    Publication No.: US20150058567A1

    Publication date: 2015-02-26

    Application No.: US14010096

    Filing date: 2013-08-26

    IPC classification: G06F12/08

    Abstract: A method, computer program product, and system are described that enforce a release consistency with special accesses sequentially consistent (RCsc) memory model and execute release synchronization instructions such as a StRel event without tracking an outstanding store event through a memory hierarchy, while efficiently using bandwidth resources. What is also described is the decoupling of a store event from an ordering of the store event with respect to an RCsc memory model. The description also includes a set of hierarchical read-only cache and write-only combining buffers that coalesce stores from different parts of the system. In addition, a pool component maintains partial order of received store events and release synchronization events to avoid content addressable memory (CAM) structures, full cache flushes, as well as direct write-throughs to memory. The approach improves the performance of both global and local synchronization events and reduces overhead in maintaining write-only combining buffers.
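
How hierarchical write-only combining buffers might coalesce stores can be shown with a small two-level model. The sketch below is a simplification under stated assumptions: the 64-byte line size, the class and function names, and the rule that a local release drains one level while a global release drains to memory are all hypothetical. It does not model the read-only cache, the pool component, or the partial ordering the abstract describes.

```python
LINE_SIZE = 64  # assumed cache-line size in bytes


class Memory:
    """Backing store that counts how many line-sized writes reach it."""

    def __init__(self):
        self.data = {}          # byte address -> value
        self.line_writes = 0

    def write_line(self, line, updates):
        self.line_writes += 1
        for offset, value in updates.items():
            self.data[line * LINE_SIZE + offset] = value


class CombiningBuffer:
    """Write-only buffer that merges stores to the same cache line."""

    def __init__(self, downstream):
        self.downstream = downstream   # next CombiningBuffer level, or Memory
        self.lines = {}                # line index -> {offset: value}

    def store(self, addr, value):
        line, offset = divmod(addr, LINE_SIZE)
        self.lines.setdefault(line, {})[offset] = value

    def write_line(self, line, updates):
        """Accept a merged line drained from the level above."""
        self.lines.setdefault(line, {}).update(updates)

    def drain(self):
        """Push every pending line one level down, then empty this buffer."""
        for line, updates in self.lines.items():
            self.downstream.write_line(line, updates)
        self.lines.clear()


def local_release(l1):
    """Assumed local release: make stores visible at the shared level only."""
    l1.drain()


def global_release(l1, l2):
    """Assumed global release: drain both levels so stores reach memory."""
    l1.drain()
    l2.drain()
```

With two first-level buffers feeding one shared second-level buffer, four stores to the same line from two different units collapse into a single line write to memory in this model, and a local release completes without touching memory at all.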

    Write combining cache microarchitecture for synchronization events
    6.
    Granted patent
    Status: in force

    Publication No.: US09477599B2

    Publication date: 2016-10-25

    Application No.: US13961561

    Filing date: 2013-08-07

    Abstract: A method, computer program product, and system are described that enforce a release consistency with special accesses sequentially consistent (RCsc) memory model and execute release synchronization instructions such as a StRel event without tracking an outstanding store event through a memory hierarchy, while efficiently using bandwidth resources. What is also described is the decoupling of a store event from an ordering of the store event with respect to an RCsc memory model. The description also includes a set of hierarchical read/write combining buffers that coalesce stores from different parts of the system. In addition, a pool component maintains partial order of received store events and release synchronization events to avoid content addressable memory (CAM) structures, full cache flushes, as well as direct write-throughs to memory. The approach improves the performance of both global and local synchronization events since a store event may not need to reach main memory to complete.

    Hierarchical write-combining cache coherence
    7.
    Granted patent
    Status: in force

    Publication No.: US09396112B2

    Publication date: 2016-07-19

    Application No.: US14010096

    Filing date: 2013-08-26

    IPC classification: G06F12/08

    Abstract: A method, computer program product, and system are described that enforce a release consistency with special accesses sequentially consistent (RCsc) memory model and execute release synchronization instructions such as a StRel event without tracking an outstanding store event through a memory hierarchy, while efficiently using bandwidth resources. What is also described is the decoupling of a store event from an ordering of the store event with respect to an RCsc memory model. The description also includes a set of hierarchical read-only cache and write-only combining buffers that coalesce stores from different parts of the system. In addition, a pool component maintains partial order of received store events and release synchronization events to avoid content addressable memory (CAM) structures, full cache flushes, as well as direct write-throughs to memory. The approach improves the performance of both global and local synchronization events and reduces overhead in maintaining write-only combining buffers.

    Mechanisms to Save User/Kernel Copy for Cross Device Communications
    8.
    Patent application
    Status: in force

    Publication No.: US20150261457A1

    Publication date: 2015-09-17

    Application No.: US14213640

    Filing date: 2014-03-14

    IPC classification: G06F3/06 G06F12/10

    Abstract: Central processing units (CPUs) in computing systems manage graphics processing units (GPUs), network processors, security co-processors, and other data-heavy devices as buffered peripherals using device drivers. Unfortunately, as a result of large and latency-sensitive data transfers between CPUs and these external devices, and memory partitioned into kernel-access and user-access spaces, these schemes to manage peripherals may introduce latency and memory-use inefficiencies. Proposed are schemes to reduce latency and redundant memory copies using virtual-to-physical page remapping while maintaining user/kernel level access abstractions.
