Generating and using checkpoints in a virtual computer system

    Publication No.: US10859289B2

    Publication date: 2020-12-08

    Application No.: US15662071

    Filing date: 2017-07-27

    Applicant: VMware, Inc.

    IPC classification: G06F11/00 F24H1/14 F16L53/32

    Abstract: To generate a checkpoint for a virtual machine (VM), first, while the VM is still running, a copy-on-write (COW) disk file is created pointing to a parent disk file that the VM is using. Next, the VM is stopped, the VM's memory is marked COW, the device state of the VM is saved to memory, the VM is switched to use the COW disk file, and the VM begins running again for substantially the remainder of the checkpoint generation. Next, the device state that was stored in memory and the unmodified VM memory pages are saved to a checkpoint file. Also, a copy may be made of the parent disk file for retention as part of the checkpoint, or the original parent disk file may be retained as part of the checkpoint. If a copy of the parent disk file was made, then the COW disk file may be committed to the original parent disk file.
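
    The ordering of steps in this abstract can be sketched as a toy Python model. It is only an illustration under stated assumptions: `ToyVM`, `CowDisk`, and `generate_checkpoint` are hypothetical names, memory and disks are plain dictionaries, the set of memory pages is assumed fixed, and the original parent disk is the one retained as part of the checkpoint.

```python
from dataclasses import dataclass, field


@dataclass
class CowDisk:
    """Hypothetical overlay: records writes, falls through to the parent on reads."""
    parent: dict
    delta: dict = field(default_factory=dict)

    def read(self, block):
        return self.delta.get(block, self.parent.get(block))

    def write(self, block, data):
        self.delta[block] = data  # the parent disk is never modified


@dataclass
class ToyVM:
    memory: dict          # page number -> bytes (fixed set of pages assumed)
    devices: dict         # device name -> state
    disk: object          # a plain dict or a CowDisk
    running: bool = True
    frozen: dict = None   # original pages preserved while memory is marked COW

    def write_page(self, page, data):
        # Simulated COW fault: keep the original page on its first write.
        if self.frozen is not None and page not in self.frozen:
            self.frozen[page] = self.memory[page]
        self.memory[page] = data


def generate_checkpoint(vm):
    parent = vm.disk
    cow = CowDisk(parent)              # 1. COW disk created while the VM runs
    vm.running = False                 # 2. VM is stopped briefly
    vm.frozen = {}                     #    memory is marked copy-on-write
    saved_devices = dict(vm.devices)   #    device state is saved to memory
    vm.disk = cow                      #    VM switches to the COW disk
    vm.running = True                  # 3. VM runs again for the remaining work

    def finish():
        # 4. Save device state plus the pre-checkpoint view of every page.
        pages = {p: vm.frozen.get(p, d) for p, d in vm.memory.items()}
        vm.frozen = None
        return {"devices": saved_devices, "memory": pages, "disk": parent}

    return finish
```

    Calling `generate_checkpoint(vm)` returns a `finish` function; the VM can keep writing pages and disk blocks before `finish()` is called, and the returned checkpoint still reflects the state at the moment the VM was stopped.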

    Resource allocation in computers
    2.
    Granted invention patent

    Publication No.: US10430094B2

    Publication date: 2019-10-01

    Application No.: US15918463

    Filing date: 2018-03-12

    Applicant: VMware, Inc.

    Abstract: A method and a tangible medium embodying code for allocating resource units of an allocatable resource among a plurality of clients in a computer are described. In the method, resource units are initially distributed among the clients by assigning to each of the clients a nominal share of the allocatable resource. For each client, a current allocation of resource units is determined. A metric is evaluated for each client, the metric being a function both of the nominal share and a usage-based factor, the usage-based factor being a function of a measure of resource units that the client is actively using and a measure of resource units that the client is not actively using. A resource unit can be reclaimed from a client when the metric for that client meets a predetermined criterion.
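
    A metric of this kind can be sketched in Python as follows. The abstract does not give a formula, so everything specific here is an assumption made for illustration: the `Client` record, the `idle_cost` weight charged to units that are not actively used, the shares-per-weighted-unit form of `metric`, and the choice of "lowest metric" as the reclamation criterion.

```python
from dataclasses import dataclass


@dataclass
class Client:
    name: str
    shares: float     # nominal share of the resource
    allocated: int    # resource units currently allocated
    active: int       # units the client is actively using


def metric(client, idle_cost=4.0):
    """Hypothetical metric: shares per unit, with idle units weighted more heavily."""
    idle = client.allocated - client.active
    weighted_units = client.active + idle_cost * idle
    return client.shares / weighted_units


def pick_victim(clients, idle_cost=4.0):
    """Assumed criterion: reclaim from the client with the lowest metric."""
    candidates = [c for c in clients if c.allocated > 0]
    return min(candidates, key=lambda c: metric(c, idle_cost))
```

    With equal shares and equal allocations, a client that leaves most of its units idle gets a lower metric than a fully active one, so it is the first to lose a unit under this assumed criterion.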

    Efficient online construction of miss rate curves
    3.
    Granted invention patent
    Status: in force

    Publication No.: US09223722B2

    Publication date: 2015-12-29

    Application No.: US14196100

    Filing date: 2014-03-04

    Applicant: VMware, Inc.

    Abstract: Miss rate curves are constructed in a resource-efficient manner so that they can be constructed and memory management decisions can be made while the workloads are running. The resource-efficient technique includes the steps of selecting a subset of memory pages for the workload, maintaining a least recently used (LRU) data structure for the selected memory pages, detecting accesses to the selected memory pages and updating the LRU data structure in response to the detected accesses, and generating data for constructing a miss-rate curve for the workload using the LRU data structure. After a memory page is accessed, the memory page may be left untraced for a period of time, after which the memory page is retraced.
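
    The steps listed in this abstract can be sketched in Python as below. This is a simplified illustration, not the patented method: the function name `miss_rate_curve` is hypothetical, the subset is chosen by a simple modulo rule, reuse distances are scaled by the sampling factor to approximate the full page set, and the untrace/retrace delay is not modeled. All three choices are assumptions.

```python
from collections import Counter, OrderedDict


def miss_rate_curve(trace, sample_every=1):
    """Return {cache_size_in_pages: estimated miss ratio} for an LRU cache.

    trace        -- iterable of integer page numbers, in access order
    sample_every -- track only pages where page % sample_every == 0 (assumed rule)
    """
    lru = OrderedDict()    # tracked pages; the most recently used page is last
    distances = Counter()  # scaled reuse distance -> number of accesses
    total = 0              # accesses to tracked pages, including first touches

    for page in trace:
        if page % sample_every != 0:
            continue       # page is outside the selected subset
        total += 1
        if page in lru:
            order = list(lru)
            depth = len(order) - order.index(page)  # 1 means most recently used
            distances[depth * sample_every] += 1
            lru.move_to_end(page)
        else:
            lru[page] = True  # first touch: a miss at every cache size

    curve = {}
    hits = 0
    for size in sorted(distances):
        hits += distances[size]        # accesses that hit in a cache of this size
        curve[size] = 1 - hits / total
    return curve
```

    For the trace `[0, 1, 0, 1, 2, 0]` with every page tracked, two accesses have reuse distance 2 and one has distance 3, so the estimated miss ratio is 4/6 for a two-page cache and 3/6 for a three-page cache.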

    Transparent recovery from hardware memory errors
    5.
    Granted invention patent
    Status: in force

    Publication No.: US08775903B2

    Publication date: 2014-07-08

    Application No.: US13893465

    Filing date: 2013-05-14

    Applicant: VMware, Inc.

    IPC classification: G06F11/00

    CPC classification: G06F11/08 G06F11/141

    Abstract: A method is provided for recovering from an uncorrected memory error located at a memory address as identified by a memory device. A stored hash value for a memory page corresponding to the identified memory address is used to determine the correct data. Because the memory device specifies the location of the corrupted data, and the size of the window where the corruption occurred, the stored hash can be used to verify memory page reconstruction. With the known good part of the data in hand, the hashes of the pages using possible values in place of the corrupted data are calculated. It is expected that there will be a match between the previously stored hash and one of the computed hashes. As long as there is one and only one match, then that value, used in the place of the corrupted data, is the correct value. The corrupt data, once replaced, allows operation of the memory device to continue without needing to interrupt or otherwise affect a system's operation.
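
    The search described in this abstract can be sketched in Python as follows. The abstract does not name a hash function, so SHA-256 is an assumption here, and `recover_page` is a hypothetical helper that is only practical for very small corruption windows, since it tries every possible value.

```python
import hashlib
from itertools import product


def recover_page(page, offset, size, stored_hash):
    """Try every value for the corrupted window; accept only a unique hash match.

    page        -- page contents as bytes, with a corrupted window
    offset,size -- location and length of the window, as reported by the device
    stored_hash -- SHA-256 digest of the page taken before the corruption
    """
    matches = []
    for candidate in product(range(256), repeat=size):
        trial = page[:offset] + bytes(candidate) + page[offset + size:]
        if hashlib.sha256(trial).digest() == stored_hash:
            matches.append(trial)
    # Exactly one match means the candidate is taken as the correct data.
    return matches[0] if len(matches) == 1 else None
```

    The function returns the repaired page when exactly one candidate reproduces the stored hash, and `None` otherwise, which mirrors the "one and only one match" condition in the abstract.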

    Decentralized input/output resource management
    8.
    Granted invention patent
    Status: in force

    Publication No.: US09509621B2

    Publication date: 2016-11-29

    Application No.: US14263231

    Filing date: 2014-04-28

    Applicant: VMware, Inc.

    Abstract: A shared input/output (IO) resource is managed in a decentralized manner. Each of multiple hosts having IO access to the shared resource computes an average latency value that is normalized with respect to average IO request sizes, and stores the computed normalized latency value for later use. The normalized latency values thus computed and stored may be used for a variety of different applications, including enforcing a quality of service (QoS) policy that is applied to the hosts, detecting a condition known as an anomaly where a host that is not bound by a QoS policy accesses the shared resource at a rate that impacts the level of service received by the plurality of hosts that are bound by the QoS policy, and migrating workloads between storage arrays to achieve load balancing across the storage arrays.
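
    The per-host computation in this abstract can be sketched in Python as below. The abstract does not state how latency is normalized, so the linear scaling against a `REFERENCE_SIZE` is purely an assumption, as are the function names and the use of a dictionary as a stand-in for whatever shared location the hosts write to.

```python
from statistics import mean

REFERENCE_SIZE = 8 * 1024  # bytes; hypothetical reference IO request size


def normalized_latency(samples, reference_size=REFERENCE_SIZE):
    """samples: list of (latency_ms, size_bytes) pairs for completed IOs."""
    avg_latency = mean([latency for latency, _ in samples])
    avg_size = mean([size for _, size in samples])
    # Assumed normalization: scale latency to the reference request size.
    return avg_latency * reference_size / avg_size


def publish(shared, host, samples):
    """Each host stores its own value; no central coordinator is involved."""
    shared[host] = normalized_latency(samples)


def cluster_latency(shared):
    """Any host can read all stored values and derive a cluster-wide view."""
    return mean(list(shared.values()))
```

    A host issuing large requests with proportionally higher latency then reports the same normalized value as a host issuing small, fast requests, which is what lets the stored values be compared across hosts.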

    Resource allocation in computers
    9.
    Granted invention patent
    Status: in force

    Publication No.: US09363197B2

    Publication date: 2016-06-07

    Application No.: US14295191

    Filing date: 2014-06-03

    Applicant: VMware, Inc.

    Abstract: A method and a tangible medium embodying code for allocating resource units of an allocatable resource among a plurality of clients in a computer are described. In the method, resource units are initially distributed among the clients by assigning to each of the clients a nominal share of the allocatable resource. For each client, a current allocation of resource units is determined. A metric is evaluated for each client, the metric being a function both of the nominal share and a usage-based factor, the usage-based factor being a function of a measure of resource units that the client is actively using and a measure of resource units that the client is not actively using. A resource unit can be reclaimed from a client when the metric for that client meets a predetermined criterion.
