Sparse Neural Network Training Optimization
    Invention application

    Publication No.: US20190073590A1

    Publication Date: 2019-03-07

    Application No.: US15694742

    Filing Date: 2017-09-01

    Applicant: Facebook, Inc.

    Abstract: An optimized computer architecture for training a neural network includes a system having multiple GPUs. The neural network may be divided into separate portions, and a different portion is assigned to each of the multiple GPUs. Within each GPU, its portion is further divided across multiple training worker threads in multiple processing cores, and each processing core has lock-free access to a local parameter memory. The local parameter memory of each GPU is separately and individually synchronized with a remote master parameter memory by locked memory access. Each GPU has a separate set of communication worker threads dedicated to data transfer between the GPU and the remote parameter memory, so that the GPU's training worker threads are not involved in cross-GPU communications.
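    The split the abstract describes can be sketched in ordinary threads: training workers update a local parameter array without taking any lock, while a dedicated communication worker alone synchronizes that local memory with a master copy under a lock. All names and numbers below are illustrative assumptions, not details from the patent.

    ```python
    import threading

    LOCAL_PARAMS = [0.0] * 4          # lock-free local parameter memory
    MASTER_PARAMS = [0.0] * 4         # remote master parameter memory
    MASTER_LOCK = threading.Lock()    # only the communication worker takes this

    def training_worker(grad, steps):
        # Lock-free read-modify-write: workers may race, by design.
        for _ in range(steps):
            for i in range(len(LOCAL_PARAMS)):
                LOCAL_PARAMS[i] += grad

    def communication_worker():
        # Synchronization path; training workers never touch this lock.
        with MASTER_LOCK:
            MASTER_PARAMS[:] = LOCAL_PARAMS

    trainers = [threading.Thread(target=training_worker, args=(0.1, 100))
                for _ in range(4)]
    for t in trainers:
        t.start()
    for t in trainers:
        t.join()
    communication_worker()  # final local -> master sync
    ```

    The point of the separation is that the (cheap, frequent) gradient updates never block on the (slow, coarse) cross-device synchronization.
    
    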

    Self-adaptive control system for dynamic capacity management of latency-sensitive application servers

    Publication No.: US10212220B2

    Publication Date: 2019-02-19

    Application No.: US15493532

    Filing Date: 2017-04-21

    Applicant: Facebook, Inc.

    Abstract: A self-adaptive control system based on proportional-integral (PI) control theory for dynamic capacity management of latency-sensitive application servers (e.g., application servers associated with a social networking application) is disclosed. A centralized controller of the system can adapt to changes in request rates, changes in application and/or system behaviors, underlying hardware upgrades, etc., by scaling the capacity of a cluster up or down so that just the right amount of capacity is maintained at any time. The centralized controller uses information relating to the current state of the cluster and historical information relating to past states of the cluster to predict a future state of the cluster, and uses that prediction to determine whether to scale the current capacity up or down to reduce latency and maximize energy savings. A load balancing system can then distribute traffic among the servers in the cluster using any load balancing method.
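    A minimal PI loop of the kind the abstract invokes can be sketched as follows; the gains, target utilization, and the decision to express output as a server count are assumptions for illustration, not values from the patent.

    ```python
    class CapacityController:
        """PI controller that tracks a target per-server utilization."""

        def __init__(self, target_util, kp=0.8, ki=0.2):
            self.target = target_util
            self.kp, self.ki = kp, ki       # proportional and integral gains
            self.integral = 0.0             # accumulated error (the "I" term)

        def step(self, current_util, current_servers):
            # Positive error: servers are hotter than the target, scale up.
            error = current_util - self.target
            self.integral += error
            adjustment = self.kp * error + self.ki * self.integral
            # Scale capacity proportionally; never drop below one server.
            return max(1, round(current_servers * (1 + adjustment)))

    ctl = CapacityController(target_util=0.6)
    servers = ctl.step(current_util=0.9, current_servers=10)  # overloaded
    ```

    The integral term is what lets such a controller adapt to persistent shifts (hardware upgrades, changed request mixes) rather than reacting only to the instantaneous error.
    
    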

    DYNAMICALLY ADAPTING TO DEMAND FOR SERVER COMPUTING RESOURCES

    Publication No.: US20170195408A1

    Publication Date: 2017-07-06

    Application No.: US15262986

    Filing Date: 2016-09-12

    Applicant: Facebook, Inc.

    Abstract: Embodiments are described for dynamically responding to demand for server computing resources. The embodiments can monitor the performance of each of multiple computing systems in a data center and identify a particular computing system for allocation of additional computing power. They can then determine the availability of an additional power supply to allocate to the identified computing system, determine the availability of capacity on the power distribution line connected to that computing system to deliver the additional power, and allocate the additional computing power to the identified computing system as a function of both determined availabilities. The computing systems selected for reducing power consumption can be selected based on a priority order.
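    The two-gate check in the abstract (supply headroom and line capacity must both hold) reduces to a short predicate. The parameter names and units are assumptions for illustration:

    ```python
    def can_allocate(extra_watts, supply_headroom, line_capacity, line_load):
        """Grant extra power only if both constraints are satisfied."""
        supply_ok = extra_watts <= supply_headroom          # spare supply exists
        line_ok = line_load + extra_watts <= line_capacity  # line can carry it
        return supply_ok and line_ok

    # A 100 W boost succeeds only when supply and line both have headroom.
    ok = can_allocate(extra_watts=100, supply_headroom=150,
                      line_capacity=1000, line_load=850)
    ```

    Allocating "as a function of" both availabilities means a system that passes one check but fails the other (e.g. spare supply exists but the feeding line is near its rating) still receives nothing.
    
    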

    Self-adaptive control system for dynamic capacity management of latency-sensitive application servers

    Publication No.: US09667498B2

    Publication Date: 2017-05-30

    Application No.: US14450148

    Filing Date: 2014-08-01

    Applicant: Facebook, Inc.

    Abstract: A self-adaptive control system based on proportional-integral (PI) control theory for dynamic capacity management of latency-sensitive application servers (e.g., application servers associated with a social networking application) is disclosed. A centralized controller of the system can adapt to changes in request rates, changes in application and/or system behaviors, underlying hardware upgrades, etc., by scaling the capacity of a cluster up or down so that just the right amount of capacity is maintained at any time. The centralized controller uses information relating to the current state of the cluster and historical information relating to past states of the cluster to predict a future state of the cluster, and uses that prediction to determine whether to scale the current capacity up or down to reduce latency and maximize energy savings. A load balancing system can then distribute traffic among the servers in the cluster using any load balancing method.

    Dynamically adapting to demand for server computing resources

    Publication No.: US10277523B2

    Publication Date: 2019-04-30

    Application No.: US15262986

    Filing Date: 2016-09-12

    Applicant: Facebook, Inc.

    Abstract: Embodiments are described for dynamically responding to demand for server computing resources. The embodiments can monitor the performance of each of multiple computing systems in a data center and identify a particular computing system for allocation of additional computing power. They can then determine the availability of an additional power supply to allocate to the identified computing system, determine the availability of capacity on the power distribution line connected to that computing system to deliver the additional power, and allocate the additional computing power to the identified computing system as a function of both determined availabilities. The computing systems selected for reducing power consumption can be selected based on a priority order.

    SELF-ADAPTIVE CONTROL SYSTEM FOR DYNAMIC CAPACITY MANAGEMENT OF LATENCY-SENSITIVE APPLICATION SERVERS
    Invention application (granted)

    Publication No.: US20150180719A1

    Publication Date: 2015-06-25

    Application No.: US14450148

    Filing Date: 2014-08-01

    Applicant: Facebook, Inc.

    Abstract: A self-adaptive control system based on proportional-integral (PI) control theory for dynamic capacity management of latency-sensitive application servers (e.g., application servers associated with a social networking application) is disclosed. A centralized controller of the system can adapt to changes in request rates, changes in application and/or system behaviors, underlying hardware upgrades, etc., by scaling the capacity of a cluster up or down so that just the right amount of capacity is maintained at any time. The centralized controller uses information relating to the current state of the cluster and historical information relating to past states of the cluster to predict a future state of the cluster, and uses that prediction to determine whether to scale the current capacity up or down to reduce latency and maximize energy savings. A load balancing system can then distribute traffic among the servers in the cluster using any load balancing method.

    Sparse neural network training optimization

    Publication No.: US10943171B2

    Publication Date: 2021-03-09

    Application No.: US15694742

    Filing Date: 2017-09-01

    Applicant: Facebook, Inc.

    Abstract: An optimized computer architecture for training a neural network includes a system having multiple GPUs. The neural network may be divided into separate portions, and a different portion is assigned to each of the multiple GPUs. Within each GPU, its portion is further divided across multiple training worker threads in multiple processing cores, and each processing core has lock-free access to a local parameter memory. The local parameter memory of each GPU is separately and individually synchronized with a remote master parameter memory by locked memory access. Each GPU has a separate set of communication worker threads dedicated to data transfer between the GPU and the remote parameter memory, so that the GPU's training worker threads are not involved in cross-GPU communications.

    Automatic load balancing for resource allocations

    Publication No.: US10459741B2

    Publication Date: 2019-10-29

    Application No.: US15482415

    Filing Date: 2017-04-07

    Applicant: Facebook, Inc.

    Abstract: A computing system operates according to a method including: processing representations of housing structures with open locations for physically locating computing resources, a physical layout of the open locations, and characteristics of the structures and the resources to generate designated locations for optimally placing or allocating the computing resources in the open locations. The designated locations are generated based on analyzing multiple possible allocation or placement combinations of the computing resources into the open locations as an optimization function.
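    Treating placement as an optimization function, as the abstract describes, can be sketched by exhaustively scoring every assignment of resources to open locations; the cost model (power draw weighted by distance) and all names below are invented for illustration.

    ```python
    from itertools import permutations

    def best_placement(resources, locations, cost):
        """resources: list of (name, power); locations: list of (slot, distance).

        Evaluates every assignment of resources to open locations and
        returns the one minimizing the total cost function.
        """
        best, best_cost = None, float("inf")
        for perm in permutations(locations, len(resources)):
            total = sum(cost(res, loc) for res, loc in zip(resources, perm))
            if total < best_cost:
                best = {res[0]: loc[0] for res, loc in zip(resources, perm)}
                best_cost = total
        return best, best_cost

    # Invented example: higher-power gear should land in the closer slot.
    cost = lambda res, loc: res[1] * loc[1]  # power draw x distance penalty
    placement, total = best_placement(
        [("db", 400), ("web", 200)],
        [("A1", 1), ("A2", 3)],
        cost,
    )
    ```

    Brute force is only workable for tiny inputs; a production system would use a heuristic or solver over the same objective, but the structure of the problem is the same.
    
    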
