Abstract:
A method for accelerating data operations across a plurality of nodes of one or more clusters of a distributed computing environment. Rack awareness information characterizing the plurality of nodes is retrieved and a non-volatile memory (NVM) capability of each node is determined. A write operation is received at a management node of the plurality of nodes and one or more of the rack awareness information and the NVM capability of the plurality of nodes are analyzed to select one or more nodes to receive at least a portion of the write operation, wherein at least one of the selected nodes has an NVM capability. A multicast group for the write operation is then generated wherein the selected nodes are subscribers of the multicast group, and the multicast group is used to perform hardware accelerated read or write operations at one or more of the selected nodes.
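Below is a minimal sketch of the node-selection and multicast-group steps described above. The `Node` fields, the replica count, and the group-naming scheme are illustrative assumptions, not details from the disclosure.

```python
from dataclasses import dataclass

# Hypothetical node descriptor; field names are illustrative only.
@dataclass
class Node:
    node_id: str
    rack_id: str
    has_nvm: bool

def select_write_targets(nodes, replicas=3):
    """Pick nodes for a write: prefer NVM-capable nodes and spread replicas
    across distinct racks using the retrieved rack-awareness information."""
    # NVM-capable nodes sort first, so they are chosen before non-NVM nodes.
    ranked = sorted(nodes, key=lambda n: (not n.has_nvm, n.rack_id))
    selected, racks_used = [], set()
    for node in ranked:
        if node.rack_id not in racks_used:
            selected.append(node)
            racks_used.add(node.rack_id)
        if len(selected) == replicas:
            break
    # The method requires at least one selected node with NVM capability.
    assert any(n.has_nvm for n in selected), "no NVM-capable node available"
    return selected

def form_multicast_group(selected):
    """Return a multicast group whose subscribers are the selected nodes;
    hardware-accelerated reads/writes are then addressed to this group."""
    return {"group_id": "wg-" + "-".join(n.node_id for n in selected),
            "subscribers": [n.node_id for n in selected]}
```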
Abstract:
Aspects of the technology provide improvements to a Serverless Computing (SLC) workflow by determining when and how to optimize SLC jobs for computing in a Distributed Computing Framework (DCF). DCF optimization can be performed by abstracting SLC tasks into different workflow configurations to determine optimal arrangements for execution in a DCF environment. A process of the technology can include steps for receiving an SLC job including one or more SLC tasks, executing one or more of the tasks to determine a latency metric and a throughput metric for the SLC tasks, and determining whether the SLC tasks should be converted to a DCF format based on the latency metric and the throughput metric. Systems and machine-readable media are also provided.
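A sketch of the conversion decision follows. The thresholds, the metric derivation, and the assumption that a task is a callable returning the bytes it processed are illustrative, not values from the disclosure.

```python
import time

# Assumed cut-offs for illustration only.
LATENCY_THRESHOLD_MS = 500       # above this, per-invocation overhead dominates
THROUGHPUT_THRESHOLD_MBPS = 100  # above this, data volume favors a DCF

def profile_task(task):
    """Execute one SLC task and return (latency_ms, throughput_mbps).
    `task` is any callable that returns the number of bytes it processed."""
    start = time.perf_counter()
    bytes_processed = task()
    elapsed = time.perf_counter() - start
    latency_ms = elapsed * 1000.0
    throughput_mbps = (bytes_processed / 1e6) / elapsed if elapsed else 0.0
    return latency_ms, throughput_mbps

def should_convert_to_dcf(tasks):
    """Return True if the measured metrics suggest the SLC tasks are better
    suited to execution in a Distributed Computing Framework."""
    metrics = [profile_task(t) for t in tasks]
    avg_latency = sum(m[0] for m in metrics) / len(metrics)
    avg_throughput = sum(m[1] for m in metrics) / len(metrics)
    return avg_latency > LATENCY_THRESHOLD_MS or avg_throughput > THROUGHPUT_THRESHOLD_MBPS
```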
Abstract:
Approaches are disclosed for distributing messages across multiple data centers where the data centers do not store messages using a same message queue protocol. In some embodiments, a network element translates messages from a message queue protocol (e.g., Kestrel, RABBITMQ, APACHE Kafka, and ACTIVEMQ) to an application layer messaging protocol (e.g., XMPP, MQTT, WebSocket protocol, or other application layer messaging protocols). In other embodiments, a network element translates messages from an application layer messaging protocol to a message queue protocol. Using the approaches disclosed herein, data centers communicate using, at least in part, application layer messaging protocols to decouple the message queue protocols used by the data centers and enable messages to be shared between message queues in the data centers. Consequently, the data centers can share messages regardless of whether the underlying message queue protocols used by the data centers (and the network devices therein) are compatible with one another.
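The following sketch shows the shape of such a translator. The `MessageQueueSource` and `AppLayerPublisher` interfaces are hypothetical stand-ins for concrete protocol clients (e.g., a Kafka consumer and an MQTT publisher); the pass-through `translate` step is an assumption standing in for real header and routing-key mapping.

```python
from typing import Protocol

class MessageQueueSource(Protocol):
    def poll(self) -> list[bytes]: ...          # pull raw messages from the local queue

class AppLayerPublisher(Protocol):
    def publish(self, topic: str, payload: bytes) -> None: ...

def translate(payload: bytes) -> bytes:
    """Re-frame a queue message for the application layer messaging protocol.
    A real translation would also map headers, routing keys, and QoS flags."""
    return payload

def bridge(source: MessageQueueSource, sink: AppLayerPublisher, topic: str) -> None:
    """Drain the local message queue and forward each message to the remote
    data center over the application layer messaging protocol."""
    for raw in source.poll():
        sink.publish(topic, translate(raw))
```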
Abstract:
The present disclosure describes a method for cloud resource placement optimization. A resources monitor monitors state information associated with cloud resources and physical hosts in a federated cloud having a plurality of clouds managed by a plurality of cloud providers. A rebalance trigger triggers a rebalancing request to initiate cloud resource placement optimization based on one or more conditions. A cloud resource placement optimizer determines an optimized placement of cloud resources on physical hosts across the plurality of clouds in the federated cloud based on (1) costs including migration costs, (2) the state information, and (3) constraints, wherein each physical host is identified in a constraints-driven optimization solver by an identifier of a respective cloud provider and an identifier of the physical host. A migrations enforcer determines an ordered migration plan and transmits requests to place or migrate cloud resources according to the ordered migration plan.
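A greedy sketch of the placement and migration-ordering steps is shown below. Hosts are keyed by (provider_id, host_id) as in the constraints-driven solver; the cost model (current load plus a flat migration penalty) and the largest-first migration ordering are assumptions for illustration.

```python
def optimize_placement(resources, hosts, current_placement, migration_cost=1.0):
    """Greedily assign each cloud resource to the feasible host with the
    lowest incremental cost; moving a resource off its current host adds a
    migration penalty.

    resources:         {res_id: cpu_demand}
    hosts:             {(provider_id, host_id): cpu_capacity}
    current_placement: {res_id: (provider_id, host_id)}
    Returns (new_placement, ordered_migration_plan)."""
    load = {h: 0.0 for h in hosts}
    placement, plan = {}, []
    for res_id, demand in sorted(resources.items(), key=lambda kv: -kv[1]):
        feasible = [h for h in hosts if load[h] + demand <= hosts[h]]
        if not feasible:
            raise RuntimeError(f"no host satisfies the constraints for {res_id}")
        def cost(h):
            penalty = migration_cost if current_placement.get(res_id) not in (None, h) else 0.0
            return load[h] + demand + penalty
        best = min(feasible, key=cost)
        load[best] += demand
        placement[res_id] = best
        if current_placement.get(res_id) not in (None, best):
            plan.append((res_id, current_placement[res_id], best))
    # One possible ordering policy: migrate the largest resources first.
    plan.sort(key=lambda m: -resources[m[0]])
    return placement, plan
```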
Abstract:
Embodiments include obtaining at least one system metric of a distributed storage system, generating one or more recovery parameters based on the at least one system metric, identifying at least one policy associated with data stored in a storage node of a plurality of storage nodes in the distributed storage system, and generating a recovery plan for the data based on the one or more recovery parameters and the at least one policy. In more specific embodiments, the recovery plan includes a recovery order for recovering the data. Further embodiments include initiating a recovery process to copy replicas of the data from a second storage node to a new storage node, wherein the replicas of the data are copied according to the recovery order indicated in the recovery plan.
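A minimal sketch of the recovery-plan flow follows. The specific system metric, policy fields, and urgency rule are illustrative assumptions; the disclosure only requires that recovery parameters derive from a system metric and that the plan encode a recovery order.

```python
def generate_recovery_parameters(system_metrics):
    """Derive recovery parameters (here, a parallelism level) from an
    observed system metric such as available network bandwidth."""
    bandwidth_mbps = system_metrics.get("available_bandwidth_mbps", 100)
    return {"parallel_streams": max(1, int(bandwidth_mbps // 50))}

def generate_recovery_plan(objects, policies, params):
    """Order the failed node's data for recovery: higher-priority policies
    and fewer surviving replicas are recovered first."""
    def urgency(obj):
        policy = policies.get(obj["id"], {})
        return (-policy.get("priority", 0), obj["surviving_replicas"])
    order = sorted(objects, key=urgency)
    return {"recovery_order": [o["id"] for o in order],
            "parallel_streams": params["parallel_streams"]}

def run_recovery(plan, copy_replica):
    """Copy replicas from surviving storage nodes to the new node in the
    order indicated by the plan; `copy_replica` performs one transfer."""
    for obj_id in plan["recovery_order"]:
        copy_replica(obj_id)
```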
Abstract:
A method for assisting evaluation of anomalies in a distributed storage system is disclosed. The method includes a step of monitoring at least one system metric of the distributed storage system. The method further includes steps of maintaining a listing of patterns of the monitored system metric comprising patterns which previously did not result in a failure within one or more nodes of the distributed storage system, and, based on the monitoring, identifying a pattern (i.e., a time series motif) of the monitored system metric as a potential anomaly in the distributed storage system. The method also includes steps of automatically (i.e., without user input) performing a similarity search to determine whether the identified pattern satisfies one or more predefined similarity criteria with at least one pattern of the listing, and, upon positive determination, excepting the identified pattern from being identified as the potential anomaly.
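A sketch of the similarity-search step is given below, assuming z-normalized Euclidean distance (a common choice for comparing time series motifs) as the predefined similarity criterion; the distance threshold is an assumption.

```python
import math

def znorm(series):
    """Z-normalize a metric window so patterns compare by shape, not scale."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series)) or 1.0
    return [(x - mean) / std for x in series]

def distance(a, b):
    """Euclidean distance between two z-normalized windows of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(znorm(a), znorm(b))))

def is_known_benign(candidate, benign_patterns, threshold=1.5):
    """Return True if the identified pattern matches any pattern in the
    listing of motifs that previously did not lead to a failure, in which
    case it is excepted from being reported as a potential anomaly."""
    return any(distance(candidate, p) <= threshold
               for p in benign_patterns if len(p) == len(candidate))
```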