Abstract:
Techniques are described for managing distributed execution of programs. In at least some situations, the techniques include decomposing or otherwise separating the execution of a program into multiple distinct execution jobs that may each be executed on a distinct computing node, such as in a parallel manner with each execution job using a distinct subset of input data for the program. In addition, the techniques may include temporarily terminating and later resuming execution of at least some execution jobs, such as by persistently storing an intermediate state of the partial execution of an execution job, and later retrieving and using the stored intermediate state to resume execution of the execution job from the intermediate state. Furthermore, the techniques may be used in conjunction with a distributed program execution service that executes multiple programs on behalf of multiple customers or other users of the service.
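The separation into execution jobs and the store-then-resume behavior described above might be sketched as follows. This is a minimal illustration, not the described service: the function names (`split_input`, `run_job`), the JSON state file, and the summing workload are all assumptions made for the example, and the jobs run in one process rather than on distinct computing nodes.

```python
import json
import os
import tempfile


def split_input(data, num_jobs):
    """Divide the input into num_jobs distinct, non-overlapping subsets."""
    return [data[i::num_jobs] for i in range(num_jobs)]


def run_job(job_id, subset, state_dir, max_steps=None):
    """Sum a subset, persisting intermediate state after every item.

    Returns (total, finished). If max_steps is reached first, the job is
    temporarily terminated and can be resumed by calling run_job again.
    """
    path = os.path.join(state_dir, f"job-{job_id}.json")
    state = {"next_index": 0, "total": 0}
    if os.path.exists(path):  # resume from the stored intermediate state
        with open(path) as f:
            state = json.load(f)

    steps = 0
    while state["next_index"] < len(subset):
        if max_steps is not None and steps >= max_steps:
            return state["total"], False  # terminated before completion
        state["total"] += subset[state["next_index"]]
        state["next_index"] += 1
        steps += 1
        with open(path, "w") as f:  # persist the intermediate state
            json.dump(state, f)
    return state["total"], True


def demo():
    data = list(range(1, 11))
    subsets = split_input(data, 2)  # odd numbers and even numbers
    with tempfile.TemporaryDirectory() as state_dir:
        partial, finished = run_job(0, subsets[0], state_dir, max_steps=2)
        resumed, done = run_job(0, subsets[0], state_dir)  # picks up at item 3
        other, _ = run_job(1, subsets[1], state_dir)
    return partial, finished, resumed, done, resumed + other
```

In this sketch the second call to `run_job` for job 0 does not reprocess the first two items; it reads the stored state and continues from there.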
Abstract:
Techniques are described for providing clients with access to functionality for creating, configuring and executing defined workflows that manipulate source data in defined manners, such as under the control of a configurable workflow service that is available to multiple remote clients over one or more public networks. A defined workflow for a client may, for example, include multiple interconnected workflow components that are specified by the client and that are each configured to perform one or more types of data manipulation operations on a specified type of input data. The configurable workflow service may further execute the defined workflow at one or more times and in one or more manners, such as in some situations by provisioning multiple computing nodes provided by the configurable workflow service to each implement at least one of the workflow components for the defined workflow.
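A defined workflow of interconnected components might be represented as a dependency graph and executed in dependency order, as in the sketch below. The `run_workflow` function, the component names, and the sample data are hypothetical, and the components run in a single process here rather than on provisioned computing nodes.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(components, edges, source_data):
    """Run each component after the components it depends on.

    components: name -> function taking a list of upstream outputs
    edges: name -> set of upstream component names (empty set = reads source)
    """
    outputs = {}
    for name in TopologicalSorter(edges).static_order():
        upstream = edges.get(name, set())
        # A component with no upstream components reads the source data.
        inputs = [outputs[u] for u in sorted(upstream)] or [source_data]
        outputs[name] = components[name](inputs)
    return outputs


def demo():
    components = {
        "extract": lambda inputs: [x for x in inputs[0] if x is not None],
        "scale": lambda inputs: [x * 10 for x in inputs[0]],
        "total": lambda inputs: sum(inputs[0]),
    }
    edges = {"extract": set(), "scale": {"extract"}, "total": {"scale"}}
    return run_workflow(components, edges, [1, None, 2, 3])
```

Expressing the interconnections as data (the `edges` mapping) keeps the workflow definition separate from how and where each component is executed.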
Abstract:
Methods and apparatus for providing consistent data storage in distributed computing systems. A consistent distributed computing file system (consistent DCFS) may be backed by an object storage service that only guarantees eventual consistency, and may leverage a data storage service (e.g., a database service) to store and maintain a file system/directory structure (a consistent DCFS directory) for the consistent DCFS that may be accessed by compute nodes for file/directory information relevant to the data objects in the consistent DCFS, rather than relying on the information maintained by the object storage service. The compute nodes may reference the consistent DCFS directory to, for example, store and retrieve strongly consistent metadata referencing data objects in the consistent DCFS. The compute nodes may, for example, retrieve metadata from the consistent DCFS directory to determine whether the object storage service is presenting all of the data that it is supposed to have.
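The final check described above, comparing the consistent DCFS directory against what the object storage service currently presents, might look like the following sketch. The `ConsistentDirectory` class is an in-memory stand-in for a database-backed directory, and `missing_objects` and the sample paths are hypothetical names chosen for illustration.

```python
class ConsistentDirectory:
    """In-memory stand-in for a strongly consistent directory store."""

    def __init__(self):
        self._entries = {}

    def record(self, path, size):
        """Store metadata for a data object when it is written."""
        self._entries[path] = {"size": size}

    def list(self, prefix):
        """Return metadata for every recorded object under a prefix."""
        return {p: m for p, m in self._entries.items() if p.startswith(prefix)}


def missing_objects(directory, object_store_listing, prefix):
    """Return paths the directory expects but the object store has not shown."""
    expected = directory.list(prefix)
    return sorted(set(expected) - set(object_store_listing))


def demo():
    directory = ConsistentDirectory()
    directory.record("out/part-0", 128)
    directory.record("out/part-1", 256)
    stale_listing = ["out/part-0"]  # eventual consistency: part-1 not visible
    fresh_listing = ["out/part-0", "out/part-1"]
    return (
        missing_objects(directory, stale_listing, "out/"),
        missing_objects(directory, fresh_listing, "out/"),
    )
```

A compute node that finds a non-empty result could, for example, wait and list again before reading, rather than proceeding with incomplete data.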
Abstract:
Methods and systems for cost-minimizing job scheduling are disclosed. A definition of a task is received. The definition comprises a need-by time. The need-by time comprises a deadline for completion of execution of the task. An estimated duration to complete the execution of the task is determined for each of a plurality of computing resources. One or more of the computing resources are selected based on an estimated cost of completing the execution using the computing resources. The execution of the task is initiated at a scheduled time using the selected one or more computing resources. The scheduled time is earlier than the need-by time by at least the estimated duration.
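The selection and scheduling steps above can be illustrated with a short sketch. It assumes a simple cost model (hourly rate multiplied by estimated duration) and schedules the task at the latest start that still meets the need-by time; the `Resource` fields, the `schedule` function, and the sample rates are assumptions for the example, not details from the disclosure.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Resource:
    name: str
    hourly_cost: float
    estimated_duration: timedelta  # estimated time to complete the task


def estimated_cost(resource):
    """Assumed cost model: hourly rate times estimated hours."""
    return resource.hourly_cost * resource.estimated_duration.total_seconds() / 3600


def schedule(resources, need_by, now):
    """Pick the cheapest resource that can finish by need_by.

    Returns (resource, scheduled_start), or None if no resource can finish
    in time. The start is earlier than need_by by the estimated duration.
    """
    feasible = [r for r in resources if now + r.estimated_duration <= need_by]
    if not feasible:
        return None
    best = min(feasible, key=estimated_cost)
    return best, need_by - best.estimated_duration


def demo():
    now = datetime(2024, 1, 1, 0, 0)
    need_by = datetime(2024, 1, 1, 6, 0)
    resources = [
        Resource("small", 1.0, timedelta(hours=8)),   # too slow for the deadline
        Resource("medium", 2.0, timedelta(hours=4)),  # estimated cost 8.0
        Resource("large", 5.0, timedelta(hours=2)),   # estimated cost 10.0
    ]
    best, start = schedule(resources, need_by, now)
    return best.name, start.isoformat()
```

Starting at exactly `need_by - estimated_duration` leaves no slack; a real scheduler would likely start earlier to absorb estimation error, which the "at least" wording above permits.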
Abstract:
Techniques are described for providing customizable sign-on functionality, such as via an access manager system that provides single sign-on functionality and other functionality to other services for use with those services' users. The access manager system may maintain various sign-on and other account information for various users, and provide single sign-on functionality for those users using that maintained information on behalf of multiple unrelated services with which those users interact. The access manager may allow a variety of types of customizations to single sign-on functionality and/or other functionality available from the access manager, configured on a per-service basis by an operator of the service. Examples include co-branding customizations, customizations of information to be gathered from users, and customizations of authority that may be delegated to other services to act on behalf of users, with the customizations that are available being determined specifically for that service.
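The idea that available customizations are determined per service might be sketched as a validation step applied to an operator's requested configuration. The `configure_sign_on` function, the customization type names, and the example service names below are all hypothetical.

```python
def configure_sign_on(service, requested, allowed_by_service):
    """Accept an operator's customizations only if permitted for that service.

    requested: customization type -> settings supplied by the operator
    allowed_by_service: service -> set of customization types it may use
    """
    allowed = allowed_by_service.get(service, set())
    rejected = sorted(set(requested) - allowed)
    if rejected:
        raise ValueError(f"{service} may not customize: {', '.join(rejected)}")
    return {"service": service, "customizations": dict(requested)}


def demo():
    allowed_by_service = {
        "shop.example": {"co_branding", "extra_user_fields"},
        "bank.example": {"co_branding", "extra_user_fields", "delegated_authority"},
    }
    accepted = configure_sign_on(
        "shop.example", {"co_branding": {"logo": "shop.png"}}, allowed_by_service
    )
    try:
        configure_sign_on(
            "shop.example", {"delegated_authority": ["payments"]}, allowed_by_service
        )
        refused = False
    except ValueError:
        refused = True  # this service was not granted delegation
    return accepted, refused
```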
Abstract:
Methods and systems for optimization of task execution are disclosed. A definition of a task is received. A plurality of parameter values for execution of the task are selected based on an execution history for a plurality of prior tasks performed for a plurality of clients. The plurality of parameter values are selected to optimize one or more execution constraints for the execution of the task. The execution of the task is initiated using one or more computing resources configured with the selected parameter values.
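One simple way to select parameter values from an execution history is to reuse the parameters of the prior run that scored best on the constraint of interest. The sketch below assumes that approach; the history record layout, the `select_parameters` function, and the sample values are hypothetical, and a real system might instead fit a model over the history.

```python
def select_parameters(history, task_type, constraint="cost"):
    """Return the parameters of the prior run that minimized the constraint.

    history: records of prior tasks run for many clients, each a dict with
    "task_type", "params", and measured values such as "cost" or "duration".
    """
    candidates = [run for run in history if run["task_type"] == task_type]
    if not candidates:
        return None  # no comparable prior task to learn from
    best = min(candidates, key=lambda run: run[constraint])
    return best["params"]


def demo():
    history = [
        {"client": "a", "task_type": "sort",
         "params": {"nodes": 2, "memory_gb": 4}, "cost": 6.0, "duration": 120},
        {"client": "b", "task_type": "sort",
         "params": {"nodes": 4, "memory_gb": 8}, "cost": 9.0, "duration": 50},
        {"client": "c", "task_type": "join",
         "params": {"nodes": 8, "memory_gb": 16}, "cost": 3.0, "duration": 30},
    ]
    return (
        select_parameters(history, "sort", "cost"),
        select_parameters(history, "sort", "duration"),
        select_parameters(history, "scan"),
    )
```

The two `sort` results differ because the cheapest prior configuration is not the fastest one, which is why the constraint being optimized is an explicit argument.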
Abstract:
Techniques are described for managing distributed execution of programs. In some situations, the techniques include determining configuration information to be used for executing a particular program in a distributed manner on multiple computing nodes and/or include providing information and associated controls to a user regarding ongoing distributed execution of one or more programs to enable the user to modify the ongoing distributed execution in various manners. Determined configuration information may include, for example, configuration parameters such as a quantity of computing nodes and/or other measures of computing resources to be used for the executing, and may be determined in various manners, including by interactively gathering values for at least some types of configuration information from an associated user (e.g., via a GUI that is displayed to the user) and/or by automatically determining values for at least some types of configuration information (e.g., for use as recommendations to a user).
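Automatically determined values that serve as recommendations, combined with values gathered from a user, might be handled as in the sketch below. The sizing rule (input size divided by per-node throughput over a target run time), the function names, and the configuration keys are assumptions made for illustration.

```python
import math


def recommend_node_count(input_gb, gb_per_node_hour, target_hours, max_nodes=100):
    """Suggest a quantity of computing nodes from an assumed throughput model."""
    needed = math.ceil(input_gb / (gb_per_node_hour * target_hours))
    return max(1, min(needed, max_nodes))  # clamp to the range 1..max_nodes


def resolve_configuration(recommended, user_values):
    """Start from recommended values; user-supplied values take precedence."""
    config = dict(recommended)
    config.update({k: v for k, v in user_values.items() if v is not None})
    return config


def demo():
    recommended = {
        "nodes": recommend_node_count(500, 25, 2),
        "instance_type": "standard",
    }
    # The user leaves the node count blank but chooses an instance type.
    user_values = {"nodes": None, "instance_type": "large"}
    return resolve_configuration(recommended, user_values)
```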