Abstract:
A partitioning policy, comprising an indication of an initial mapping of data records of a stream to a plurality of partitions, is selected to distribute data records of a data stream among a plurality of nodes of a stream management service. Data ingestion nodes and storage nodes are configured according to the initial mapping. In response to a determination that a triggering criterion for dynamically repartitioning the data stream has been met, a modified mapping is generated, and a different set of ingestion and storage nodes are configured. For at least some time during which arriving data records are stored in accordance with the modified mapping, data records stored at the first set of storage nodes in accordance with the initial mapping are retained.
Abstract:
Idempotent processing of data may be implemented for data records retrieved from a data stream. A data stream may receive data records as input and distribute the ingestion, storage, and processing of the data records amongst one or more partitions of the data stream. Partition metadata may be maintained which includes checkpoint metadata for retrieving, processing, and sending data records in the data stream to a specified destination. When assigned a partition for processing, checkpoint metadata for partition may be accessed to determine whether a pending checkpoint for the partition exists. If not pending checkpoint exists, new data records may be retrieved, processed, and sent from the partition of the data stream to a specified destination. If a checkpoint is pending, then the data records identified by the checkpoint metadata as pending may be retrieved, processed, and sent to the specified destination.
Abstract:
Techniques for analyzing stored video upon a request are described. For example, a method of receiving a first application programming interface (API) request to analyze a stored video, the API request to include a location of the stored video and at least one analysis action to perform on the stored video; accessing the location of the stored video to retrieve the stored video; segmenting the accessed video into chunks; processing each chunk with a chunk processor to perform the at least one analysis action, each chunk processor to utilize at least one machine learning model in performing the at least one analysis action; joining the results of the processing of each chunk to generate a final result; storing the final result; and providing the final result to a requestor in response to a second API request is described.
Abstract:
Managed function execution for processing data streams in real time may be. A function that describes one or more operations to be performed with respect to one or more data streams may be received via programmatic interface for a managed stream processing system. Stream processing nodes capable of applying the function may be determined and execution of the one or more operations may be initiated at the stream processing nodes as data records of the data stream are received. Results of the application of the processing function may be provided to one or more destinations specified for the function. Performance metrics may also be collected for the execution of the function and provided to a client that submitted the function.
Abstract:
A stream management system may implement dynamic management of a data stream. Utilization data of different partitions of a data stream may be tracked. When routing a data record received at the stream management system, a partition may be dynamically identified for the data recorded. The data record may then be directed to the identified partition. Other management operations, such as repartitioning the data stream or reassigning resources for processing data records in the data stream may be performed based on the utilization data tracked for the partitions.
Abstract:
A partitioning policy, comprising an indication of an initial mapping of data records of a stream to a plurality of partitions, is selected to distribute data records of a data stream among a plurality of nodes of a stream management service. Data ingestion nodes and storage nodes are configured according to the initial mapping. In response to a determination that a triggering criterion for dynamically repartitioning the data stream has been met, a modified mapping is generated, and a different set of ingestion and storage nodes are configured. For at least some time during which arriving data records are stored in accordance with the modified mapping, data records stored at the first set of storage nodes in accordance with the initial mapping are retained.
Abstract:
Techniques are described for facilitating use of invocable services by applications in a configurable manner. In at least some situations, the invocable services are Web services or other network-accessible services that are made available by providers of the services for use by others in exchange for fees defined by the service providers. The described techniques facilitate use of such invocable services by applications in a manner configured by the application providers and the service providers, including to track use of third-party invocable services by applications on behalf of end users and to allocate fees that are charged end users between the applications and the services as configured by the providers of the applications and services. In some situations, the configured pricing terms for a service specify fees for end users that differ in one or more ways from the defined fees charged by the provider of that service.
Abstract:
A robotic device management service obtains, from a customer, a first set of parameters of a robotic device and a second set of parameters for a simulation environment for testing a robotic device application installable on the robotic device. The set of parameters are used to indicate a storage location of the application and a selection of a simulation environment for testing the application. In response to the request, the robotic device management service selects a set of resources on which to execute the simulation in the simulation environment. The robotic device management service obtains the robotic device application from the storage location and loads the application on to the set of resources to execute the simulation.
Abstract:
A programmatic interface is implemented, enabling a client of a stream management service to select a data ingestion policy for a data stream. A client request selecting an at-least-once ingestion policy is received. In accordance with the at-least-once policy, a client may transmit an indication of a data record one or more times to the service until a positive acknowledgement is received. In response to receiving a plurality of transmissions indicating a particular data record, respective positive acknowledgements are sent to the client. Based on a persistence policy selected for the stream, copies of the data record are stored at one or more storage locations in response to one particular transmission of the plurality of transmissions.
Abstract:
Disclosed are various embodiments for a framework for time-associated data stream storage, processing, and replication. A plurality of streams of time-associated data are received from a plurality of sources via a network using an application-layer protocol. Each of the plurality of streams is divided into a plurality of fragments. An acknowledgement is sent to each of the plurality of sources for each of the plurality of fragments via the network using the application-layer protocol. Processing is performed on each of the plurality of fragments for individual ones of the plurality of streams. An action is implemented relative to a respective fragment based at least in part on a result of the processing.