Abstract:
A partitioning policy, comprising an indication of an initial mapping of data records of a stream to a plurality of partitions, is selected to distribute data records of a data stream among a plurality of nodes of a stream management service. Data ingestion nodes and storage nodes are configured according to the initial mapping. In response to a determination that a triggering criterion for dynamically repartitioning the data stream has been met, a modified mapping is generated, and a different set of ingestion and storage nodes are configured. For at least some time during which arriving data records are stored in accordance with the modified mapping, data records stored at the first set of storage nodes in accordance with the initial mapping are retained.
Abstract:
A configuration request comprising a security option selected for a particular data stream is received. Nodes of a plurality of functional categories, such as a data ingestion category and a data retrieval category are to be configured for the stream. The security option indicates a security profile of a resource to be used for nodes of at least one functional category. In accordance with the configuration request, a node of a first functional category is configured at a resource with a first security profile, and configuration of a node of a second functional category is initiated at a different resource with a different security profile.
Abstract:
A programmatic interface is implemented, enabling a client of a stream management service to select a data ingestion policy for a data stream. A client request selecting an at-least-once ingestion policy is received. In accordance with the at-least-once policy, a client may transmit an indication of a data record one or more times to the service until a positive acknowledgement is received. In response to receiving a plurality of transmissions indicating a particular data record, respective positive acknowledgements are sent to the client. Based on a persistence policy selected for the stream, copies of the data record are stored at one or more storage locations in response to one particular transmission of the plurality of transmissions.
Abstract:
A control node of a multi-tenant stream processing service receives a request indicating an operation to be performed on data records of a particular data stream. Based on a stream partitioning policy, the control node determines an initial number of worker nodes to be used. The control node configures a worker node to perform the operation on received records. In response to a determination that the worker node is in an unhealthy state, the control node configures a replacement worker node.
Abstract:
Managed function execution for processing data streams in real time may be. A function that describes one or more operations to be performed with respect to one or more data streams may be received via programmatic interface for a managed stream processing system. Stream processing nodes capable of applying the function may be determined and execution of the one or more operations may be initiated at the stream processing nodes as data records of the data stream are received. Results of the application of the processing function may be provided to one or more destinations specified for the function. Performance metrics may also be collected for the execution of the function and provided to a client that submitted the function.
Abstract:
A control node of a multi-tenant stream management service receives a request to initialize a data stream to be comprised of a plurality of data records. The control node determines, based on a partitioning policy, parameters to be used to configure subsystems for ingestion, storage and retrieval of the records. The control node identifies resources to be used for a node of retrieval subsystem The retrieval node is configured to implement programmatic record retrieval interfaces, including respective interfaces to implement non-sequential and sequential access patterns. The control node configures the retrieval node using the selected resources.