Abstract:
According to some embodiments, a system, method and non-transitory computer-readable medium are provided comprising receiving a node group at an integration module, wherein the node group includes one or more requests for internal data and external data, wherein internal data is data stored in an internal datastore and external data is data stored outside of the internal datastore; identifying, in configuration data, one or more meta-data nodes from the node group, wherein each meta-data node indicates a request for external data; retrieving the internal data via execution of an internal data query; determining an order of execution for the one or more meta-data nodes; executing a first meta-data node based on the determined order to generate a first result; ingesting the first result into a semantic datastore; and executing a query to generate a final result, wherein the query includes the retrieved internal data and the ingested first result. Numerous other aspects are provided.
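The ordering and ingestion steps can be illustrated with a minimal sketch. The abstract does not say how the order of execution is determined; this example assumes it follows dependencies between meta-data nodes and uses a topological sort. The names `execution_order`, `run_node_group`, and the node names are hypothetical, and a plain dictionary stands in for the semantic datastore.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_order(dependencies):
    """Return node names so that each node comes after the nodes it depends on.

    `dependencies` maps a node name to the set of node names it needs first.
    (Assumption: ordering is dependency-driven; the source does not specify.)
    """
    return list(TopologicalSorter(dependencies).static_order())


def run_node_group(internal_rows, nodes, dependencies):
    """Execute meta-data nodes in order and ingest each result into a store."""
    semantic_store = {}  # stand-in for the semantic datastore
    for name in execution_order(dependencies):
        # Each node is a callable that may read earlier ingested results.
        semantic_store[name] = nodes[name](semantic_store)
    # The final "query" here simply returns internal and ingested data together.
    return {"internal": internal_rows, "external": semantic_store}


# Hypothetical node group: "forecast" needs the result of "weather".
deps = {"forecast": {"weather"}, "weather": set()}
nodes = {
    "weather": lambda store: 20,
    "forecast": lambda store: store["weather"] + 1,
}
order = execution_order(deps)
result = run_node_group([1], nodes, deps)
```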
Abstract:
Some embodiments are directed to systems for authoring predictive models. An embodiment includes a computer system implementing a development environment for generating predictive models. The predictive model authoring tool is configured to perform a modeling operation based on one or more user inputs provided to interface controls of the predictive model authoring tool, determine a modeling context for the modeling operation, log the one or more user inputs, generate a predictive model based on one or more model parameters defined during the modeling operation, link the predictive model to an asset, such that one or more sets of data received from the asset are provided to the predictive model during execution of the predictive model, cause the predictive model to be executed such that the predictive model receives data from the asset, and provide the modeling context, the one or more user inputs, and the one or more model parameters.
Abstract:
A service for storing time series data provides a data pipe for receiving time series data, a query pipe for making requests to the service, and a result pipe for receiving output from the service. Data sent to the data pipe is processed by an ingester that prepares metadata indices associated with blocks of incoming time series data and stores the blocks of data in a time series database and the indices in a separate index database. A query layer receives queries from the query pipe and uses the index database to determine which data blocks are needed to process the query, and then requests only those data blocks from the time series database. Processing of the query is performed within the time series database only on those data nodes that contain relevant data, and partial results are passed to an output layer for formation into a final query result, which is sent out by the result pipe.
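As a rough illustration of how an index database could narrow a query to the relevant blocks, the sketch below assumes each index entry records a series name and an inclusive time range for one block. The `BlockIndex` fields and the `blocks_for_query` function are hypothetical; the abstract does not describe the index layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockIndex:
    """Hypothetical metadata index entry for one stored block."""
    block_id: str
    series: str
    start: int  # first timestamp in the block (inclusive)
    end: int    # last timestamp in the block (inclusive)


def blocks_for_query(index, series, t0, t1):
    """Return ids of blocks whose time range intersects [t0, t1] for `series`."""
    return [
        entry.block_id
        for entry in index
        if entry.series == series and entry.start <= t1 and entry.end >= t0
    ]


index = [
    BlockIndex("b1", "temp", 0, 99),
    BlockIndex("b2", "temp", 100, 199),
    BlockIndex("b3", "pressure", 0, 99),
]
needed = blocks_for_query(index, "temp", 50, 120)
```

Only the returned block ids would then be requested from the time series database.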
Abstract:
According to some embodiments, a system and method are provided to extract relationships from unstructured text documents. The method comprises receiving a training set of sentences that comprise labeled objects and subjects for creating an initial relationship model. A set of unlabeled sentences may be received. Objects and subjects from the set of unlabeled sentences are determined based on the initial model and the determined objects and subjects from the set of unlabeled sentences are displayed to a user for feedback and approval. An indication of whether the determined objects and subjects from the set of unlabeled sentences are correct is received and the initial relationship model is updated based on the received indication.
Abstract:
A system and method for searching for and finding data across industrial time-series data is disclosed. A computer system receives a search query from a client system. The computer system accesses a database including a plurality of stored time-series data sets. For each stored time-series data set, the computer system determines whether the stored time-series data set includes one or more sections that match the received search query. In accordance with a determination that two or more of the stored time-series data sets include at least one section that matches the received search query, the computer system determines whether the matching sections in each stored time-series data set have overlapping time periods. In accordance with a determination that the matching sections in each time-series data set have overlapping time periods, the computer system identifies a particular event that occurred during the overlapping time periods.
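The overlap check can be sketched as follows, assuming each matching section is represented as an inclusive `(start, end)` pair. The function names and the data set names are hypothetical, and the step that matches sections against the search query is not shown.

```python
from itertools import combinations


def overlap(a, b):
    """Return the shared (start, end) window of two sections, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def overlapping_matches(matches):
    """Find overlapping matching sections across different data sets.

    `matches` maps a data set name to its list of matching (start, end) sections.
    Returns (name_a, name_b, window) tuples, with names in sorted order.
    """
    found = []
    for (name_a, secs_a), (name_b, secs_b) in combinations(sorted(matches.items()), 2):
        for a in secs_a:
            for b in secs_b:
                window = overlap(a, b)
                if window is not None:
                    found.append((name_a, name_b, window))
    return found


matches = {"pump": [(10, 20)], "valve": [(15, 30)], "fan": [(40, 50)]}
windows = overlapping_matches(matches)
```

Each returned window marks a time period in which an event common to both data sets could then be looked up.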
Abstract:
Examples relate to systems for authoring and executing predictive models. A computer system includes a model development context analyzer configured to store a set of derived modeling knowledge generated at least in part from a plurality of modeling operations performed using at least a first predictive model authoring tool. The system is configured to receive a modeling context indicating at least a modeling operation being performed, determine, from the modeling context, at least one element of an ontology, the ontology defining at least one attribute of a plurality of modeling operations, query the set of derived modeling knowledge using the at least one element of the ontology to identify at least one record of the set of derived modeling knowledge associated with the at least one element of the ontology, identify at least one suggested model parameter associated with the modeling context, and provide the at least one suggested model parameter.
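One way to picture the lookup is the sketch below. It assumes the derived modeling knowledge is a list of records tagged with an ontology element, and that the suggestion is the most frequently recorded value for a parameter. That selection rule, the record layout, and the name `suggest_parameter` are assumptions for illustration, not details from the abstract.

```python
from collections import Counter


def suggest_parameter(knowledge, ontology_element, parameter):
    """Suggest the most common recorded value of `parameter` for an ontology element.

    Returns None when no matching record carries that parameter.
    """
    values = Counter(
        record[parameter]
        for record in knowledge
        if record["element"] == ontology_element and parameter in record
    )
    return values.most_common(1)[0][0] if values else None


# Hypothetical derived knowledge gathered from earlier modeling operations.
knowledge = [
    {"element": "regression", "learning_rate": 0.1},
    {"element": "regression", "learning_rate": 0.1},
    {"element": "regression", "learning_rate": 0.5},
    {"element": "clustering", "k": 3},
]
suggestion = suggest_parameter(knowledge, "regression", "learning_rate")
```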
Abstract:
A system includes a processor and a non-transitory computer-readable medium. The non-transitory computer-readable medium comprises instructions executable by the processor to cause the system to perform a method. The method comprises receiving a first job to execute and executing the first job. A plurality of data associated with the first job is determined. The plurality of data comprises data associated with (i) a second job executed immediately prior to the first job, (ii) a third job executed immediately after the first job, (iii) a determination of whether the first job failed or executed successfully and (iv) a type of data associated with the first job. The determined plurality of data is stored.
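The four stored items can be sketched as a small record type. The sketch assumes jobs run sequentially, that a raised exception counts as a failure, and that the "next job" field is filled in once the following job runs. `JobRecord` and `JobLog` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobRecord:
    job_id: str
    data_type: str                      # (iv) type of data associated with the job
    succeeded: bool                     # (iii) success or failure
    previous_job: Optional[str] = None  # (i) job executed immediately before
    next_job: Optional[str] = None      # (ii) job executed immediately after


class JobLog:
    """Runs jobs one at a time and stores a record for each."""

    def __init__(self):
        self.records = {}
        self._last = None

    def run(self, job_id, data_type, job):
        try:
            job()
            succeeded = True
        except Exception:
            succeeded = False  # assumption: any exception marks the job as failed
        record = JobRecord(job_id, data_type, succeeded, previous_job=self._last)
        if self._last is not None:
            # Back-fill the previous record now that its successor is known.
            self.records[self._last].next_job = job_id
        self.records[job_id] = record
        self._last = job_id
        return record


log = JobLog()
log.run("a", "csv", lambda: None)
log.run("b", "json", lambda: 1 / 0)  # fails with ZeroDivisionError
log.run("c", "csv", lambda: None)
```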
Abstract:
Methods and systems are provided for optimizing the configuration and parameters of a workflow using an evolutionary approach, augmented with intelligent learning capabilities, on a Big Data infrastructure. In an embodiment, a Big Data infrastructure receives workflow input parameters, an objective function, a pool of initial configuration parameters, and completion criteria from a client computer, and then runs multiple instances of a workflow based on the pool of initial configuration parameters, resulting in corresponding output results. The process includes storing the workflow input parameters and the corresponding output results, modeling the relationship between changes in the workflow input parameters and the corresponding output results, determining that optimal output results have been achieved, and then transmitting the optimal output and the input-output variable relationship results to the client computer.
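A minimal sketch of the evolutionary loop follows, under simplifying assumptions: each configuration is a single float, the objective is minimized, the completion criterion is a fixed number of generations, and new candidates are made by Gaussian mutation of the better half of the pool. It stores input and output pairs but leaves out the learning step that models their relationship. The function `evolve` and all its parameters are hypothetical.

```python
import random


def evolve(workflow, objective, pool, generations=20, seed=0):
    """Evolve a pool of float parameters toward a lower objective value.

    Returns the best parameter in the final pool and the stored
    (parameter, output) history of every workflow run.
    """
    rng = random.Random(seed)
    history = []
    for _ in range(generations):
        # Run one workflow instance per configuration and store the results.
        results = [(params, workflow(params)) for params in pool]
        history.extend(results)
        # Selection: keep the better half (at least one survivor).
        results.sort(key=lambda pair: objective(pair[1]))
        survivors = [params for params, _ in results[: max(1, len(results) // 2)]]
        # Mutation: perturb each survivor to create a new candidate.
        children = [params + rng.gauss(0.0, 0.5) for params in survivors]
        pool = survivors + children
    best = min(pool, key=lambda params: objective(workflow(params)))
    return best, history


def workflow(x):
    return (x - 3.0) ** 2  # toy workflow whose output is lowest at x = 3


def objective(output):
    return output


best, history = evolve(workflow, objective, pool=[0.0, 10.0, -5.0, 6.0])
```

Because survivors are carried over unchanged, the best result never gets worse from one generation to the next.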
Abstract:
A system for storing time series data includes an ingester that prepares metadata indices associated with blocks of incoming time series data and stores the blocks of data in a time series database and the indices in a separate index database. The time series database distributes storage of the data blocks among multiple data nodes. A query layer receives queries and uses the index database to determine which data blocks are needed to process the query, and then requests only those data blocks from the time series database. Processing of the query is performed within the time series database only on those data nodes that contain relevant data, and partial results are passed to an output layer for formation into a final query result.
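The node-local processing and the output layer can be sketched for one simple aggregate, a mean. The sketch assumes each data node holds a mapping from block id to values and that the needed block ids are already known from the index. `partial_sum`, `query_mean`, and the node names are hypothetical.

```python
def partial_sum(blocks, block_ids):
    """Compute a node-local partial result: (sum, count) over the given blocks."""
    values = [value for block_id in block_ids for value in blocks[block_id]]
    return sum(values), len(values)


def query_mean(nodes, needed_blocks):
    """Combine partial results from only those nodes that hold needed blocks.

    `nodes` maps a node name to its {block_id: [values]} storage.
    Returns the mean and the list of nodes that did any work.
    """
    total, count, touched = 0.0, 0, []
    for node_name, blocks in nodes.items():
        local = [block_id for block_id in needed_blocks if block_id in blocks]
        if not local:
            continue  # node holds no relevant data, so it is skipped entirely
        node_sum, node_count = partial_sum(blocks, local)
        total += node_sum
        count += node_count
        touched.append(node_name)
    mean = total / count if count else None
    return mean, touched


nodes = {
    "n1": {"b1": [1.0, 2.0]},
    "n2": {"b2": [3.0]},
    "n3": {"b3": [100.0]},
}
mean, touched = query_mean(nodes, ["b1", "b2"])
```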