Abstract:
Cost-based optimization of configuration parameters and cluster sizing for distributed data processing systems is disclosed. According to an aspect, a method includes receiving at least one job profile of a MapReduce job. The method also includes using the at least one job profile to predict execution of the MapReduce job within a plurality of different predetermined settings of a distributed data processing system. Further, the method includes determining one of the predetermined settings that optimizes performance of the MapReduce job. The method may also include automatically adjusting the distributed data processing system to the determined predetermined setting.
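The selection step can be illustrated with a minimal sketch. It assumes a deliberately simple cost model (map time plus reduce time, each scaled by slot count); the `JobProfile` fields, the `Setting` fields, and the formula in `predict_runtime` are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobProfile:
    """Hypothetical job profile; the field names are illustrative assumptions."""
    input_mb: float           # total input size
    map_mb_per_sec: float     # throughput of one map slot
    reduce_mb_per_sec: float  # throughput of one reduce slot
    shuffle_ratio: float      # fraction of the input passed on to reducers


@dataclass(frozen=True)
class Setting:
    """One predetermined cluster setting (assumed shape)."""
    name: str
    map_slots: int
    reduce_slots: int


def predict_runtime(profile: JobProfile, setting: Setting) -> float:
    """Assumed cost model: map phase followed by reduce phase, no overlap."""
    map_time = profile.input_mb / (profile.map_mb_per_sec * setting.map_slots)
    reduce_input = profile.input_mb * profile.shuffle_ratio
    reduce_time = reduce_input / (profile.reduce_mb_per_sec * setting.reduce_slots)
    return map_time + reduce_time


def best_setting(profile: JobProfile, settings: list[Setting]) -> Setting:
    """Return the setting with the lowest predicted runtime."""
    return min(settings, key=lambda s: predict_runtime(profile, s))
```

A real cost model would presumably account for many more factors (I/O, memory, network); the point here is only the shape of the loop: predict once per candidate setting, then keep the cheapest.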
Abstract:
A facility for representing, in a relational database, the informational content of a series of tag-language messages is described. The facility reads an arbitrary number of the tags contained in the series of messages. For each tag read, the facility determines a path for the tag, the name of a relational table assigned to the path, the values specified for the tag and/or attributes of the tag, and the names of the relational table columns assigned to the tag and/or tag attribute values. After this processing, the facility updates the relational database so that it includes all of the assigned relational tables and relational table columns, and populates the relational database with the values specified for the tags and/or attributes of the tags.
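The path-to-table mapping can be sketched as follows, assuming XML as the tag language and SQLite as the relational database. The naming scheme (`t_` plus the joined path, a `value` column, and `attr_`-prefixed columns) is a hypothetical convention chosen for illustration, and the sketch assumes plain tag and attribute names without namespaces.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_for(path: tuple[str, ...]) -> str:
    """Assumed naming rule: one table per tag path, e.g. ('order', 'item') -> t_order_item."""
    return "t_" + "_".join(path)


def load_message(conn: sqlite3.Connection, xml_text: str) -> None:
    """Walk every tag, create its table and columns on demand, and insert its values."""

    def visit(elem: ET.Element, parent_path: tuple[str, ...]) -> None:
        path = parent_path + (elem.tag,)
        table = table_for(path)

        # One column for the tag's text value, one per attribute.
        row = {"value": (elem.text or "").strip()}
        for key, val in elem.attrib.items():
            row["attr_" + key] = val

        # Make sure the table and all needed columns exist.
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ("value" TEXT)')
        existing = {r[1] for r in conn.execute(f'PRAGMA table_info("{table}")')}
        for col in row:
            if col not in existing:
                conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{col}" TEXT')

        # Populate the row for this tag occurrence.
        cols = ", ".join(f'"{c}"' for c in row)
        marks = ", ".join("?" for _ in row)
        conn.execute(f'INSERT INTO "{table}" ({cols}) VALUES ({marks})', list(row.values()))

        for child in elem:
            visit(child, path)

    visit(ET.fromstring(xml_text), ())
```

For a message such as `<order id="7"><item sku="A1">pen</item></order>`, this sketch creates `t_order` with an `attr_id` column and `t_order_item` with `value` and `attr_sku` columns.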
Abstract:
The current document is directed to an interface and authorization service that allows users of a cloud-director management subsystem of distributed, multi-tenant, virtual data centers to extend the services and functionalities provided by the cloud-director management subsystem. A cloud application programming interface (“API”) entrypoint represents a request/response RESTful interface to services and functionalities provided by the cloud-director management subsystem as well as to service extensions provided by users. The cloud API entrypoint includes a service-extension interface and an authorization-service management interface. The cloud-director management subsystem provides the authorization service to service extensions, allowing each service extension to obtain, from the authorization service, an indication of whether or not a request directed to the service extension through the cloud API entrypoint is authorized.
Abstract:
An integrated network flow and security information management system and method are provided. More particularly, the system and method leverage a process of superimposing and cross-referencing common events and attributes in order to increase the speed of searches, the completeness of searches, and the size of the dataset (flow data). In particular, the process of superimposing may increase the amount of information that can be processed while accelerating the search, thereby providing the user with more responsive acts of pivoting and scoping, leading to a more complete response to network errors and threats.
Abstract:
In a question-answering (QA) environment, a first answer sequence is identified. As identified, the first answer sequence includes a first answer and a second answer. A corpus is analyzed using the first answer and the second answer. Based on the analysis, a set of influence factors corresponding to both the first answer and the second answer is identified. A first answer relationship between the first answer and the second answer is then generated based on the set of influence factors.
Abstract:
A method, device, and non-transitory computer-readable storage medium are provided for efficiently registering a relational schema. In co-compilation and data guide approaches, a subset of entities from schema descriptions are selected for physical registration, and other entities from the schema descriptions are not physically registered. In the co-compilation approach, a first schema description references a second schema description, and the subset includes a set of entities from the second schema description that are used by the first schema description. In the data guide approach, the subset includes entities that are used by a set of structured documents. In a pay-as-you-go approach, schema registration includes logically registering entities without creating relational database structures corresponding to the entities. A database server may execute database commands that reference the logically registered entities. A request to store data for the entities may be executed by creating relational database structures to store the data.
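The pay-as-you-go approach can be sketched as follows, assuming SQLite as the database. The `LazySchemaRegistry` class and its method names are hypothetical; the sketch only illustrates the idea that logical registration records metadata, while the relational structure is created the first time data is stored.

```python
import sqlite3


class LazySchemaRegistry:
    """Hypothetical registry: entities are registered logically, tables are created lazily."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.logical: dict[str, list[str]] = {}  # entity name -> column names

    def register(self, entity: str, columns: list[str]) -> None:
        """Logical registration: record the entity, create no database structure."""
        self.logical[entity] = list(columns)

    def table_exists(self, entity: str) -> bool:
        row = self.conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (entity,),
        ).fetchone()
        return row is not None

    def store(self, entity: str, values: list) -> None:
        """Create the table on the first store request, then insert the row."""
        columns = self.logical[entity]
        if not self.table_exists(entity):
            col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
            self.conn.execute(f'CREATE TABLE "{entity}" ({col_defs})')
        marks = ", ".join("?" for _ in columns)
        self.conn.execute(f'INSERT INTO "{entity}" VALUES ({marks})', values)
```

Entities that are registered but never receive data cost nothing physically under this scheme, which is the trade-off the pay-as-you-go name suggests.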
Abstract:
A method and apparatus for accessing a graph database having nodes and relationships describing an organization. An interface in a computer system receives a request from a client to access information about the organization. Further, the interface in the computer system retrieves the information from the graph database having nodes and relationships describing the organization. Still further, the interface in the computer system sends a portion of the information to the client based on how much of the information is displayable by the client.
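One way to send only the displayable portion is sketched below. It assumes the client reports a maximum number of nodes it can display and that the graph is held as an adjacency mapping; the function name `visible_portion` and the breadth-first policy are illustrative assumptions, not the disclosed method.

```python
from collections import deque


def visible_portion(graph: dict[str, list[str]], start: str, max_nodes: int) -> list[str]:
    """Return up to max_nodes node names, closest to `start` first (breadth-first order)."""
    if max_nodes <= 0:
        return []
    seen = [start]
    queue = deque([start])
    while queue and len(seen) < max_nodes:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.append(neighbor)
                queue.append(neighbor)
                if len(seen) == max_nodes:
                    break  # the client cannot display any more nodes
    return seen
```

Breadth-first order keeps the nodes nearest the requested one, so a small display still receives the most closely related part of the organization.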
Abstract:
Database data is unmasked in order to facilitate its efficient handling by a database engine. In response to a request for data of a masked table including a masked element, an engine identifies a mask interval, and then performs a first join with unmasked elements sharing a common key. The table resulting from this first join is then grouped according to a highest level location of the mask. A second join is then performed between the results of this grouping and the mask interval, to produce a corresponding unmasked table including a plurality of unmasked elements corresponding to the masked element. Unmasking according to embodiments may be particularly useful in leveraging processing power of an in-memory database engine, allowing it to efficiently perform batch processing of requests for masked data received from software of an overlying application layer.
Abstract:
A graph representation is described that may be used for data extraction for a data repository. In one example, the graph representation defines an extraction dataset from an object. A selection from a user for a root node is received. Additional objects are presented for selection by the user based on fields and properties of the selected root node. The root node and selected additional objects are presented as a data graph. The selected objects are joined and presented in the data graph. Finally, a dataset is extracted from the object-oriented database based on the data graph.
Abstract:
A method, system, and computer program product for managing upgrades of database systems using a transparently-patched seed data table. The method commences on a running system by copying (while software applications are running) portions of data comprising a seed data table to database table rows that are temporarily inaccessible by the software applications. The copy operation creates new rows (a seed data table copy) in a database table. The method continues while software applications are running by modifying the seed data table copy (e.g., by applying a patch). For a brief time, the method stops the software applications, then changes the database table rows that were temporarily inaccessible by the software applications to become accessible by the software applications, and restarts the software applications to point to the patched seed data table copy. The patch can add or change a column of the seed data table copy or its schema.
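The copy, patch, and cut-over sequence can be sketched with SQLite. The sketch assumes that row visibility is controlled by an `edition` column and a one-row `active_edition` table read through a view; these names and this visibility mechanism are hypothetical, chosen only to make the sequence concrete.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a toy seed table; applications are assumed to read only seed_visible."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE seed (code TEXT, label TEXT, edition INTEGER);
        CREATE TABLE active_edition (edition INTEGER);
        INSERT INTO active_edition VALUES (1);
        INSERT INTO seed VALUES ('US', 'United States', 1);
        INSERT INTO seed VALUES ('UK', 'Untied Kingdom', 1);
        CREATE VIEW seed_visible AS
            SELECT code, label FROM seed
            WHERE edition = (SELECT edition FROM active_edition);
    """)
    return conn


def stage_copy(conn: sqlite3.Connection, src: int, dst: int) -> None:
    """Copy the live rows into a new edition that applications cannot see yet."""
    conn.execute(
        "INSERT INTO seed SELECT code, label, ? FROM seed WHERE edition = ?",
        (dst, src),
    )


def apply_patch(conn: sqlite3.Connection, edition: int) -> None:
    """Example patch: fix a label in the hidden copy only."""
    conn.execute(
        "UPDATE seed SET label = 'United Kingdom' WHERE code = 'UK' AND edition = ?",
        (edition,),
    )


def cut_over(conn: sqlite3.Connection, edition: int) -> None:
    """The brief-downtime step: make the patched copy the visible one."""
    conn.execute("UPDATE active_edition SET edition = ?", (edition,))


def visible(conn: sqlite3.Connection) -> list[tuple[str, str]]:
    return conn.execute("SELECT code, label FROM seed_visible ORDER BY code").fetchall()
```

Calling `stage_copy(conn, 1, 2)` and `apply_patch(conn, 2)` leaves the visible rows unchanged; only `cut_over(conn, 2)` exposes the patched copy, which is why the downtime in this sketch is limited to a single small update.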