Abstract:
One embodiment provides for a method including performing, by a processing thread, a process that analyzes transactional operations by maintaining the transactional operations in transaction local side logs, and waiting until a successful transaction commit to append the transaction local side logs to a log stream. The processing thread processes the transactional operations on a key used to determine whether existing data is found for the key. The transactional operations are sped up through parallelism based on partitioning tables across nodes handling the transactional operations. A first process is performed by a first processor that processes updates for values of a key based on updating a first start time table index using unique keys and a start time field of a row for a first appearance of each unique key from the transactional operations.
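The side-log behavior described above can be illustrated with a minimal sketch. The class and method names (`LogStream`, `Transaction`, `write`, `commit`, `abort`) are hypothetical and chosen only for illustration; partitioning across nodes and the start time index are not modeled here.

```python
class LogStream:
    """Shared append-only log (hypothetical stand-in for the log stream)."""

    def __init__(self):
        self.entries = []

    def append_all(self, records):
        self.entries.extend(records)


class Transaction:
    """Buffers its operations in a transaction-local side log."""

    def __init__(self, txn_id, stream):
        self.txn_id = txn_id
        self.stream = stream
        self.side_log = []  # not visible in the shared stream until commit

    def write(self, key, value):
        self.side_log.append((self.txn_id, key, value))

    def commit(self):
        # Only a successful commit appends the side log to the log stream.
        self.stream.append_all(self.side_log)
        self.side_log = []

    def abort(self):
        # An aborted transaction leaves no trace in the log stream.
        self.side_log = []


stream = LogStream()
t1 = Transaction(1, stream)
t2 = Transaction(2, stream)
t1.write("k1", "a")
t2.write("k1", "b")
t2.abort()
t1.commit()
```

Because nothing reaches the shared stream before commit, an abort in this sketch needs no undo step: the side log is simply discarded.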
Abstract:
One embodiment provides for a method including processing transactional operations on a key used to determine whether existing data is found for that key. A first time index is updated using unique keys and a start time field of a first appearance of each key from the transactional operations. A deferred update of prior versions of the key is performed for non-recent data upon determining that recent data in the transactional operations is found for the key.
Abstract:
Embodiments of the invention relate to executing graph path queries. A database stores data entities and attributes in node tables and stores links between nodes in an edge table. Edges form a path between a source node and a target node. A source node set is generated and joined with the edge table to produce a first intermediate set. Similarly, a target node set is generated and joined with the edge table to produce a second intermediate set. A result path is generated by joining the first and second intermediate sets and applying a length condition.
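The two-sided join described above can be sketched as follows, assuming an in-memory node table (a dict of attributes) and an edge table (a list of `(src, dst)` rows). The function name `find_paths`, the predicates, and the sample data are hypothetical. Each intermediate set here is one hop long, so the joined paths always contain two edges and the length filter is only illustrative.

```python
# Hypothetical node table (id -> attributes) and edge table ((src, dst) rows).
nodes = {
    1: {"type": "person"},
    2: {"type": "person"},
    3: {"type": "city"},
    4: {"type": "city"},
}
edges = [(1, 2), (2, 3), (1, 4), (4, 3), (2, 4)]


def find_paths(nodes, edges, is_source, is_target, length):
    sources = {n for n, attrs in nodes.items() if is_source(attrs)}
    targets = {n for n, attrs in nodes.items() if is_target(attrs)}
    # First intermediate set: source node set joined with the edge table.
    forward = [(s, d) for (s, d) in edges if s in sources]
    # Second intermediate set: target node set joined with the edge table.
    backward = [(s, d) for (s, d) in edges if d in targets]
    # Join the two sets on the shared middle node.
    joined = [f + b[1:] for f in forward for b in backward if f[-1] == b[0]]
    # Length condition: keep paths with the requested number of edges.
    return [p for p in joined if len(p) - 1 == length]


paths = find_paths(
    nodes,
    edges,
    is_source=lambda attrs: attrs["type"] == "person",
    is_target=lambda attrs: attrs["type"] == "city",
    length=2,
)
```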
Abstract:
Embodiments of the invention relate to sparsity-driven matrix representation. In one embodiment, a sparsity of a matrix is determined and the sparsity is compared to a threshold. Computer memory is allocated to store the matrix in a first data structure format based on the sparsity being greater than the threshold. Computer memory is allocated to store the matrix in a second data structure format based on the sparsity not being greater than the threshold.
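A minimal sketch of the threshold test follows. The abstract does not name the two data structure formats, so this sketch assumes a coordinate dictionary as the first format and a dense list of rows as the second, and assumes sparsity means the fraction of zero entries. The function names and the default threshold of 0.5 are hypothetical.

```python
def sparsity(rows):
    """Fraction of entries that are zero (assumed definition of sparsity)."""
    total = sum(len(row) for row in rows)
    zeros = sum(1 for row in rows for value in row if value == 0)
    return zeros / total if total else 0.0


def choose_representation(rows, threshold=0.5):
    if sparsity(rows) > threshold:
        # First format (assumed): store only the non-zero entries by coordinate.
        coords = {
            (i, j): value
            for i, row in enumerate(rows)
            for j, value in enumerate(row)
            if value != 0
        }
        return "sparse", coords
    # Second format (assumed): dense row-major copy of the matrix.
    return "dense", [list(row) for row in rows]


mostly_zero = choose_representation([[0, 0, 0], [0, 5, 0]])
mostly_full = choose_representation([[1, 2], [3, 0]])
```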
Abstract:
Embodiments relate to subgraph-based distributed graph processing. An aspect includes receiving an input graph comprising a plurality of vertices. Another aspect includes partitioning the input graph into a plurality of subgraphs, each subgraph comprising internal vertices and boundary vertices. Another aspect includes assigning one or more respective subgraphs to each of a plurality of workers. Another aspect includes initiating processing of the plurality of subgraphs by performing a series of processing steps comprising: processing the internal vertices and boundary vertices internally within each of the subgraphs; detecting that a change was made to a boundary vertex of a first subgraph during the internal processing; and sending a message from a first worker to which the first subgraph is assigned to a second worker to which a second subgraph is assigned in response to detecting the change that was made to the boundary vertex of the first subgraph.
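The processing steps above can be sketched with a single-process simulation. This sketch assumes minimum-label propagation (connected components) as the per-subgraph computation, which the abstract does not specify, and it treats every vertex as changed in the first step so that boundary vertices announce their initial labels. The names `min_label_components` and `owner` are hypothetical.

```python
from collections import defaultdict


def min_label_components(edges, owner):
    """edges: undirected (u, v) pairs; owner: vertex -> worker id."""
    label = {v: v for v in owner}
    local_adj = defaultdict(list)   # neighbors inside the same subgraph
    remote_adj = defaultdict(list)  # neighbors in another subgraph (boundary)
    for u, v in edges:
        adj = local_adj if owner[u] == owner[v] else remote_adj
        adj[u].append(v)
        adj[v].append(u)

    workers = sorted(set(owner.values()))
    members = {w: [v for v in owner if owner[v] == w] for w in workers}
    inbox = {w: [] for w in workers}
    messages_sent = 0
    first_step = True

    while True:
        outbox = {w: [] for w in workers}
        for w in workers:
            changed = set(members[w]) if first_step else set()
            # Apply labels received from other workers.
            for v, received in inbox[w]:
                if received < label[v]:
                    label[v] = received
                    changed.add(v)
            # Internal processing: propagate within the subgraph until stable.
            frontier = list(changed)
            while frontier:
                u = frontier.pop()
                for v in local_adj[u]:
                    if label[u] < label[v]:
                        label[v] = label[u]
                        changed.add(v)
                        frontier.append(v)
            # A changed boundary vertex triggers a message to the other worker.
            for u in changed:
                for v in remote_adj[u]:
                    outbox[owner[v]].append((v, label[u]))
                    messages_sent += 1
        first_step = False
        inbox = outbox
        if not any(inbox.values()):
            return label, messages_sent


edges = [(0, 1), (1, 2), (2, 3), (4, 5)]
owner = {0: "A", 1: "A", 2: "B", 3: "B", 4: "B", 5: "A"}
labels, sent = min_label_components(edges, owner)
```

In this sketch, workers exchange messages only for vertices with neighbors in another subgraph; all other propagation stays inside the owning worker.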
Abstract:
A computer-implemented method, according to one embodiment, includes: generating two or more sample graphs by sampling edges of a current snapshot of a dynamic graph, generating two or more partial results by executing an algorithm on the two or more sample graphs, combining the partial results into a final result, and incrementally maintaining the sample graphs. Edges in the current snapshot that were added to the dynamic graph in the most recent update are included in each of the two or more generated sample graphs. Moreover, incrementally maintaining the sample graphs includes: subsampling the edges of each sample graph at a given time by applying a Bernoulli trial to each edge, and combining a result of the subsampling with new edges received in a batch corresponding to the given time to form new sample graphs.
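The incremental maintenance step can be sketched as follows. The function names, the retention probability `keep_prob`, the choice of a vertex count as the algorithm, and the mean as the combining step are all assumptions made for illustration; the abstract does not specify them. The samples start empty here and are built up batch by batch.

```python
import random


def maintain_samples(samples, new_edges, keep_prob, rng):
    """Subsample each sample graph, then add every edge of the new batch."""
    new_edges = set(new_edges)
    updated = []
    for sample in samples:
        # One Bernoulli trial per existing edge decides whether it survives.
        survivors = {edge for edge in sorted(sample) if rng.random() < keep_prob}
        # Edges from the most recent batch are included in every sample.
        updated.append(survivors | new_edges)
    return updated


def estimate(samples, algorithm, combine):
    """Run the algorithm on each sample and combine the partial results."""
    return combine([algorithm(sample) for sample in samples])


def vertex_count(edge_set):
    return len({vertex for edge in edge_set for vertex in edge})


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)  # seeded only to make the sketch repeatable
samples = [set(), set()]
batches = ([(1, 2), (2, 3)], [(3, 4)], [(4, 5), (1, 5)])
for batch in batches:
    samples = maintain_samples(samples, batch, keep_prob=0.5, rng=rng)
result = estimate(samples, vertex_count, mean)
```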
Abstract:
A computer determines social media influencers in a specific topic by receiving a dataset of information associated with a website, the information including a first list of users of the website and a list of content that each user posts on the website, wherein each user is associated with other users from the first list of users. The computer determines initial values representing variables of the dataset of information on the website, wherein the variables include one or more topics for the list of content that each user from the first list of users posts on the website. The computer performs an iteration of Gibbs Sampling utilizing the initial values. The computer then determines that the one or more new values resulting from the iteration, which represent variables of the dataset, represent a distribution of the one or more topics for the list of content that each user from the first list of users posts.