Abstract:
Embodiments include a method, system, and computer program product for performing time alignments. The method includes receiving a specification request for generating a set of target time-series data from a set of source time-series data and obtaining specification information relating to the set of target time-series data and relating to the set of source time-series data. The specification also includes time intervals between data values. The method also includes converting the set of source time-series data to the set of target time-series data, wherein said converting includes calculating a set of cubic-spline interpolation constants.
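The conversion step can be illustrated with a minimal sketch, assuming a natural cubic spline (zero second derivative at both ends) and numeric timestamps. The abstract does not specify boundary conditions or data layout, so the function names `spline_constants` and `align` and their signatures are hypothetical.

```python
from bisect import bisect_right


def spline_constants(times, values):
    """Second derivatives at each knot of a natural cubic spline (assumed boundary)."""
    n = len(times)
    if n < 3:
        raise ValueError("need at least three source points")
    h = [times[i + 1] - times[i] for i in range(n - 1)]  # source time intervals
    # Tridiagonal system for the interior knots 1..n-2.
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n - 1)]
    rhs = [
        6.0 * ((values[i + 1] - values[i]) / h[i] - (values[i] - values[i - 1]) / h[i - 1])
        for i in range(1, n - 1)
    ]
    # Forward elimination (Thomas algorithm).
    for j in range(1, n - 2):
        w = h[j] / diag[j - 1]
        diag[j] -= w * h[j]
        rhs[j] -= w * rhs[j - 1]
    # Back substitution; the end constants stay at zero.
    m = [0.0] * n
    for j in range(n - 3, -1, -1):
        m[j + 1] = (rhs[j] - h[j + 1] * m[j + 2]) / diag[j]
    return m


def align(times, values, target_times):
    """Evaluate the spline through the source series at each target timestamp."""
    m = spline_constants(times, values)
    out = []
    for t in target_times:
        # Locate the source interval containing t, clamped to the valid range.
        i = min(max(bisect_right(times, t) - 1, 0), len(times) - 2)
        h = times[i + 1] - times[i]
        a, b = times[i + 1] - t, t - times[i]
        out.append(
            m[i] * a ** 3 / (6 * h)
            + m[i + 1] * b ** 3 / (6 * h)
            + (values[i] / h - m[i] * h / 6) * a
            + (values[i + 1] / h - m[i + 1] * h / 6) * b
        )
    return out
```

For example, `align([0, 1, 2, 3], hourly_values, [0, 0.5, 1, 1.5, 2, 2.5, 3])` would resample an hourly series at half-hour intervals.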
Abstract:
A computer-implemented method includes partitioning a plurality of records into a plurality of splits. Each split includes at least a portion of the plurality of records. The method further includes providing at least one split of the plurality of splits to a mapper. The mapper scans the input data set, transforms each input record using a map function, and extracts a grouping key in parallel. The method further includes assigning at least a portion of the records of the at least one split to a group, where each assignment is based on a stratum of the assigned record, and filtering the records of the group, where each filtering is based on a comparison of a weight of a record to a local threshold of the mapper. The method further includes shuffling the group to a reducer and providing a stratified sampling of the plurality of records based on the group.
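The mapper and reducer roles can be sketched in a single process as follows. The abstract does not say how weights are assigned or how the local threshold is chosen, so this sketch assumes uniform random weights and a per-stratum threshold equal to the k-th smallest weight the mapper has seen so far. The names `mapper`, `reducer`, and `stratified_sample` are hypothetical.

```python
import heapq
import random
from collections import defaultdict


def mapper(split, stratum_of, k, rng):
    """Keep at most k lowest-weight candidates per stratum from one split."""
    heaps = defaultdict(list)  # max-heaps, implemented with negated weights
    for seq, record in enumerate(split):
        heap = heaps[stratum_of(record)]
        weight = rng.random()  # assumed: uniform random weight per record
        if len(heap) < k:
            heapq.heappush(heap, (-weight, seq, record))
        elif weight < -heap[0][0]:  # local threshold = largest weight kept so far
            heapq.heapreplace(heap, (-weight, seq, record))
    return {key: [(-w, r) for w, _, r in heap] for key, heap in heaps.items()}


def reducer(groups, k):
    """Merge mapper outputs and keep the k lowest-weight records per stratum."""
    merged = defaultdict(list)
    for group in groups:  # stands in for the shuffle step
        for key, candidates in group.items():
            merged[key].extend(candidates)
    return {
        key: [r for _, r in heapq.nsmallest(k, cands, key=lambda c: c[0])]
        for key, cands in merged.items()
    }


def stratified_sample(records, stratum_of, k, num_splits=4, seed=0):
    """Partition records into splits, run one mapper per split, then reduce."""
    rng = random.Random(seed)
    splits = [records[i::num_splits] for i in range(num_splits)]
    return reducer([mapper(s, stratum_of, k, rng) for s in splits], k)
```

Because every mapper discards records whose weight exceeds its local threshold, at most k candidates per stratum leave each mapper, which bounds the data shuffled to the reducer.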
Abstract:
A computer determines social media influencers in a specific topic. The computer receives a dataset of information on a website, the information including a list of users of the website and a list of content that each user posts, wherein each user is associated with one or more other users. The computer identifies a plurality of variables associated with the dataset, wherein the plurality of variables represent the information of the dataset on the website. The computer executes a topic-specific search based on the plurality of variables, the topic-specific search providing at least another list of users representing influencers in a specific topic.
Abstract:
In one general embodiment, a computer-implemented method is provided for analyzing a dynamic graph. The computer-implemented method includes generating two or more sample graphs by sampling edges of a current snapshot of a dynamic graph. Additionally, the computer-implemented method includes generating two or more partial results by executing an algorithm on the sample graphs. Finally, the computer-implemented method includes combining the partial results, from executing the algorithm on the sample graphs, into a final result.
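The sample, execute, and combine pattern can be sketched with one concrete choice of algorithm. The abstract names no particular algorithm or combining rule; this sketch assumes triangle counting, independent edge sampling with probability `p`, and averaging of the scaled partial counts. All function names are hypothetical.

```python
import random
from collections import defaultdict


def sample_edges(edges, p, rng):
    """Keep each edge of the snapshot independently with probability p."""
    return [e for e in edges if rng.random() < p]


def count_triangles(edges):
    """Exact triangle count of an undirected graph given as an edge list."""
    unique = {tuple(sorted(e)) for e in edges if e[0] != e[1]}
    adj = defaultdict(set)
    for u, v in unique:
        adj[u].add(v)
        adj[v].add(u)
    # Each triangle is found once per edge, so divide by three.
    return sum(len(adj[u] & adj[v]) for u, v in unique) // 3


def estimate_triangles(edges, p, num_samples=2, seed=0):
    """Average the scaled partial counts from several sample graphs."""
    rng = random.Random(seed)
    samples = [sample_edges(edges, p, rng) for _ in range(num_samples)]
    # A triangle survives sampling with probability p**3, hence the scaling.
    partials = [count_triangles(s) / p ** 3 for s in samples]
    return sum(partials) / len(partials)
```

With `p=1.0` every sample graph equals the snapshot and the estimate is exact; smaller values of `p` trade accuracy for less work per sample.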
Abstract:
Stratified sampling of a plurality of records is performed. A plurality of records are partitioned into a plurality of splits, wherein each split includes at least a portion of the plurality of records. At least one split of the plurality of splits is provided to a mapper. The mapper assigns at least a portion of the records of the at least one split to a group based on a stratum of the assigned records, and filters the records of the group based on a comparison of the weights of the records to a local threshold of the mapper. The mapper updates the local threshold of the mapper by communicating with a coordinator. The mapper shuffles the group to a reducer, where the reducer filters the records of the group based on the weights of the records. The reducer provides a stratified sampling of the plurality of records based on the group.
Abstract:
A computer-implemented method, according to one embodiment, includes: generating two or more sample graphs by sampling edges of a current snapshot of a dynamic graph, generating two or more partial results by executing an algorithm on the two or more sample graphs, combining the partial results into a final result, and incrementally maintaining the sample graphs. Edges included in the current snapshot of a dynamic graph and which were added to the dynamic graph in a most recent update thereto are included in each of the generated two or more sample graphs. Moreover, incrementally maintaining the sample graphs includes: subsampling each of the edges of each of the sample graphs at a given time by applying a Bernoulli trial, and combining a result of the subsampling with new edges received in a batch corresponding to the given time to form new sample graphs.
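The incremental maintenance step can be sketched as follows, assuming each sample graph is stored as a set of edge tuples and that the same keep probability applies to every edge. The abstract does not specify either detail, and the name `maintain_samples` and its parameters are hypothetical.

```python
import random


def maintain_samples(samples, new_edges, keep_prob, rng):
    """Subsample each existing sample graph, then add the newest batch of edges.

    samples   -- list of sets of edges, one set per sample graph
    new_edges -- edges received in the batch for the current time step
    keep_prob -- success probability of the Bernoulli trial (assumed uniform)
    """
    updated = []
    for sample in samples:
        # One Bernoulli trial per old edge; sorting keeps seeded runs reproducible.
        survivors = {e for e in sorted(sample) if rng.random() < keep_prob}
        # Every new edge from the latest batch goes into every sample graph.
        updated.append(survivors | set(new_edges))
    return updated
```

Older edges face a fresh trial at every update, so their chance of remaining in a sample decays over time while the most recent batch is always fully represented.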
Abstract:
Stratified sampling of a plurality of records is performed. A plurality of records are partitioned into a plurality of splits, wherein each split includes at least a portion of the plurality of records. At least one split of the plurality of splits is provided to a mapper. The mapper assigns at least a portion of the records of the at least one split to a group based on a stratum of the assigned records, and filters the records of the group based on a comparison of the weights of the records to a local threshold of the mapper. The mapper updates the local threshold of the mapper by communicating with a coordinator. The mapper shuffles the group to a reducer, where the reducer filters the records of the group based on the weights of the records. The reducer provides a stratified sampling of the plurality of records based on the group.
Abstract:
In one general embodiment, a computer-implemented method is provided for analyzing a dynamic graph. The computer-implemented method includes generating two or more sample graphs by sampling edges of a current snapshot of a dynamic graph. Additionally, the computer-implemented method includes generating two or more partial results by executing an algorithm on the sample graphs. Finally, the computer-implemented method includes combining the partial results, from executing the algorithm on the sample graphs, into a final result.
Abstract:
A computer determines social media influencers in a specific topic. The computer receives a dataset of information on a website, the information including a list of users of the website and a list of content that each user posts, wherein each user is associated with one or more other users. The computer identifies a plurality of variables associated with the dataset, wherein the plurality of variables represent the information of the dataset on the website. The computer executes a topic-specific search based on the plurality of variables, the topic-specific search providing at least another list of users representing influencers in a specific topic.
Abstract:
A system for performing time conversions includes a processor configured to generate a set of target time-series data from a set of source time-series data and a memory containing specification information relating to the set of target time-series data and also containing information relating to the set of source time-series data. The source time-series specification and the target time-series specification include time intervals between data values. The system also includes a time alignment algorithm used by the processor for converting the set of source time-series data to the set of target time-series data. The converting includes calculating a set of cubic-spline interpolation constants.