Join with predictive granularity modification by example

    Publication No.: US10394815B2

    Publication date: 2019-08-27

    Application No.: US15299388

    Filing date: 2016-10-20

    Abstract: A computing device is provided, comprising a processor configured to select at least one pair of columns. Each pair may include a source column of a first table and a target column of a second table. For each pair, the processor may detect that the columns contain data with different granularities. The processor may modify the data to have the same granularity, and may generate an example including an element from the source column and an element from the target column. For each example, the processor may programmatically generate a script that, when performed on the source column, produces a value consistent with the target column. For a script whose output meets a matching criterion, the processor may convey the output for display, and may, in response to a signal accepting the script, join the tables at least in part by performing the script on the source column.
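The granularity step described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the candidate "scripts" are a fixed, hypothetical list of date coarsenings rather than programmatically synthesized programs, the matching criterion is assumed to be a simple coverage threshold, and the names `CANDIDATES`, `pick_script`, and `join_with_script` are invented for illustration.

```python
from datetime import date

# Hypothetical candidate scripts: each rewrites a daily date at a coarser granularity.
CANDIDATES = {
    "year": lambda d: f"{d.year:04d}",
    "month": lambda d: f"{d.year:04d}-{d.month:02d}",
    "day": lambda d: d.isoformat(),
}


def pick_script(source, target, threshold=0.8):
    """Return (name, coverage) for the best candidate, or None if none reaches the threshold."""
    target_set = set(target)
    best = None
    for name, fn in CANDIDATES.items():
        hits = sum(fn(value) in target_set for value in source)
        coverage = hits / len(source) if source else 0.0
        if best is None or coverage > best[1]:
            best = (name, coverage)
    if best is None or best[1] < threshold:
        return None
    return best


def join_with_script(source_rows, target_rows, source_key, target_key, fn):
    """Join two lists of dict rows after applying fn to the source key."""
    index = {}
    for row in target_rows:
        index.setdefault(row[target_key], []).append(row)
    joined = []
    for row in source_rows:
        for match in index.get(fn(row[source_key]), []):
            joined.append({**row, **match})
    return joined


# Source column holds days, target column holds months (assumed sample data).
orders = [
    {"day": date(2016, 10, 20), "amount": 5},
    {"day": date(2016, 11, 3), "amount": 7},
]
budgets = [
    {"month": "2016-10", "budget": 100},
    {"month": "2016-11", "budget": 200},
]

choice = pick_script([r["day"] for r in orders], [r["month"] for r in budgets])
print(choice)  # ('month', 1.0)
if choice is not None:
    joined = join_with_script(orders, budgets, "day", "month", CANDIDATES[choice[0]])
    print(joined[0]["budget"], joined[1]["budget"])  # 100 200
```

The abstract also mentions displaying the output and waiting for a signal accepting the script before joining; the sketch skips that interactive step and joins as soon as a candidate passes the threshold.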

Join with format modification by example

    Invention application

    Publication No.: US20180113848A1

    Publication date: 2018-04-26

    Application No.: US15299363

    Filing date: 2016-10-20

    IPC classification: G06F17/24 G06F17/21 G06K9/00

    Abstract: A computing device is provided comprising a processor configured to select at least one pair of elements, including an element in a source column of a first table and an element in a target column of a second table. The processor may detect that the elements are in different formats. For at least one element, the processor may apply a predetermined mapping to a common format. The processor may modify at least one element to have the same format as the other, and may generate an example including the modified pair. The processor may programmatically generate a script that, when performed on the selected elements, produces a value consistent with the example. For a script whose output matches the elements of the target column, the processor may convey the output for display, and may join the tables at least in part by performing the script on the source column.
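As a rough illustration of the format-modification idea, the sketch below learns a formatting template from a single example pair and applies it to other source values. Everything here is an assumption made for illustration: the "predetermined mapping to a common format" is taken to be "keep digits only", the sample values are made-up phone numbers, and `to_common`, `learn_template`, and `make_script` are hypothetical helper names, not anything defined by the patent.

```python
import re


def to_common(value):
    """Assumed predetermined mapping: reduce a value to its digits."""
    return re.sub(r"[^0-9]", "", value)


def learn_template(target_example):
    """Replace each digit of the target example with '#' to get a format template."""
    return re.sub(r"[0-9]", "#", target_example)


def make_script(source_example, target_example):
    """Return a function consistent with the example pair, or None if the pair disagrees."""
    if to_common(source_example) != to_common(target_example):
        return None
    template = learn_template(target_example)
    slots = template.count("#")

    def script(value):
        digits = to_common(value)
        if len(digits) != slots:
            return None  # value does not fit the learned template
        it = iter(digits)
        return "".join(next(it) if ch == "#" else ch for ch in template)

    return script


# One example pair: the same number written in the source and the target format.
script = make_script("(425) 555-0100", "425.555.0100")
print(script("(206) 555-0199"))  # 206.555.0199

# Check the script's output against the target column before using it for a join.
source_column = ["(425) 555-0100", "(206) 555-0199"]
target_column = {"425.555.0100", "206.555.0199"}
outputs = [script(v) for v in source_column]
print(all(o in target_column for o in outputs))  # True
```

A single example is enough here only because the template is purely positional; a real by-example synthesizer would search a much larger space of string programs.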

    Syntactic profiling of alphanumeric strings

    Publication No.: US11210327B2

    Publication date: 2021-12-28

    Application No.: US16448805

    Filing date: 2019-06-21

    Abstract: A computing device includes a storage machine holding instructions executable by a logic machine to generate multi-string clusters, each containing alphanumeric strings of a dataset. Further multi-string clusters are generated via iterative performance of a combination operation in which a hierarchically-superior cluster is generated from a set of multi-string clusters. The combination operation includes, for candidate pairs of multi-string clusters, generating syntactic profiles describing an alphanumeric string from each multi-string cluster of the candidate pair. For each of the candidate pairs, a cost factor is determined for at least one of its syntactic profiles. Based on the cost factors determined for the syntactic profiles, one of the candidate pairs is selected. The multi-string clusters from the selected candidate pair are combined to generate the hierarchically-superior cluster including all of the alphanumeric strings from the selected candidate pair of multi-string clusters.
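The iterative combination operation reads like agglomerative clustering over string patterns, and a toy version may make the loop easier to picture. The sketch below is an assumption-heavy simplification: a "syntactic profile" is taken to be a tuple of coarse tokens (`D` for digit runs, `U` for uppercase runs, `L` for lowercase runs, literal characters otherwise), the cost factor is a hypothetical count of positions that must be generalized to `*`, and the function names are invented.

```python
import re
from itertools import combinations

TOKEN = re.compile(r"(?P<D>[0-9]+)|(?P<U>[A-Z]+)|(?P<L>[a-z]+)|(?P<X>.)", re.S)


def profile(s):
    """Collapse a string into coarse tokens; other characters are kept literally."""
    return tuple(m.group() if m.lastgroup == "X" else m.lastgroup
                 for m in TOKEN.finditer(s))


def merge_cost(p, q):
    """Assumed cost factor: number of positions a shared profile must generalize."""
    if len(p) != len(q):
        return max(len(p), len(q)) + 1  # penalty for incompatible shapes
    return sum(a != b for a, b in zip(p, q))


def generalize(p, q):
    """Profile covering both inputs; '*' marks a generalized position."""
    if len(p) != len(q):
        return ("*",)
    return tuple(a if a == b else "*" for a, b in zip(p, q))


def cluster(strings, k):
    """Group strings by profile, then merge the cheapest pair until k clusters remain."""
    groups = {}
    for s in strings:
        groups.setdefault(profile(s), []).append(s)
    clusters = list(groups.items())
    while len(clusters) > max(k, 1):
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: merge_cost(clusters[ij[0]][0], clusters[ij[1]][0]))
        (p, a), (q, b) = clusters[i], clusters[j]
        merged = (generalize(p, q), a + b)  # hierarchically superior cluster
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return clusters


data = ["AB-123", "CD-456", "ab-789", "2016-10-20", "2019-08-27"]
for prof, members in cluster(data, 2):
    print(prof, members)
# ('D', '-', 'D', '-', 'D') ['2016-10-20', '2019-08-27']
# ('*', '-', 'D') ['AB-123', 'CD-456', 'ab-789']
```

With `k=2`, the uppercase and lowercase code clusters merge first because they differ in only one token, while the date cluster stays separate.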

    Join with predictive merging of multiple columns

    Publication No.: US10585888B2

    Publication date: 2020-03-10

    Application No.: US15299404

    Filing date: 2016-10-20

    Abstract: A computing device is provided, comprising a processor configured to select at least one pair of tuples of columns including a source tuple from a first table and a target tuple from a second table. For each pair, the processor may select one or more rows from the source tuple and elements of the target tuple. For each selected row, the processor may programmatically generate a script that, when performed on the source tuple, produces a value consistent with the target tuple. The processor may apply each script to other rows of the source tuple and determine that an output is in the target tuple. For each column of the target tuple, for a script whose output meets a matching criterion, the processor may convey the output and, in response to a signal accepting the script, join the tables at least in part by performing each accepted script.
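To make the multi-column case concrete, here is a small sketch in which a "script" is just a column order plus a separator, found by brute force from one example row and then checked against the remaining rows. The search space (`SEPARATORS`, permutations of all source columns), the coverage measure, and the names `synthesize`, `run`, and `coverage` are assumptions for illustration only.

```python
from itertools import permutations

# Hypothetical search space of separators between merged columns.
SEPARATORS = [" ", ", ", "-", "_", ""]


def synthesize(example_source, example_target):
    """Return every (order, separator) script that rebuilds the target from the source tuple."""
    found = []
    for order in permutations(range(len(example_source))):
        for sep in SEPARATORS:
            if sep.join(example_source[i] for i in order) == example_target:
                found.append((order, sep))
    return found


def run(script, row):
    """Apply a script to one row of the source tuple."""
    order, sep = script
    return sep.join(row[i] for i in order)


def coverage(script, source_rows, target_values):
    """Fraction of source rows whose script output appears in the target column."""
    target_set = set(target_values)
    if not source_rows:
        return 0.0
    return sum(run(script, r) in target_set for r in source_rows) / len(source_rows)


# Source tuple: (first name, last name); target column: "Last, First" (assumed data).
source_rows = [("Ada", "Lovelace"), ("Alan", "Turing"), ("Grace", "Hopper")]
target_column = ["Lovelace, Ada", "Turing, Alan", "Hopper, Grace"]

scripts = synthesize(source_rows[0], target_column[0])
print(scripts)  # [((1, 0), ', ')]
print(coverage(scripts[0], source_rows, target_column))  # 1.0
```

In the abstract's flow, a script that passes the matching criterion would still be shown to the user and applied to the join only after it is accepted.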

    Syntactic profiling of alphanumeric strings

    Publication No.: US10394874B2

    Publication date: 2019-08-27

    Application No.: US15663575

    Filing date: 2017-07-28

    IPC classification: G06F16/35 G06F17/27 G06F17/22

    Abstract: A computing device includes a storage machine holding instructions executable by a logic machine to generate multi-string clusters, each containing alphanumeric strings of a dataset. Further multi-string clusters are generated via iterative performance of a combination operation in which a hierarchically-superior cluster is generated from a set of multi-string clusters. The combination operation includes, for candidate pairs of multi-string clusters, generating syntactic profiles describing an alphanumeric string from each multi-string cluster of the candidate pair. For each of the candidate pairs, a cost factor is determined for at least one of its syntactic profiles. Based on the cost factors determined for the syntactic profiles, one of the candidate pairs is selected. The multi-string clusters from the selected candidate pair are combined to generate the hierarchically-superior cluster including all of the alphanumeric strings from the selected candidate pair of multi-string clusters.

    Join with format modification by example

    Publication No.: US10546055B2

    Publication date: 2020-01-28

    Application No.: US15299363

    Filing date: 2016-10-20

    IPC classification: G06F17/24 G06F17/21 G06K9/00

    Abstract: A computing device is provided comprising a processor configured to select at least one pair of elements, including an element in a source column of a first table and an element in a target column of a second table. The processor may detect that the elements are in different formats. For at least one element, the processor may apply a predetermined mapping to a common format. The processor may modify at least one element to have the same format as the other, and may generate an example including the modified pair. The processor may programmatically generate a script that, when performed on the selected elements, produces a value consistent with the example. For a script whose output matches the elements of the target column, the processor may convey the output for display, and may join the tables at least in part by performing the script on the source column.