Query Relaxation Using External Domain Knowledge for Query Answering

    Publication No.: US20210042304A1

    Publication date: 2021-02-11

    Application No.: US16537496

    Filing date: 2019-08-09

    Abstract: A system, method, and computer-readable medium for query relaxation are provided. A query output from a conversational system is received. At least one search term in the query is identified. Instance data is output to the conversational system in response to determining that the instance data in a data store matches the at least one search term in the query. A computing device outputs the received query to an external domain-specific knowledge source in response to determining that the at least one search term does not match the instance data in the data store. The computing device receives relaxed data matches from the external domain-specific knowledge source, the relaxed data matches being semantically related to the at least one search term in the query based on a plurality of criteria associated with the query. The computing device generates a response to the query based on contextual information and structural information.
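The match-or-relax flow described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: `DATA_STORE`, `EXTERNAL_KNOWLEDGE`, and `answer_query` are hypothetical names, the external knowledge source is replaced by an in-memory dictionary, and filtering relaxed terms against the data store is an assumption.

```python
from typing import Dict, List, Set

# Hypothetical instance data held in the local data store.
DATA_STORE: Set[str] = {"aspirin", "ibuprofen", "naproxen"}

# Hypothetical stand-in for the external domain-specific knowledge source:
# each term maps to semantically related terms.
EXTERNAL_KNOWLEDGE: Dict[str, List[str]] = {
    "painkiller": ["aspirin", "ibuprofen", "morphine"],
    "nsaid": ["ibuprofen", "naproxen"],
}


def answer_query(search_terms: List[str]) -> Dict[str, List[str]]:
    """Return direct matches, or relaxed matches when a term is not in the store."""
    results: Dict[str, List[str]] = {}
    for term in search_terms:
        key = term.lower()
        if key in DATA_STORE:
            # Direct hit: the instance data matches the search term.
            results[term] = [key]
        else:
            # Miss: relax the term via the external source, then keep only
            # related terms that exist in the data store (an assumption).
            related = EXTERNAL_KNOWLEDGE.get(key, [])
            results[term] = [r for r in related if r in DATA_STORE]
    return results
```

With this toy data, `answer_query(["painkiller"])` yields the relaxed matches `aspirin` and `ibuprofen`, while a term unknown to both sources yields an empty list.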

    SPLIT ELIMINATION IN MAPREDUCE SYSTEMS
    Patent application

    Publication No.: US20180196828A1

    Publication date: 2018-07-12

    Application No.: US15912410

    Filing date: 2018-03-05

    IPC classification: G06F17/30

    CPC classification: G06F16/182

    Abstract: Embodiments of the present invention relate to elimination of blocks such as splits in distributed processing systems such as MapReduce systems using the Hadoop Distributed File System (HDFS). In one embodiment, a method of and computer program product for optimizing queries in distributed processing systems are provided. A query is received. The query includes at least one predicate. The query refers to data. The data includes a plurality of records. Each record comprises a plurality of values in a plurality of attributes. Each record is located in at least one of a plurality of blocks of a distributed file system. Each block has a unique identifier. For each block of the distributed file system, at least one value cluster is determined for an attribute of the plurality of attributes. Each value cluster has a range. The predicate of the query is compared with the at least one value cluster of each block. The query is executed against only those blocks where the predicate is met by at least one value cluster.
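The block-pruning step can be illustrated with a small sketch. It assumes a single integer attribute, a range predicate `[low, high]`, and value clusters stored as inclusive ranges; the names `Block`, `range_overlaps`, and `blocks_to_scan` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (low, high) bounds of one value cluster


@dataclass
class Block:
    block_id: str
    # Value clusters for a single attribute, e.g. "price".
    clusters: List[Range]


def range_overlaps(cluster: Range, low: int, high: int) -> bool:
    """True if the cluster range intersects the predicate range [low, high]."""
    return cluster[0] <= high and low <= cluster[1]


def blocks_to_scan(blocks: List[Block], low: int, high: int) -> List[str]:
    """Keep only blocks where at least one value cluster can satisfy the predicate."""
    return [
        b.block_id
        for b in blocks
        if any(range_overlaps(c, low, high) for c in b.clusters)
    ]
```

Keeping several narrow clusters per block, rather than one minimum/maximum pair, lets a block whose values sit in `(0, 10)` and `(90, 100)` be skipped for a predicate on `[25, 45]`, even though that range lies between the block's overall minimum and maximum.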

    DYNAMIC QUERY OPTIMIZATION WITH PILOT RUNS
    Patent application (legal status: in force)

    Publication No.: US20150363466A1

    Publication date: 2015-12-17

    Application No.: US14301627

    Filing date: 2014-06-11

    IPC classification: G06F17/30

    Abstract: In one embodiment, a computer-implemented method includes selecting one or more sub-expressions of a query during compile time. One or more pilot runs are performed by one or more computer processors. The one or more pilot runs include a pilot run associated with each of one or more of the selected sub-expressions, and each pilot run includes at least partial execution of the associated selected sub-expression. The pilot runs are performed during execution time. Statistics are collected on the one or more pilot runs during performance of the one or more pilot runs. The query is optimized based at least in part on the statistics collected during the one or more pilot runs, where the optimization includes basing cardinality and cost estimates on the statistics collected during the pilot runs.
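One way to picture a pilot run is as a partial execution of a filter sub-expression over the first rows of its input, with the observed selectivity then scaled up to a cardinality estimate. The sketch below makes that assumption; `pilot_run`, `estimate_cardinality`, and `pick_build_side` are hypothetical names, and choosing a hash-join build side is only an example of how an optimizer might use the estimates.

```python
from itertools import islice
from typing import Callable, Dict, Iterable

Row = Dict[str, int]


def pilot_run(rows: Iterable[Row], predicate: Callable[[Row], bool], sample_size: int) -> float:
    """Partially execute a filter sub-expression and return its observed selectivity."""
    sample = list(islice(rows, sample_size))
    if not sample:
        return 0.0
    passed = sum(1 for row in sample if predicate(row))
    return passed / len(sample)


def estimate_cardinality(table_size: int, selectivity: float) -> int:
    """Scale the observed selectivity up to the full input size."""
    return round(table_size * selectivity)


def pick_build_side(estimates: Dict[str, int]) -> str:
    """Hypothetical optimizer decision: build the hash table on the smaller input."""
    return min(estimates, key=estimates.get)
```

For a 1000-row input where every fourth row passes the filter, a pilot run over the first 100 rows observes a selectivity of 0.25, giving an estimated cardinality of 250 rows.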

    HYBRID GRAPH NEURAL NETWORK
    Patent application

    Publication No.: US20220237447A1

    Publication date: 2022-07-28

    Application No.: US17161152

    Filing date: 2021-01-28

    IPC classification: G06N3/08 G06F16/28 G06N3/04

    Abstract: An embodiment includes extracting, responsive to an update request from a remote requesting system, technical descriptor data from a data source. The embodiment also includes forming a new graph data structure using the technical descriptor data extracted from the data source. The embodiment also includes augmenting the new graph data structure to include a concept based on a value from instance data from the data source. The embodiment also includes identifying a first pair of concepts that are connected in a pre-existing ontology and that correspond with a second pair of concepts that lack a connection therebetween in the new graph data structure. The embodiment also includes augmenting the new graph data structure to include a connection between the second pair of concepts. The embodiment also includes outputting the new graph data structure as part of a response to the update request from the requesting system.
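The connection-augmentation step can be sketched as follows, under the assumption that the correspondence between graph concepts and ontology concepts is already available as a lookup table. The names `augment_graph`, `normalize`, and `mapping` are hypothetical, and edges are treated as undirected for simplicity.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]


def normalize(edge: Edge) -> Edge:
    """Treat edges as undirected by storing endpoints in sorted order."""
    a, b = edge
    return (a, b) if a <= b else (b, a)


def augment_graph(
    graph_nodes: Set[str],
    graph_edges: Set[Edge],
    ontology_edges: Set[Edge],
    mapping: Dict[str, str],
) -> Set[Edge]:
    """Add edges between graph concepts whose ontology counterparts are connected.

    `mapping` (hypothetical) maps a concept in the new graph to its
    corresponding concept in the pre-existing ontology.
    """
    known = {normalize(e) for e in graph_edges}
    onto = {normalize(e) for e in ontology_edges}
    added: Set[Edge] = set()
    nodes = sorted(graph_nodes)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            if (a, b) in known:
                continue  # already connected in the new graph
            if a in mapping and b in mapping:
                if normalize((mapping[a], mapping[b])) in onto:
                    added.add((a, b))
    return known | added
```

If the ontology links `Medication` to `Dosage`, and the new graph contains unconnected concepts `drug` and `dose` that map to them, the function adds a `dose`/`drug` edge and leaves existing edges unchanged.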

    HYBRID GRAPH NEURAL NETWORK
    Patent application

    Publication No.: US20220237446A1

    Publication date: 2022-07-28

    Application No.: US17161093

    Filing date: 2021-01-28

    IPC classification: G06N3/08 G06F16/28 G06N3/04

    Abstract: An embodiment includes generating a first concept representation based on a first portion of a knowledge base using a first processing path of a neural network that includes a hyperbolic graph convolution layer. The embodiment also includes generating a second concept representation based on a second portion of the knowledge base using a second processing path of the neural network, the second processing path comprising a heterogeneous graph convolution layer. The embodiment also includes generating a unified concept representation including concatenating the first concept representation with the second concept representation. The embodiment also includes generating a prediction score using a predictive matching module, where the prediction score is indicative of an extent of a match between the unified concept representation and a concept representation from a second knowledge base.
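The final two steps, concatenation and scoring, can be sketched in isolation. The sketch does not implement either graph convolution layer: the two path outputs are assumed to be given as plain vectors, and cosine similarity is used only as a stand-in for the predictive matching module, whose actual form the abstract does not specify. `unify` and `match_score` are hypothetical names.

```python
import math
from typing import List

Vector = List[float]


def unify(first: Vector, second: Vector) -> Vector:
    """Concatenate the outputs of the two processing paths."""
    return first + second


def match_score(a: Vector, b: Vector) -> float:
    """Cosine similarity, used here as a stand-in for the predictive matching module."""
    if len(a) != len(b):
        raise ValueError("representations must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

A unified representation compared with itself scores close to 1, and orthogonal representations score 0, so higher values indicate a closer match under this stand-in.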

    Assignment of data within file systems

    Publication No.: US10127237B2

    Publication date: 2018-11-13

    Application No.: US14974477

    Filing date: 2015-12-18

    IPC classification: G06F12/00 G06F17/30

    Abstract: The embodiments relate to assigning data to processors of a file system. Metadata is associated with respective blocks of data, and an initial batch of the blocks is assigned to nodes of a file system based on the metadata. Unassigned blocks are selectively assigned to one or more of the nodes. The selective assignment includes constructing a linear regression model based on node data, and determining a value for each node based on the linear regression model. Each value is associated with a predicted load corresponding to a new assignment of one or more unassigned blocks.
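The selective assignment can be illustrated with a simple least-squares fit. The sketch assumes that the node data consists of past (blocks assigned, observed load) pairs and fits one line per node, which is a simplification chosen for illustration; `fit_line`, `pick_node`, `history`, and `assigned` are hypothetical names.

```python
from typing import Dict, List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return 0.0, mean_y  # all x equal: fall back to the mean load
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def pick_node(
    history: Dict[str, List[Tuple[float, float]]],
    assigned: Dict[str, float],
    new_blocks: float,
) -> str:
    """Choose the node with the lowest predicted load after taking `new_blocks` more blocks."""
    predicted: Dict[str, float] = {}
    for node, samples in history.items():
        slope, intercept = fit_line([s[0] for s in samples], [s[1] for s in samples])
        predicted[node] = slope * (assigned[node] + new_blocks) + intercept
    return min(predicted, key=predicted.get)
```

A node whose load grows slowly per block can be preferred even if it already holds more blocks than its peers, because the decision uses the predicted load after the new assignment rather than the current block count.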