-
Publication No.: US20210042304A1
Publication date: 2021-02-11
Application No.: US16537496
Filing date: 2019-08-09
Inventors: Chuan Lei, Fatma Ozcan, Dorian Boris Miller, Jeffrey Kreulen, Rebecca Geis
IPC classification: G06F16/2458, G06F16/2452, G06F16/22, G06N5/02, G06F16/23
Abstract: A system, method, and computer readable medium perform a method of query relaxation. A query output from a conversational system is received. At least one search term in the query is identified. Instance data is output to the conversational system in response to determining that the instance data in a data store matches the at least one search term in the query. The computer device outputs the received query to an external domain-specific knowledge source in response to determining that the at least one search term does not match the instance data in the data store. The computer device receives relaxed data matches from the external domain-specific knowledge source, the relaxed data matches being semantically related to the at least one search term in the query based on a plurality of criteria associated with the query. The computer device generates a response to the query based on contextual information and structural information.
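The fallback described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the function `answer_query`, the dictionary `instance_data`, and the lookup table `related_terms` (standing in for the external domain-specific knowledge source) are all hypothetical names chosen for illustration.

```python
def answer_query(search_terms, instance_data, related_terms):
    """Return (matches, relaxed).

    matches: dict of term -> instance data found in the data store.
    relaxed: True when no search term matched directly and the hypothetical
             knowledge source had to supply semantically related terms.
    """
    exact = {t: instance_data[t] for t in search_terms if t in instance_data}
    if exact:
        return exact, False

    # No direct match: ask the stand-in knowledge source for related terms
    # and keep only those that exist in the data store.
    relaxed = {}
    for term in search_terms:
        for candidate in related_terms.get(term, []):
            if candidate in instance_data:
                relaxed[candidate] = instance_data[candidate]
    return relaxed, True


# Toy data, invented for this example.
instance_data = {"aspirin": "NSAID, 500 mg", "ibuprofen": "NSAID, 200 mg"}
related_terms = {"painkiller": ["aspirin", "ibuprofen", "morphine"]}

direct = answer_query(["aspirin"], instance_data, related_terms)
fallback = answer_query(["painkiller"], instance_data, related_terms)
```

Here the direct lookup returns the stored record unchanged, while the query for "painkiller" is relaxed to the related terms that the data store actually contains.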
-
Publication No.: US20180225215A1
Publication date: 2018-08-09
Application No.: US15425808
Filing date: 2017-02-06
IPC classification: G06F12/0846
CPC classification: G06F12/0848, G06F2212/1041, G06F2212/282
Abstract: A computer-implemented method according to one embodiment includes receiving a request for data, locating the data at one or more partitions of a heterogeneously partitioned table, determining an access method associated with each of the one or more partitions, and requesting the data from the one or more partitions, utilizing the access method associated with each of the one or more partitions.
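As a rough illustration of per-partition access methods, the following Python sketch dispatches each request to a reader chosen by the partition's storage kind. The partition layout, the `"memory"` and `"csv"` kinds, and every function name are assumptions made for this example; the abstract does not specify them.

```python
import csv
import io


def read_from_dict(partition, keys):
    # Access method for a partition held as an in-memory dict.
    return [(k, partition["data"][k]) for k in sorted(keys)]


def read_from_csv(partition, keys):
    # Access method for a partition stored as CSV text ("key,value" per line).
    reader = csv.reader(io.StringIO(partition["data"]))
    return [(int(k), v) for k, v in reader if int(k) in keys]


ACCESS_METHODS = {"memory": read_from_dict, "csv": read_from_csv}


def fetch(partitions, wanted_keys):
    """Locate the wanted keys and read each partition with its own method."""
    rows = []
    for partition in partitions:
        keys_here = wanted_keys & partition["keys"]
        if not keys_here:
            continue  # nothing requested lives in this partition
        reader = ACCESS_METHODS[partition["kind"]]
        rows.extend(reader(partition, keys_here))
    return rows


hot = {"kind": "memory", "keys": {1, 2}, "data": {1: "a", 2: "b"}}
cold = {"kind": "csv", "keys": {3, 4}, "data": "3,c\n4,d\n"}
result = fetch([hot, cold], {2, 3})
```

The caller sees one logical table; only `fetch` knows that the two partitions are stored and read differently.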
-
Publication No.: US20180196828A1
Publication date: 2018-07-12
Application No.: US15912410
Filing date: 2018-03-05
Inventors: Mohamed Eltabakh, Peter J. Haas, Fatma Ozcan, Mir Hamid Pirahesh, John (Yannis) Sismanis, Jan Vondrak
IPC classification: G06F17/30
CPC classification: G06F16/182
Abstract: Embodiments of the present invention relate to elimination of blocks such as splits in distributed processing systems such as MapReduce systems using the Hadoop Distributed File System (HDFS). In one embodiment, a method of and computer program product for optimizing queries in distributed processing systems are provided. A query is received. The query includes at least one predicate. The query refers to data. The data includes a plurality of records. Each record comprises a plurality of values in a plurality of attributes. Each record is located in at least one of a plurality of blocks of a distributed file system. Each block has a unique identifier. For each block of the distributed file system, at least one value cluster is determined for an attribute of the plurality of attributes. Each value cluster has a range. The predicate of the query is compared with the at least one value cluster of each block. The query is executed against only those blocks where the predicate is met by at least one value cluster.
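The block-elimination step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented implementation: the `Block` class, the inclusive `(low, high)` representation of a value cluster, and the restriction to range predicates are all choices made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Block:
    block_id: str
    # attribute name -> value clusters, each an inclusive (low, high) range
    clusters: Dict[str, List[Tuple[float, float]]]


def blocks_to_scan(blocks, attribute, low, high):
    """Return IDs of blocks that may hold records with low <= attribute <= high."""
    selected = []
    for block in blocks:
        ranges = block.clusters.get(attribute)
        if ranges is None:
            # No clusters recorded for this attribute: the block cannot be ruled out.
            selected.append(block.block_id)
        elif any(c_low <= high and low <= c_high for c_low, c_high in ranges):
            # At least one cluster overlaps the predicate range.
            selected.append(block.block_id)
    return selected


blocks = [
    Block("b1", {"price": [(1, 10), (90, 100)]}),
    Block("b2", {"price": [(20, 30)]}),
    Block("b3", {"price": [(40, 60)]}),
]
to_scan = blocks_to_scan(blocks, "price", 25, 45)
```

Keeping several narrow clusters per block, rather than a single min/max pair, lets a predicate such as `price BETWEEN 11 AND 19` skip block `b1` even though its overall range is 1 to 100.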
-
Publication No.: US09774682B2
Publication date: 2017-09-26
Application No.: US14591951
Filing date: 2015-01-08
IPC classification: H04L29/08, H04L12/879, H04W12/06
CPC classification: H04L67/1097, H04L49/901, H04L63/08, H04L67/10, H04L67/141, H04W12/06
Abstract: Embodiments relate to parallel data streaming between a first computer system and a second computer system. Aspects include transmitting a request to establish an authenticated connection between a processing job on the first computer system and a process on the second computer system and transmitting a query to the process on the second computer system over the authenticated connection. Aspects further include creating one or more tasks on the first computer system configured to receive data from the second computer system in parallel and reading data received by the one or more tasks by the processing job on the first computer system.
-
Publication No.: US20150363466A1
Publication date: 2015-12-17
Application No.: US14301627
Filing date: 2014-06-11
Inventors: Andrey Balmin, Vuk Ercegovac, Jesse E. Jackson, Konstantinos Karanasos, Marcel Kutsch, Fatma Ozcan, Chunyang Xia
IPC classification: G06F17/30
CPC classification: G06F17/30466, G06F17/30451, G06F17/30469
Abstract: In one embodiment, a computer-implemented method includes selecting one or more sub-expressions of a query during compile time. One or more pilot runs are performed by one or more computer processors. The one or more pilot runs include a pilot run associated with each of one or more of the selected sub-expressions, and each pilot run includes at least partial execution of the associated selected sub-expression. The pilot runs are performed during execution time. Statistics are collected on the one or more pilot runs during performance of the one or more pilot runs. The query is optimized based at least in part on the statistics collected during the one or more pilot runs, where the optimization includes basing cardinality and cost estimates on the statistics collected during the pilot runs.
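The pilot-run idea can be sketched in Python as follows. This is a simplified illustration, not the patented optimizer: it assumes each selected sub-expression is a filter over a table, that a pilot run scans at most `limit` rows, and that the only optimization decision is which filtered input to use as the smaller (build) side of a join. All names are hypothetical.

```python
def pilot_run(rows, predicate, limit):
    """Partially execute a filter over at most `limit` rows and collect statistics."""
    scanned = passed = 0
    for row in rows:
        if scanned == limit:
            break
        scanned += 1
        if predicate(row):
            passed += 1
    return {"scanned": scanned, "passed": passed}


def estimate_cardinality(total_rows, stats):
    """Scale the selectivity observed in the pilot run to the full input."""
    if stats["scanned"] == 0:
        return 0.0
    return total_rows * stats["passed"] / stats["scanned"]


# Toy inputs, invented for this example.
orders = [{"id": i} for i in range(1000)]
customers = [{"id": i} for i in range(500)]

orders_stats = pilot_run(orders, lambda r: r["id"] % 10 == 0, limit=200)
customers_stats = pilot_run(customers, lambda r: r["id"] % 2 == 0, limit=200)

estimates = {
    "orders": estimate_cardinality(len(orders), orders_stats),
    "customers": estimate_cardinality(len(customers), customers_stats),
}
# Use the input with the smaller estimated cardinality as the join build side.
build_side = min(estimates, key=estimates.get)
```

Because the estimates come from rows actually processed at execution time, they reflect the real selectivity of each predicate on the sampled prefix rather than assumptions made at compile time; a prefix sample can still be unrepresentative if the data is ordered.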
-
Publication No.: US12019995B2
Publication date: 2024-06-25
Application No.: US16920693
Filing date: 2020-07-04
CPC classification: G06F40/40, G06F40/279, G06N20/00, G06Q40/00, G16H50/50
Abstract: A computer-implemented method for generating an ontology-driven conversational interface includes generating an ontology from a description of a domain schema of a Data Analysis (DA) model, in which the DA model is defined in terms of quantifiable, qualifying, or categorical entities and their relationships as described by the domain schema. Conversational artifacts of a conversation space including a conversational pattern framework are generated by extracting DA-related intents, entities, and a dialog from the generated ontology for the conversational interface. A dialog logic table maps DA-related patterns to intents, extracted quantifiable, qualifying, or categorical attributes to entities, and the dialog to user-prompts for one or more parameters in an identified DA pattern. The conversation space is integrated with at least one of an external data source or an analytics platform that stores and processes data.
-
Publication No.: US20220237447A1
Publication date: 2022-07-28
Application No.: US17161152
Filing date: 2021-01-28
Inventors: Chuan Lei, Junheng Hao, Vasilis Efthymiou, Fatma Ozcan, Abdul Quamar
Abstract: An embodiment includes extracting, responsive to an update request from a remote requesting system, technical descriptor data from a data source. The embodiment also includes forming a new graph data structure using the technical descriptor data extracted from the data source. The embodiment also includes augmenting the new graph data structure to include a concept based on a value from instance data from the data source. The embodiment also includes identifying a first pair of concepts that are connected in a pre-existing ontology that correspond with a second pair of concepts that lack a connection therebetween in the new graph data structure. The embodiment also includes augmenting the new graph data structure to include a connection between the second pair of concepts. The embodiment also includes outputting the new graph data structure as part of a response to the update request from the requesting system.
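The connection-augmentation step can be illustrated with a small Python sketch. It makes a strong simplifying assumption that the abstract does not state: a concept in the new graph "corresponds" to an ontology concept when the two share the same name. The function and variable names are hypothetical.

```python
def augment_connections(graph_concepts, graph_edges, ontology_edges):
    """Add edges that exist in the ontology but are missing from the new graph.

    Edges are undirected and stored as frozensets of two concept names.
    Correspondence between concepts is assumed to be name equality.
    """
    augmented = set(graph_edges)
    for edge in ontology_edges:
        both_present = all(concept in graph_concepts for concept in edge)
        if both_present and edge not in augmented:
            augmented.add(edge)
    return augmented


# Toy data, invented for this example.
graph_concepts = {"Drug", "Dosage", "Patient"}
graph_edges = {frozenset({"Drug", "Dosage"})}
ontology_edges = {
    frozenset({"Drug", "Dosage"}),
    frozenset({"Drug", "Patient"}),
    frozenset({"Drug", "Manufacturer"}),  # "Manufacturer" is not in the new graph
}

new_edges = augment_connections(graph_concepts, graph_edges, ontology_edges)
```

Only the `Drug`-`Patient` edge is added: it links two concepts already present in the new graph that the ontology connects, whereas the `Manufacturer` edge is skipped because one endpoint has no counterpart.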
-
Publication No.: US20220237446A1
Publication date: 2022-07-28
Application No.: US17161093
Filing date: 2021-01-28
Inventors: Chuan Lei, Junheng Hao, Vasilis Efthymiou, Fatma Ozcan, Abdul Quamar
Abstract: An embodiment includes generating a first concept representation based on a first portion of a knowledge base using a first processing path of a neural network that includes a hyperbolic graph convolution layer. The embodiment also includes generating a second concept representation based on a second portion of the knowledge base using a second processing path of the neural network, the second processing path comprising a heterogeneous graph convolution layer. The embodiment also includes generating a unified concept representation including concatenating the first concept representation with the second concept representation. The embodiment also includes generating a prediction score using a predictive matching module, where the prediction score is indicative of an extent of a match between the unified concept representation and a concept representation from a second knowledge base.
-
Publication No.: US11321534B2
Publication date: 2022-05-03
Application No.: US16815476
Filing date: 2020-03-11
IPC classification: G06F17/00, G06F40/30, G06N5/04, G06F40/295, G06F16/23, G06F16/242
Abstract: A method is provided to implement a conversational system with artifact generation. A middleware component receives a user input and determines whether there is sufficient information in the user input and a conversation space in a context storage of the conversational system to identify user intent associated with the user input. Responsive to the middleware component determining there is not sufficient information to identify user intent, a communications handler component sends a natural language query to an external data source via a natural language query (NLQ) interface and receives a natural language response from the external data source. The middleware component updates the conversation space based on the natural language response and returns a user response based on the natural language response.
-
Publication No.: US10127237B2
Publication date: 2018-11-13
Application No.: US14974477
Filing date: 2015-12-18
Abstract: The embodiments relate to assigning data to processors of a file system. Metadata is associated with respective blocks of data, and an initial batch of the blocks is assigned to nodes of a file system based on the metadata. Unassigned blocks are selectively assigned to one or more of the nodes. The selective assignment includes constructing a linear regression model based on node data, and determining a value for each node based on the linear regression model. Each value is associated with a predicted load corresponding to a new assignment of one or more unassigned blocks.
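The regression-based assignment can be sketched in plain Python. The abstract does not say which node data feeds the model, so this example assumes each node has a short history of (blocks assigned, observed processing time) pairs and fits one ordinary least-squares line per node; the function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predicted_loads(history, assigned, extra_blocks=1):
    """Predict each node's load if it received `extra_blocks` more blocks."""
    loads = {}
    for node, samples in history.items():
        xs = [blocks for blocks, _ in samples]
        ys = [seconds for _, seconds in samples]
        slope, intercept = fit_line(xs, ys)
        loads[node] = slope * (assigned[node] + extra_blocks) + intercept
    return loads


# Assumed node data: (blocks assigned, seconds taken) observed so far.
history = {
    "n1": [(1, 2.0), (2, 4.0), (3, 6.0)],
    "n2": [(1, 1.0), (2, 2.0), (3, 3.0)],
}
assigned = {"n1": 3, "n2": 3}

loads = predicted_loads(history, assigned)
target = min(loads, key=loads.get)  # node with the lowest predicted load
```

With these numbers the next unassigned block goes to `n2`, whose predicted load for a fourth block is lower than that of `n1`.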
-