Abstract:
The embodiments of the invention provide methods for obtaining improved text similarity measures. More specifically, a method of measuring similarity between at least two electronic documents begins by identifying similar terms between the electronic documents. This includes basing similarity between the similar terms on patterns, wherein the patterns can include word patterns, letter patterns, numeric patterns, and/or alphanumeric patterns. The identifying of the similar terms also includes identifying multiple pattern types between the electronic documents. Moreover, the basing of the similarity on patterns identifies terms within the electronic documents that are within a category of a hierarchy. Specifically, the identifying of the terms reviews a hierarchical data tree, wherein nodes of the tree represent terms within the electronic documents. Lower nodes of the tree hold specific terms, and higher nodes of the tree hold general terms.
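The idea of matching terms through patterns and a term hierarchy can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `parent` hierarchy, the `<NUM>` and `<ALNUM>` pattern labels, and the use of Jaccard overlap as the similarity score are not taken from the abstract, which does not name a scoring formula.

```python
import re

# Hypothetical hierarchy: specific term (lower node) -> general term (higher node).
parent = {"sedan": "car", "coupe": "car", "car": "vehicle", "truck": "vehicle"}

def normalize(token):
    """Replace a token by its pattern label or by its parent category, if any."""
    if re.fullmatch(r"\d+", token):
        return "<NUM>"      # numeric pattern
    if re.fullmatch(r"(?=.*\d)(?=.*[a-z])[a-z\d]+", token):
        return "<ALNUM>"    # alphanumeric pattern
    return parent.get(token, token)  # climb one level in the hierarchy

def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def raw_similarity(doc_a, doc_b):
    """Overlap of the literal terms only."""
    return jaccard(set(doc_a.lower().split()), set(doc_b.lower().split()))

def pattern_similarity(doc_a, doc_b):
    """Overlap after mapping terms to patterns and hierarchy categories."""
    a = {normalize(t) for t in doc_a.lower().split()}
    b = {normalize(t) for t in doc_b.lower().split()}
    return jaccard(a, b)

doc_a = "sedan model x200 costs 30000"
doc_b = "coupe model z9 costs 45000"
raw_score = raw_similarity(doc_a, doc_b)
pattern_score = pattern_similarity(doc_a, doc_b)
```

In this toy case the literal terms overlap only on "model" and "costs", while the pattern-based score treats "sedan" and "coupe" as the same category and the model codes and prices as the same pattern types.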
Abstract:
A method and system for graphically representing a plan for a query in a relational database management system is disclosed. The method includes receiving and processing an input query to form a plurality of plans, selecting at least one plan of the plurality of plans, and transforming the selected plan into a self-describing formatted file which is platform independent. The method further includes generating a graph representing the selected plan from the self-describing formatted file.
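As a rough illustration of the transformation described above, the following sketch serializes a plan tree to XML (one possible self-describing, platform-independent format) and then builds a graph description from that file alone. The plan structure, the `operator` element name, and the choice of Graphviz DOT text as the graph output are assumptions for this example; the abstract does not specify them.

```python
import xml.etree.ElementTree as ET

# Hypothetical selected plan: each operator has a name and child operators.
plan = {"op": "HASH JOIN", "children": [
    {"op": "TABLE SCAN orders", "children": []},
    {"op": "INDEX SCAN customers", "children": []},
]}

def plan_to_xml(node):
    """Serialize a plan node and its subtree as nested XML elements."""
    elem = ET.Element("operator", name=node["op"])
    for child in node["children"]:
        elem.append(plan_to_xml(child))
    return elem

def xml_to_dot(xml_text):
    """Parse the XML text and emit Graphviz DOT text for the plan graph."""
    root = ET.fromstring(xml_text)
    lines = ["digraph plan {"]
    counter = [0]  # mutable counter used to assign node ids in visit order

    def visit(elem):
        node_id = counter[0]
        counter[0] += 1
        lines.append(f'  n{node_id} [label="{elem.get("name")}"];')
        for child in elem:
            child_id = visit(child)
            lines.append(f"  n{node_id} -> n{child_id};")
        return node_id

    visit(root)
    lines.append("}")
    return "\n".join(lines)

xml_text = ET.tostring(plan_to_xml(plan), encoding="unicode")
dot_text = xml_to_dot(xml_text)
print(dot_text)
```

Because the graph is generated only from `xml_text`, the rendering step could run on a different machine from the one that produced the plan.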
Abstract:
A method and system for discovering keys in a database. A minimal set of non-keys of the database is found. The database includes at least two entities and at least two attributes. The minimal set of non-keys includes at least two non-keys. Each entity independently includes a value of each attribute. A set of keys of the database is generated from the minimal set of non-keys. Each key of the generated set of keys independently is a unitary key consisting of one attribute or a composite key consisting of at least two attributes.
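The relationship between non-keys and keys can be sketched with a brute-force example. This is not the method claimed in the abstract: it enumerates every attribute subset, which is only feasible for tiny tables, and it interprets the "minimal set of non-keys" as the non-redundant set of maximal non-keys. The table contents and all names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each row is an entity, each column an attribute.
attributes = ("first", "last", "phone")
entities = [
    ("Ann", "Lee", "555-1"),
    ("Ann", "Kim", "555-2"),
    ("Bob", "Lee", "555-3"),
]

def is_non_key(attr_indexes):
    """A set of attributes is a non-key if two entities agree on all of them."""
    projected = [tuple(row[i] for i in attr_indexes) for row in entities]
    return len(set(projected)) < len(projected)

all_subsets = [frozenset(c)
               for r in range(1, len(attributes) + 1)
               for c in combinations(range(len(attributes)), r)]

non_keys = [s for s in all_subsets if is_non_key(sorted(s))]
# Drop non-keys contained in a larger non-key; the remaining ones cover the rest.
maximal_non_keys = [s for s in non_keys if not any(s < t for t in non_keys)]

# A set is a key exactly when it is not contained in any non-key.
candidates = [s for s in all_subsets
              if not any(s <= nk for nk in maximal_non_keys)]
# Keep only minimal keys: those with no smaller key inside them.
keys = [s for s in candidates if not any(t < s for t in candidates)]

key_names = sorted(sorted(attributes[i] for i in k) for k in keys)
print(key_names)
```

For this table the non-keys are {first} and {last}, and the derived keys are the unitary key {phone} and the composite key {first, last}.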
Abstract:
The present invention provides a method, system and program product for integrating a service external to a database into a database such that the service may be easily invoked from the database. Preferably, the service is a web service available over the internet. The service may be invoked from any of a number of invoking mechanisms of the database. In a first specific embodiment, the mechanism comprises a user-defined function within an SQL statement. In a second specific embodiment, the mechanism comprises a virtual table. In a third specific embodiment, the mechanism comprises a stored procedure. In a fourth specific embodiment, the mechanism comprises a trigger. In a fifth specific embodiment, the mechanism comprises a federated table accessed via a nickname and implemented using a wrapper.
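The first embodiment, a user-defined function inside an SQL statement, can be illustrated with a small sketch. SQLite (through Python's standard `sqlite3` module) is used here purely as a convenient stand-in for the database, and `fetch_quote` is a hypothetical stub that returns canned values where a real integration would call the web service.

```python
import sqlite3

def fetch_quote(symbol):
    """Stand-in for the external service; a real version would make a network call."""
    fake_service = {"IBM": 120.5, "ACME": 9.75}  # hypothetical canned responses
    return fake_service.get(symbol)

conn = sqlite3.connect(":memory:")
# Register the Python function so SQL statements can call it as get_quote(symbol).
conn.create_function("get_quote", 1, fetch_quote)

conn.execute("CREATE TABLE holdings (symbol TEXT, shares INTEGER)")
conn.executemany("INSERT INTO holdings VALUES (?, ?)",
                 [("IBM", 10), ("ACME", 4)])

# The service is invoked once per row, from inside the SQL statement itself.
rows = conn.execute(
    "SELECT symbol, shares * get_quote(symbol) FROM holdings ORDER BY symbol"
).fetchall()
print(rows)
```

The query author sees only an ordinary function call; where the value comes from is hidden behind the registered function.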
Abstract:
A local database cache enabling persistent, adaptive caching of either full or partial content of a remote database is provided. Content of tables comprising a local cache database is defined on a per-table basis. A table is either defined declaratively and populated in advance of query execution, or determined dynamically and populated asynchronously on demand during query execution. Based on a user input query originally issued against a remote DBMS and referential cache constraints between tables in a local database cache, a Janus query plan, comprising local, remote, and probe query portions, is determined. A probe query portion of a Janus query plan is executed to determine whether up-to-date results can be delivered by the execution of a local query portion against a local database cache, or whether it is necessary to retrieve results from a remote database by executing a remote query portion of the Janus query plan.
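The probe-then-branch structure of such a plan can be sketched as follows. This is a simplified illustration under several assumptions: two in-memory SQLite databases stand in for the local cache and the remote DBMS, the probe only checks whether the requested row is present (freshness and referential cache constraints are not modeled), and the cache is filled synchronously rather than asynchronously. The function name `run_janus` is hypothetical.

```python
import sqlite3

local = sqlite3.connect(":memory:")   # stand-in for the local database cache
remote = sqlite3.connect(":memory:")  # stand-in for the remote DBMS
for db in (local, remote):
    db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
remote.executemany("INSERT INTO customer VALUES (?, ?)",
                   [(1, "Ann"), (2, "Bob")])
# The local cache holds only part of the remote table.
local.execute("INSERT INTO customer VALUES (1, 'Ann')")

def run_janus(customer_id):
    """Run the probe portion, then either the local or the remote portion."""
    probe = "SELECT 1 FROM customer WHERE id = ?"
    query = "SELECT name FROM customer WHERE id = ?"
    if local.execute(probe, (customer_id,)).fetchone():
        # Probe succeeded: answer from the local cache.
        return "local", local.execute(query, (customer_id,)).fetchall()
    # Probe failed: fall back to the remote database.
    rows = remote.execute(query, (customer_id,)).fetchall()
    # Populate the cache on demand (synchronously here, for simplicity).
    local.executemany("INSERT INTO customer VALUES (?, ?)",
                      [(customer_id, name) for (name,) in rows])
    return "remote", rows

first = run_janus(1)
second = run_janus(2)
third = run_janus(2)
```

The second lookup for customer 2 is served locally because the first one populated the cache.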
Abstract:
A system and method for automatically discovering topical structures of databases includes a model builder adapted to compute various kinds of representations for the database based on schema information and data values of the database. A plurality of base clusterers is also provided, one for each representation. Each base clusterer is adapted to perform, for the representation, preliminary topical clustering of tables within the database to produce a plurality of clusters, such that each of the clusters corresponds to a set of tables on the same topic. A meta-clusterer aggregates results of the clusterers into a final clustering, such that the final clustering comprises a plurality of the clusters. A representative finder identifies representative tables from the clusters in the final clustering. The representative finder identifies at least one representative table for each of the clusters in the final clustering. The representative finder also arranges the representative tables by topic as a topical directory and outputs the topical directory.
Abstract:
According to one embodiment of the present invention, a method for processing a query is provided. The method includes generating a set of pre-computed materialized sub-graphs from a dataset and receiving a search query having one or more search query terms. A particular one of the pre-computed materialized sub-graphs is accessed and a dynamic authority-based keyword search is executed on the particular one of the pre-computed materialized sub-graphs. Nodes in the dataset are then retrieved based on the executing, and a response to the search query is provided which includes the retrieved nodes.
Abstract:
A computer-implemented method for accessing content items in a content store is described. In one embodiment, the computer-implemented method includes maintaining a text index of content items in a content store to enable a keyword search on the content items, receiving a query having a keyword and generating a hit list from the text index using the keyword, and extracting frequent phrases from text within content items of the hit list. The computer-implemented method also includes assigning a relative relevance to the frequent phrases and grouping content items into topics based on presence of relevant phrases within the content items of the hit list. The hit list includes one or more content items of the content store. The frequent phrases having a relatively high relevance are relevant phrases.
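The pipeline described above (text index, hit list, frequent phrases, topic grouping) can be sketched in a few lines. The choices made here are assumptions for illustration only: phrases are two-word sequences, and a phrase's relevance is simply the number of hit-list items containing it, with a hypothetical `min_docs` threshold deciding which phrases count as relevant. The abstract does not specify how relevance is computed.

```python
from collections import Counter, defaultdict

# Hypothetical content store: item id -> text.
content_store = {
    "doc1": "query plan cache improves query plan reuse",
    "doc2": "the query plan graph shows join order",
    "doc3": "remote cache refresh and cache constraints for query results",
    "doc4": "unrelated text about gardening",
}

# Text index: word -> ids of the content items containing it.
text_index = defaultdict(set)
for doc_id, text in content_store.items():
    for word in text.split():
        text_index[word].add(doc_id)

def bigrams(text):
    """Return the set of two-word phrases that occur in the text."""
    words = text.split()
    return {" ".join(pair) for pair in zip(words, words[1:])}

def group_by_topic(keyword, min_docs=2):
    """Group the hit list for a keyword under phrases shared by several items."""
    hit_list = sorted(text_index.get(keyword, ()))
    # Relevance of a phrase = number of hit-list items that contain it.
    doc_freq = Counter(p for d in hit_list for p in bigrams(content_store[d]))
    relevant = [p for p, n in doc_freq.items() if n >= min_docs]
    return {p: [d for d in hit_list if p in bigrams(content_store[d])]
            for p in relevant}

topics = group_by_topic("query")
print(topics)
```

Here the keyword "query" hits three items, and the only phrase shared by at least two of them, "query plan", becomes a topic containing doc1 and doc2.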
Abstract:
A system, method and computer program product for executing a query on linked data sources. Embodiments of the invention generate an instance graph expressing relationships between objects in the linked data sources and receive a query including at least first and second search terms. The first search term is then executed on the instance graph and a summary graph is generated using the results of the executing step. A second search term is then executed on the summary graph.