Abstract:
Described is a search technology in which a search engine constructs a results page for a query that integrates suggested queries with the individual query results (e.g., displayed URLs). When rendered, the proximity of the suggested queries to their corresponding individual query results provides context as to the specific URL to which each suggested query is related. Suggested queries may appear alongside their associated search result, e.g., a displayed URL, and/or in an expandable panel proximate to that individual search result. Suggested queries may appear within text accompanying a URL, and/or in a drop-down menu following interaction with such text or the like. Related queries may be found by using a search result URL to find a query, by analyzing a search result's text snippet, by accessing historical data, and/or by accessing current user session data.
Abstract:
A method of inputting text is provided in which a first portion of an input string is received from a user, the first portion of the input string including at least one keystroke representing a wildcard character of the input string. A second portion of the input string is then received, with the second portion including one or more keystrokes all representing non-wildcard characters of the input string.
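As a rough illustration of how an input string with a leading wildcard portion might be used, the sketch below expands such a string against a word list. The `*` wildcard symbol, the `expand_wildcard_input` helper, and the tiny lexicon are all assumptions made for this example; the abstract does not specify them.

```python
import re

def expand_wildcard_input(input_string, lexicon, wildcard="*"):
    """Return the lexicon words that match an input string containing wildcards.

    Assumption: each wildcard keystroke stands for any run of characters.
    """
    # Escape the literal pieces and join them with ".*" where a wildcard was typed.
    parts = [re.escape(piece) for piece in input_string.split(wildcard)]
    pattern = re.compile(".*".join(parts))
    return [word for word in lexicon if pattern.fullmatch(word)]

# Hypothetical lexicon and keystrokes.
lexicon = ["information", "inflation", "installation", "nation"]
first_portion = "in*"      # includes at least one wildcard keystroke
second_portion = "ation"   # non-wildcard keystrokes only
candidates = expand_wildcard_input(first_portion + second_portion, lexicon)
print(candidates)
```

Matching the whole string with `fullmatch` keeps the non-wildcard second portion anchored at the end of each candidate word.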
Abstract:
An information retrieval method wherein users may submit a query via a graphical bitmapping technique. The user provides an information retrieval system with a bitmap of a printed, written, or graphical query by either scanning the query with a graphical scanner or employing a standard facsimile transmission machine. The information retrieval system then performs an optical image/character recognition process upon the received bitmap to determine the content of the query. Information is then retrieved based upon the recognized characters and images. In a particular method of the invention, the user is provided with a bitmap of the retrieved information.
Abstract:
An improved data compression method and apparatus are provided, particularly with regard to the compression of data in tabular form such as database records. The present invention achieves improved compression ratios by utilizing metadata to transform the data in a manner that optimizes known compression techniques. In one embodiment of the invention, a schema is generated which is utilized to reorder and partition the data into low entropy and high entropy portions which are separately compressed by conventional compression methods. The high entropy portion is further reordered and partitioned to take advantage of row and column dependencies in the data. The present invention enables not only much greater compression ratios but also greater speed than is achieved by compressing the untransformed data.
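The partition-then-compress idea can be sketched as follows. This is a minimal illustration, not the patented method: the caller supplies the "schema" as a list of column indices assumed to be low entropy, columns are serialized as JSON, and `zlib` stands in for the conventional compressor. The further reordering of the high entropy portion is not shown, and the helper names are hypothetical.

```python
import json
import zlib

def compress_table(rows, low_entropy_cols):
    """Split columns into low/high entropy groups and compress each group separately."""
    ncols = len(rows[0])
    high_entropy_cols = [c for c in range(ncols) if c not in low_entropy_cols]

    def pack(cols):
        # Store data column by column so similar values sit next to each other.
        columns = [[row[c] for row in rows] for c in cols]
        return zlib.compress(json.dumps(columns).encode("utf-8"))

    return {
        "low_cols": list(low_entropy_cols),
        "high_cols": high_entropy_cols,
        "low": pack(low_entropy_cols),
        "high": pack(high_entropy_cols),
    }

def decompress_table(blob):
    """Invert compress_table and rebuild the rows in their original column order."""
    columns = json.loads(zlib.decompress(blob["low"])) + json.loads(zlib.decompress(blob["high"]))
    col_ids = blob["low_cols"] + blob["high_cols"]
    nrows = len(columns[0]) if columns else 0
    rows = []
    for r in range(nrows):
        row = [None] * len(col_ids)
        for c, column in zip(col_ids, columns):
            row[c] = column[r]
        rows.append(row)
    return rows

# Hypothetical records: two repetitive columns and one varying column.
rows = [["US", "active", (i * 7919) % 1000] for i in range(200)]
blob = compress_table(rows, low_entropy_cols=[0, 1])
restored = decompress_table(blob)
```

Whether the split actually beats compressing the raw rows depends on the data, so the sketch only demonstrates the transformation and its lossless round trip.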
Abstract:
Systems and methods that enhance estimates of features (e.g., word associations) by employing a sampling component (e.g., sketches) that facilitates computation of sample contingency tables and designates occurrences (or absences) of features in data (e.g., words in document lists). The sampling component can further include a contingency table generator and an estimation component that employs a likelihood argument (e.g., partial likelihood, maximum likelihood, and the like) to estimate feature (e.g., word-pair) associations in the contingency tables.
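One way a sample contingency table could be built from sketches is shown below. The construction is an assumption made for illustration, not something the abstract specifies: document IDs are taken to be randomly permuted integers starting at 1, a sketch is the `k` smallest IDs in a word's document list, and the table is counted over the ID range that both sketches fully cover. The likelihood-based estimation step is not shown.

```python
def sample_contingency_table(postings_1, postings_2, k):
    """Build a 2x2 sample contingency table for two words from size-k sketches.

    Returns (a, b, c, d): documents in the sample with both words, word 1 only,
    word 2 only, and neither word.
    """
    sketch_1 = sorted(postings_1)[:k]
    sketch_2 = sorted(postings_2)[:k]
    # Both sketches are complete for document IDs up to this bound.
    d_s = min(sketch_1[-1], sketch_2[-1])
    s1 = {doc for doc in sketch_1 if doc <= d_s}
    s2 = {doc for doc in sketch_2 if doc <= d_s}
    a = len(s1 & s2)
    b = len(s1) - a
    c = len(s2) - a
    d = d_s - a - b - c
    return a, b, c, d

# Hypothetical document lists for two words.
table = sample_contingency_table([1, 3, 4, 7, 9, 12], [2, 3, 7, 8, 15], k=4)
print(table)
```

Here the covered range is documents 1 through 7, of which two contain both words, two contain only the first, one contains only the second, and two contain neither.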
Abstract:
Query logs are accessed to obtain queries, user information that specifies the user from whom each query was received, and the result selected by that user. This query log information is used to identify classes of users that looked for a similar result given a similar query. Those classes can then be used by a search engine to rank or provide search results to a user in response to a query input by the user.
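A minimal sketch of the grouping step might look like this. It assumes each log record is a `(query, user, selected_result)` tuple and treats "similar" as an exact match after trimming and lowercasing the query, which is a simplification chosen for the example; the `user_classes` helper and the sample records are hypothetical.

```python
from collections import defaultdict

def user_classes(query_log):
    """Group users who selected the same result for the same normalized query."""
    classes = defaultdict(set)
    for query, user, selected_result in query_log:
        key = (query.strip().lower(), selected_result)
        classes[key].add(user)
    # Keep only groups with at least two users, since one user is not a class.
    return {key: users for key, users in classes.items() if len(users) > 1}

# Hypothetical log records.
query_log = [
    ("Jaguar", "u1", "cars.example/jaguar"),
    ("jaguar ", "u2", "cars.example/jaguar"),
    ("jaguar", "u3", "zoo.example/jaguar"),
]
classes = user_classes(query_log)
print(classes)
```

A ranking component could then favor the result that a user's class selected for the same query, though how the classes feed into ranking is left open here.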
Abstract:
Interactive methods and apparatus for studying similarities of values in very large data sets. The methods and apparatus employ a dot plot in an interactive graphical user interface to make the relationship between the similarities and the data set visible. A variety of filtering, weighting, and compression techniques make it possible to employ the dot plot with sequences of more than 10,000 tokens and to interactively magnify the dot plot, change weighting and display quantization, and view the underlying data. Also disclosed is a technique which is employed in the apparatus for identifying long sequences of similar tokens. The apparatus is used in the study of large bodies of text and code.
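The basic dot plot underlying such a tool can be sketched in a few lines. This is an unweighted, unfiltered version for a short token sequence; the filtering, weighting, and compression techniques that make very long sequences practical are not shown, and the function names are chosen for this example.

```python
def dot_plot(tokens):
    """Return an n x n matrix with 1 where tokens[i] == tokens[j], else 0."""
    n = len(tokens)
    return [[1 if tokens[i] == tokens[j] else 0 for j in range(n)] for i in range(n)]

def render(matrix):
    """Draw the matrix as text, using '#' for a match and '.' for a mismatch."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in matrix)

tokens = "to be or not to be".split()
matrix = dot_plot(tokens)
print(render(matrix))
```

Matches that line up parallel to the main diagonal, such as the cells at (0, 4) and (1, 5) here, indicate a repeated run of tokens ("to be").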
Abstract:
A glossary construction tool for generating and maintaining a translation glossary consisting of a number of terms and their translations. The glossary construction tool includes a terminology list development tool for generating a terminology list in the source language and a glossary development tool for automatically obtaining candidate translations for the terms in the terminology list. The terminology list development tool constructs the terminology list in the source language by analyzing the source text document to be translated and automatically extracting a list of candidate terms, comprising multiword noun phrases and single words that do not appear on a standard or predefined stop list of "noise" words. The glossary development tool obtains candidate translations for terms in the final terminology list by searching the source text document of a word-aligned text pair for a term to be translated and then providing candidate translations based on the indicated alignment with the target text document of the aligned text pair. A concordance tool provides monolingual and bilingual concordances to facilitate the user's evaluation of the automatically generated lists of candidate terms and candidate translations, respectively.
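The candidate term extraction step can be approximated with a stop-list heuristic, sketched below. Note the substitution: instead of identifying noun phrases, which would need a part-of-speech tagger, this example treats each maximal run of non-stop words as one candidate term. The stop list, the `candidate_terms` helper, and the sample text are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop list of "noise" words.
STOP_WORDS = {"the", "a", "an", "of", "to", "is", "in", "and", "for", "on"}

def candidate_terms(text, stop_words=STOP_WORDS):
    """Count candidate terms: maximal runs of consecutive non-stop words."""
    counts = Counter()
    # Punctuation ends a phrase, so split the text into fragments first.
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        run = []
        # The trailing None flushes the last run of each fragment.
        for word in fragment.split() + [None]:
            if word is not None and word not in stop_words:
                run.append(word)
                continue
            if run:
                counts[" ".join(run)] += 1
            run = []
    return counts

text = "The print server is offline. Restart the print server and check the cable."
terms = candidate_terms(text)
print(terms.most_common(3))
```

A user would still need to review the resulting list, since a run of non-stop words is not necessarily a noun phrase.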