Abstract:
A system is provided that employs an explicitly and/or implicitly trained model to return entity-specific computer-based search results. The innovation can provide a customized search model that focuses the search on obtaining information that is meaningful with respect to the goals of an entity. The model can be used to modify a search query in accordance with a goal of the entity, or to generate the search query, thereby returning meaningful and/or targeted results to the user. The system can automatically gather entity-related data and then determine or infer a goal as well as train the model. Moreover, the system can selectively configure (e.g., order, rank, filter) and render results to a user based upon the model.
Abstract:
The present invention leverages relevance data to provide enhanced search query results based on relevancy to a specific entity via an entity-specific tunable search. This allows an entity to retrieve information that is of more value to that entity, in a faster and more efficient manner. The entity itself can be an individual user, a grouping of users, and/or an enterprise and the like. In one instance of the present invention, entity-specific relevance information is determined from the similarity of the entity to another entity or group of entities. Interest levels and/or satisfaction levels of similar entities can also be utilized along with similarity information to help derive the relevance information.
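The idea of deriving relevance from similar entities can be illustrated with a minimal sketch. It assumes that similarity scores and satisfaction levels are already available as plain dictionaries; the function name `entity_relevance`, the data layout, and the weighted-average formula are illustrative assumptions, not details taken from the abstract.

```python
def entity_relevance(similarity, satisfaction, doc):
    """Estimate how relevant `doc` is to an entity.

    similarity:   {other_entity: similarity to the target entity, in [0, 1]}
    satisfaction: {other_entity: {doc_id: satisfaction level, in [0, 1]}}
    Both structures are hypothetical inputs chosen for this sketch.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, sim in similarity.items():
        level = satisfaction.get(other, {}).get(doc)
        if level is None:
            continue  # this entity has expressed no opinion on the document
        weighted_sum += sim * level
        weight_total += sim
    # Similarity-weighted average; 0.0 when no similar entity rated the document.
    return weighted_sum / weight_total if weight_total else 0.0


similarity = {"bob": 0.9, "carol": 0.1}
satisfaction = {
    "bob": {"d1": 1.0, "d2": 0.0},
    "carol": {"d1": 0.0, "d2": 1.0},
}
print(entity_relevance(similarity, satisfaction, "d1"))
```

Under these assumptions, a document liked by a highly similar entity scores higher than one liked only by a dissimilar entity.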
Abstract:
A system for associating information comprises an association module that uses anchoring information to associate a first piece of information with a second piece of information, wherein the second piece of information is not part of the first piece of information. The system further includes a rendering module that presents the second piece of information for use. Methods for using such a system are also described.
Abstract:
The present invention provides systems and methods for obtaining information from a networked system utilizing a distributed web crawler. The distributed nature of clients of a server is leveraged to provide fast and accurate web crawling data. Information gathered by a server's web crawler is compared to data retrieved by clients of the server to update the crawler's data. In one instance of the present invention, data comparison is achieved by utilizing information disseminated via a search engine results page. In another instance of the present invention, data validation is accomplished by client dictionaries, emanating from a server, that summarize web crawler data. The present invention also facilitates data analysis by providing a means to resist spoofing of a web crawler to increase data accuracy.
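One way to picture the client-dictionary validation step is sketched below. It assumes the dictionary summarizes each crawled page as a content hash keyed by URL; the abstract does not specify the summary format, so the SHA-256 digest and the function names `build_dictionary` and `stale_urls` are hypothetical choices.

```python
import hashlib


def digest(content):
    """Summarize page content as a hex SHA-256 digest (an assumed summary format)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def build_dictionary(crawl):
    """Server side: summarize crawler data as {url: digest} for distribution to clients."""
    return {url: digest(body) for url, body in crawl.items()}


def stale_urls(dictionary, client_pages):
    """Client side: list URLs whose content, as seen by the client,
    differs from what the server's crawler recorded."""
    return sorted(
        url
        for url, body in client_pages.items()
        if url in dictionary and dictionary[url] != digest(body)
    )


crawl = {"http://example.com/a": "old text", "http://example.com/b": "same text"}
seen = {"http://example.com/a": "new text", "http://example.com/b": "same text"}
print(stale_urls(build_dictionary(crawl), seen))
```

The client reports only the mismatching URLs, which the server could then use to decide which pages to crawl again.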
Abstract:
The relevancy of search results is improved by exploiting changes in data related to information access. Parameter-varying aspects of data associated with document access are leveraged to provide enhanced ranking of documents. As an aspect of a parameter varies, a new ranking can be computed, producing multiple rankings for a given set of parameter-varying data. Parameters such as time, user preferences, popularity, and/or user demographics and the like can be utilized as parameter-varying data. Thus, in general, single or multiple varying aspects of the parameters can be employed to produce a set of ranks comprising one or more rankings of documents. This technique can be employed with static rankers, dynamic rankers, and/or ranker training data and the like to produce higher-relevancy search results, increasing user satisfaction.
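As a minimal sketch of producing multiple rankings from parameter-varying data, the example below takes time as the varying parameter and ranks documents by access count within each time window. The log format, the fixed window size, and the name `ranks_by_window` are assumptions made for illustration only.

```python
from collections import Counter


def ranks_by_window(access_log, window):
    """Produce one ranking of documents per time window.

    access_log: iterable of (timestamp, doc_id) pairs (hypothetical format)
    window:     window length, in the same units as the timestamps
    Returns {window_index: [doc_id, ...]} ordered by descending access count,
    with ties broken by doc_id.
    """
    buckets = {}
    for timestamp, doc in access_log:
        buckets.setdefault(timestamp // window, Counter())[doc] += 1
    return {
        index: [doc for doc, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]
        for index, counts in sorted(buckets.items())
    }


log = [(1, "a"), (2, "a"), (3, "b"), (11, "b"), (12, "b"), (13, "a")]
print(ranks_by_window(log, 10))
```

Here the same set of documents receives a different ranking in each window, which is the kind of rank set a downstream ranker could consume.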
Abstract:
A method of training a natural language processing unit applies a candidate learning set to at least one component of the natural language unit. The natural language unit is then used to generate a meaning set from a first corpus. A second meaning set is generated from a second corpus using a second natural language unit and the two meaning sets are compared to each other to form a score for the candidate learning set. This score is used to determine whether to modify the natural language unit based on the candidate learning set.
Abstract:
A system that facilitates determining a user's intent given a user search query comprises a search engine that is employed to search over a collection of objects within a data store to retrieve a user search result set. The objects within the result set are associated with queries that were previously utilized to locate such objects. A level of relatedness between the previous queries and the user search query is determined, and previous queries that are associated with a result set that is novel and related to the user search result set are returned to the user.
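The selection of previous queries that are both related and novel can be sketched as follows. The abstract does not say how relatedness or novelty is measured, so this example assumes Jaccard overlap between result sets for relatedness and the fraction of unseen results for novelty; the thresholds and the name `suggest_queries` are likewise hypothetical.

```python
def suggest_queries(user_results, query_log, min_related=0.2, min_novel=0.3):
    """Return previous queries whose result sets are related to, yet novel
    with respect to, the user's result set.

    user_results: ids of objects returned for the user's query
    query_log:    {previous_query: set of object ids it located} (assumed layout)
    """
    user_results = set(user_results)
    suggestions = []
    for query, results in query_log.items():
        if not results:
            continue
        # Relatedness: Jaccard overlap of the two result sets (an assumption).
        related = len(user_results & results) / len(user_results | results)
        # Novelty: share of the previous query's results the user has not seen.
        novel = len(results - user_results) / len(results)
        if related >= min_related and novel >= min_novel:
            suggestions.append((related, query))
    # Most related suggestions first; ties broken alphabetically.
    suggestions.sort(key=lambda item: (-item[0], item[1]))
    return [query for _, query in suggestions]


query_log = {
    "jaguar car": {1, 2, 5, 6},
    "jaguar": {1, 2, 3, 4},
    "weather": {9, 10},
}
print(suggest_queries({1, 2, 3, 4}, query_log))
```

A query that returns exactly what the user already has is dropped as not novel, and a query with no shared results is dropped as unrelated.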
Abstract:
Architecture for improving text searches using information redundancy. A search component is coupled with an analysis component to rerank documents returned in a search according to redundancy values. Each returned document is used to develop a corresponding word probability distribution that is further used to rerank the returned documents according to the associated redundancy values. In another aspect thereof, the query component is coupled with a projection component to project answer redundancy from one document search to another. This includes obtaining the benefit of considerable answer redundancy from a second data source by projecting the success of the search of the second data source against a first data source.
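A rough sketch of reranking by redundancy is given below. It builds a unigram word probability distribution per returned document and, as an assumption, defines a document's redundancy value as its mean cosine similarity to the other returned documents; the abstract does not define the redundancy value, so this measure and the function names are illustrative only.

```python
import math
from collections import Counter


def word_dist(text):
    """Unigram word probability distribution for one document."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def cosine(p, q):
    """Cosine similarity between two sparse distributions."""
    dot = sum(value * q.get(word, 0.0) for word, value in p.items())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def rerank_by_redundancy(docs):
    """Order documents so those most redundant with the rest come first."""
    dists = [word_dist(doc) for doc in docs]
    scored = []
    for i, doc in enumerate(docs):
        sims = [cosine(dists[i], dists[j]) for j in range(len(docs)) if j != i]
        redundancy = sum(sims) / len(sims) if sims else 0.0
        scored.append((redundancy, i, doc))
    # Highest redundancy first; the original position breaks ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [doc for _, _, doc in scored]


docs = [
    "bananas are yellow",
    "the capital of france is paris",
    "paris is the capital of france",
]
print(rerank_by_redundancy(docs))
```

Documents whose wording is echoed by other results rise to the top, while the outlier falls to the bottom.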
Abstract:
The present invention relates to a system and methodology to facilitate automated error correction of user input data via an analysis of the input data in accordance with an automatically generated and filtered database of processed structural groupings or formulations selected and filtered from past user activities. The filtered database provides a relevant foundation of potential phrases, topics, symbols, speech and/or colloquial structures of interest to users, which are automatically determined from previous user activity and employed to facilitate automated error checking in accordance with the user's current input, command and/or request for information.
Abstract:
A spell checker based on the noisy channel model has a source model and an error model. The source model determines how likely a word w in a dictionary is to have been generated. The error model determines how likely the word w was to have been incorrectly entered as the string s (e.g., mistyped or incorrectly interpreted by a speech recognition system) according to the probabilities of string-to-string edits. The string-to-string edits allow conversion of one arbitrary length character sequence to another arbitrary length character sequence.
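The combination of source model and error model can be sketched in a few lines. The word probabilities in `SOURCE`, the edit probabilities in `EDITS`, and the constants `MATCH` and `MAX_LEN` are made-up values for illustration, and scoring a pair by its single best partition into edits is a simplifying assumption rather than a method stated in the abstract.

```python
from functools import lru_cache

# Hypothetical source model: P(w) for each dictionary word.
SOURCE = {"physical": 0.6, "fiscal": 0.4}

# Hypothetical string-to-string edit probabilities P(alpha -> beta).
# Keys are (intended substring, entered substring); either side may span
# more than one character.
EDITS = {("ph", "f"): 0.1, ("y", "i"): 0.05, ("c", "k"): 0.02}
MATCH = 0.9   # assumed probability of entering a character correctly
MAX_LEN = 2   # longest substring considered on either side of an edit


def error_prob(w, s):
    """Estimate P(s | w) as the probability of the best partition of w and s
    into string-to-string edits, found by dynamic programming."""

    @lru_cache(maxsize=None)
    def best(i, j):
        if i == len(w) and j == len(s):
            return 1.0
        top = 0.0
        for a in range(MAX_LEN + 1):
            for b in range(MAX_LEN + 1):
                if a == 0 and b == 0:
                    continue
                if i + a > len(w) or j + b > len(s):
                    continue
                alpha, beta = w[i:i + a], s[j:j + b]
                if a == 1 and alpha == beta:
                    p = MATCH  # single character copied unchanged
                else:
                    p = EDITS.get((alpha, beta), 0.0)
                if p > 0.0:
                    top = max(top, p * best(i + a, j + b))
        return top

    return best(0, 0)


def correct(s):
    """Pick the dictionary word w that maximizes P(w) * P(s | w)."""
    return max(SOURCE, key=lambda w: SOURCE[w] * error_prob(w, s))


print(correct("fisical"), correct("fiskal"))
```

Because `EDITS` maps the two-character sequence "ph" to the single character "f", the sketch can relate "physical" to "fisical" in a way that single-character edits alone would not capture. If no dictionary word has a nonzero score, `correct` simply returns the first entry of `SOURCE`; a fuller treatment would handle that case explicitly.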