Abstract:
Techniques for category-based content recommendation are described. Some embodiments provide a content recommendation system (“CRS”) configured to recommend content items (e.g., Web pages, images, videos) that are related to specified categories. In one embodiment, the CRS processes content items to determine entities referenced by the content items, and to determine categories related to the referenced entities. The determined entities and/or categories may be part of a taxonomy that is stored by the CRS. Then, in response to a received request that indicates a category, the CRS determines and provides indications of one or more content items that each have a corresponding category that matches the indicated category. In some embodiments, at least some of these techniques are employed to implement a category-based news service.
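The flow described above (content item to entities, entities to categories, category to matching items) can be sketched as follows. This is a minimal illustration, not the CRS implementation: the `TAXONOMY` contents, the substring-based `find_entities` stand-in for entity recognition, and all function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical taxonomy mapping entities to categories (illustrative data only).
TAXONOMY = {
    "Seattle Seahawks": {"Sports", "Football"},
    "Boeing": {"Business", "Aerospace"},
    "Microsoft": {"Business", "Technology"},
}

def find_entities(text):
    """Naive stand-in for entity recognition: substring match on known names."""
    lowered = text.lower()
    return {name for name in TAXONOMY if name.lower() in lowered}

def build_category_index(items):
    """Map each category to the ids of content items that reference a related entity."""
    index = defaultdict(set)
    for item_id, text in items.items():
        for entity in find_entities(text):
            for category in TAXONOMY[entity]:
                index[category].add(item_id)
    return index

def recommend(index, category):
    """Return ids of content items whose categories match the requested one."""
    return sorted(index.get(category, set()))

items = {
    "a1": "Boeing announces a new jet order.",
    "a2": "The Seattle Seahawks win at home.",
    "a3": "Microsoft and Boeing sign a cloud deal.",
}
index = build_category_index(items)
print(recommend(index, "Business"))
```

Building the category index ahead of time means a request for a category is a single lookup rather than a scan over all content items.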
Abstract:
Methods and systems for extending keyword searching techniques to syntactically and semantically annotated data are provided. Example embodiments provide a Syntactic Query Engine (“SQE”) that parses, indexes, and stores a data set as an enhanced document index with document terms as well as information pertaining to the grammatical roles of the terms and ontological and other semantic information. In one embodiment, the enhanced document index is a form of term-clause index that indexes terms and syntactic and semantic annotations at the clause level. The enhanced document index permits the use of a traditional keyword search engine to process relationship queries as well as standard document-level keyword searches. In one embodiment, the SQE comprises a Query Processor, a Data Set Preprocessor, a Keyword Search Engine, a Data Set Indexer, an Enhanced Natural Language Parser (“ENLP”), a data set repository, and, in some embodiments, a user interface or an application programming interface.
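One way to picture a term-clause index is as an ordinary inverted index whose postings point at clauses and whose vocabulary includes role-annotated tokens alongside plain terms. The sketch below assumes the clauses have already been parsed; the `role:term` token format, the sample data, and the function names are hypothetical and not taken from the SQE.

```python
from collections import defaultdict

# Hypothetical pre-parsed clauses: (doc_id, clause_id, {grammatical role: term}).
CLAUSES = [
    ("d1", 0, {"subject": "company", "verb": "acquire", "object": "startup"}),
    ("d1", 1, {"subject": "startup", "verb": "build", "object": "software"}),
    ("d2", 0, {"subject": "startup", "verb": "acquire", "object": "company"}),
]

def build_term_clause_index(clauses):
    """Index plain terms and role-annotated terms, with one posting per clause."""
    index = defaultdict(set)
    for doc_id, clause_id, roles in clauses:
        posting = (doc_id, clause_id)
        for role, term in roles.items():
            index[term].add(posting)              # plain keyword token
            index[f"{role}:{term}"].add(posting)  # annotated keyword token
    return index

def search(index, *tokens):
    """AND query: return the clauses that contain every token."""
    postings = [index.get(token, set()) for token in tokens]
    return sorted(set.intersection(*postings)) if postings else []

index = build_term_clause_index(CLAUSES)

# Relationship query: a company that acquires something.
print(search(index, "subject:company", "verb:acquire"))
# Plain keyword query over the same index, collapsed to document level.
print(sorted({doc for doc, _ in search(index, "company", "acquire")}))
```

Because the annotations are just additional tokens, the same keyword lookup serves both the relationship query and the standard keyword search.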
Abstract:
Methods and systems for syntactically indexing and searching data sets to achieve more accurate search results are provided. Example embodiments provide a Syntactic Query Engine (“SQE”) that parses, indexes, and stores a data set, as well as processes natural language queries subsequently submitted against the data set. The SQE comprises a Query Preprocessor, a Data Set Preprocessor, a Query Builder, a Data Set Indexer, an Enhanced Natural Language Parser (“ENLP”), a data set repository, and, in some embodiments, a user interface. After preprocessing the data set, the SQE parses the data set and determines the syntactic and grammatical roles of each term to generate enhanced data representations for each object in the data set. The SQE indexes and stores these enhanced data representations in the data set repository. Upon subsequently receiving a query, the SQE parses the query similarly and searches the indexed stored data set to locate data that contains similar terms used in similar grammatical roles. In this manner, the SQE is able to achieve more contextually accurate search results more frequently than using traditional search engines.
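The idea of matching "similar terms used in similar grammatical roles" can be illustrated with a toy ranking function. The sketch assumes that both the query and the stored sentences have already been parsed into role-to-term mappings; the parser itself, the scoring rule, and the sample data are assumptions for illustration only.

```python
def role_match_score(query_roles, sentence_roles):
    """Count the roles on which the query and the sentence use the same term."""
    return sum(
        1 for role, term in query_roles.items()
        if sentence_roles.get(role) == term
    )

def rank(query_roles, parsed_sentences):
    """Return sentence ids with at least one role match, best match first."""
    scored = [
        (role_match_score(query_roles, roles), sentence_id)
        for sentence_id, roles in parsed_sentences.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [sentence_id for score, sentence_id in scored if score > 0]

# Hypothetical parser output for three stored sentences.
PARSED = {
    "s1": {"subject": "dog", "verb": "bite", "object": "man"},
    "s2": {"subject": "man", "verb": "bite", "object": "dog"},
    "s3": {"subject": "cat", "verb": "chase", "object": "mouse"},
}

# Query asking about a man biting something.
query = {"subject": "man", "verb": "bite"}
print(rank(query, PARSED))
```

A plain keyword search for "man" and "bite" would treat s1 and s2 alike; scoring by role ranks s2 first because "man" is its subject.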
Abstract:
Techniques for content recommendation are described. Some embodiments provide a content recommendation system (“CRS”) configured to recommend content items that are related to a collection of entities. A content item may be considered related to a collection of entities based on various factors, including whether and how often the content item references or otherwise covers the entities of the collection, the size of the content item, other entities that are covered by the content item but that are not in the collection, or the recency or credibility of the content item. Recommending content items may also or instead include determining entities that are related to a collection. An entity can be considered related to a collection based on various factors, such as whether the entity is of the same or similar type to entities of the collection, or whether the entity appears in some content item in a relationship with one or more entities of the collection.
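The factors listed above could be combined into a single relatedness score in many ways. The following sketch shows one possible combination; the multiplicative formula, the field names, and the assumption that credibility lies in [0, 1] are all illustrative choices, not the CRS's actual scoring.

```python
from dataclasses import dataclass

@dataclass
class ContentItem:
    entity_mentions: dict  # entity name -> number of mentions
    length_words: int      # size of the content item
    age_days: float        # time since publication
    credibility: float     # assumed to lie in [0, 1]

def relatedness(item, collection):
    """Score how related a content item is to a collection of entities."""
    in_collection = sum(
        count for entity, count in item.entity_mentions.items()
        if entity in collection
    )
    total = sum(item.entity_mentions.values())
    if in_collection == 0 or total == 0:
        return 0.0
    focus = in_collection / total                        # penalizes entities outside the collection
    density = in_collection / max(item.length_words, 1)  # normalizes by item size
    recency = 1.0 / (1.0 + item.age_days)                # newer items score higher
    return focus * density * recency * item.credibility

collection = {"Obama", "Biden"}
focused = ContentItem({"Obama": 4, "Biden": 2}, 300, 1.0, 0.9)
mixed = ContentItem({"Obama": 1, "Putin": 5}, 300, 1.0, 0.9)
unrelated = ContentItem({"Putin": 3}, 300, 1.0, 0.9)
print(relatedness(focused, collection) > relatedness(mixed, collection))
```

A weighted sum would work as well; the product is used here only so that an item scoring zero on any one factor is never recommended.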
Abstract:
Methods, systems, and techniques for cluster-based content recommendation are described. Some embodiments provide a content recommendation system (“CRS”) configured to recommend news stories about events or occurrences. In some embodiments, a news story about an event includes multiple related content items that each include an account of the event and that each reference one or more entities or categories that are represented by the CRS. In one embodiment, the CRS identifies news stories by generating clusters of related content items. Then, in response to a received query that indicates a keyterm, entity, or category, the CRS determines and provides indications of one or more news stories that are relevant to the received query. In some embodiments, at least some of these techniques are employed to implement a news story recommendation facility in an online news service.
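As an illustration of grouping related content items into news stories, the sketch below clusters items by the overlap of the entities they reference and then looks up stories by entity. The greedy single-pass strategy, the Jaccard similarity measure, and the 0.5 threshold are assumptions chosen for brevity; the abstract does not specify a clustering algorithm.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_items(items, threshold=0.5):
    """Greedy single pass: join the first sufficiently similar cluster, else start one."""
    clusters = []  # each cluster: {"ids": [...], "entities": set(...)}
    for item_id, entities in items:
        for cluster in clusters:
            if jaccard(entities, cluster["entities"]) >= threshold:
                cluster["ids"].append(item_id)
                cluster["entities"] |= entities
                break
        else:
            clusters.append({"ids": [item_id], "entities": set(entities)})
    return clusters

def stories_for(clusters, entity):
    """Return the stories (lists of item ids) that reference the given entity."""
    return [cluster["ids"] for cluster in clusters if entity in cluster["entities"]]

# Hypothetical content items with the entities each one references.
ITEMS = [
    ("n1", {"NASA", "Mars", "Perseverance"}),
    ("n2", {"NASA", "Mars"}),
    ("n3", {"Apple", "iPhone"}),
    ("n4", {"Apple", "iPhone", "Tim Cook"}),
]

clusters = cluster_items(ITEMS)
print(stories_for(clusters, "Mars"))
```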
Abstract:
Methods, techniques, and systems for using natural language processing to recommend content related to an associated text segment or document are provided. Example embodiments provide an NLP-based content recommender (“NCR”) which uses NLP-based search techniques, potentially in conjunction with context or other related information, to locate and provide content related to entities that are recognized in the associated material. NCRs may be embedded as widgets, for example on Web pages to assist users in their perusal and search for information, provided by means of browser plug-ins or other application plug-ins, provided in libraries or in standalone environments, or otherwise integrated into other code, programs, or devices. This abstract is provided to comply with rules requiring an abstract, and it is submitted with the intention that it will not be used to interpret or limit the scope or meaning of the claims.
Abstract:
Enhanced computer- and network-based methods, systems, and techniques are provided for retrieving more accurate and responsive search results when searching content for a designated entity using an off-the-shelf keyword-based search engine. For example, the embodiments described herein may be used to improve search results by eliminating off-topic results when presenting queries to an existing keyword-based search engine invoked by means of an API from an intermediating application. Example embodiments provide a Keyword-Based Search Enhancement System (“KBSES”), which enables intermediating applications to obtain information more closely related to user queries by enhancing such queries, on behalf of the user, with disambiguating information when deemed necessary. Based upon a variety of rules and heuristics, which can be modified as well, the KBSES determines whether an entity name in a user's query should be enhanced with additional disambiguating information, and to what extent, to prevent the retrieval of off-topic results.
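The decision of whether, and how much, to enhance an entity name can be illustrated with two simple rules. Everything in this sketch is hypothetical: the `SENSES` table, the rule that only ambiguous names are enhanced, the `max_terms` knob, and the quoted-phrase query syntax are assumptions and do not describe the actual KBSES heuristics.

```python
# Hypothetical knowledge: entity name -> known senses, each with disambiguating terms.
SENSES = {
    "Jaguar": {"animal": ["cat", "wildlife"], "car": ["automobile"]},
    "Barack Obama": {"person": ["president"]},
}

def enhance_query(name, intended_sense, max_terms=1):
    """Return a keyword query for the entity, adding disambiguating terms if needed."""
    senses = SENSES.get(name, {})
    # Rule 1: unknown or unambiguous names are passed through unchanged.
    if len(senses) <= 1:
        return f'"{name}"'
    # Rule 2: ambiguous names receive up to max_terms disambiguating terms.
    extra = senses.get(intended_sense, [])[:max_terms]
    return " ".join([f'"{name}"', *extra])

print(enhance_query("Jaguar", "animal"))
print(enhance_query("Jaguar", "animal", max_terms=2))
print(enhance_query("Barack Obama", "person"))
```

The `max_terms` parameter stands in for the "to what extent" part of the decision: adding more terms narrows the results further, at the risk of excluding on-topic documents.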
Abstract:
Techniques for providing quotations obtained from text documents using natural language processing techniques are described. Some embodiments provide a content recommendation system (“CRS”) configured to provide quotations by extracting quotations from a corpus of text documents, and providing access to the extracted quotations in response to search requests received from users. The CRS may extract quotations by using natural language processing-based techniques to identify one or more entities, such as people, places, objects, concepts, or the like, that are referenced by the extracted quotations. The CRS may then store the extracted quotations along with identified entities, such as quotation speakers and subjects, for later access via search requests.
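The extract-store-search flow for quotations can be sketched with a deliberately simple pattern match. The regular expression below is only a stand-in for the natural language processing described above: it assumes straight double quotes, a "said" or "says" attribution immediately after the quotation, and capitalized speaker names. The function names and sample text are hypothetical.

```python
import re
from collections import defaultdict

# Simplified pattern: "quoted text" said|says Capitalized Name
QUOTE_RE = re.compile(
    r'"([^"]+)"\s*(?:said|says)\s+([A-Z][a-z]+(?:\s[A-Z][a-z]+)*)'
)

def extract_quotations(text):
    """Return (speaker, quotation) pairs found in the text."""
    return [
        (match.group(2), match.group(1).rstrip(",."))
        for match in QUOTE_RE.finditer(text)
    ]

def build_speaker_index(docs):
    """Store each extracted quotation under its speaker for later lookup."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for speaker, quote in extract_quotations(text):
            index[speaker].append((doc_id, quote))
    return index

docs = {
    "d1": '"We will rebuild," said Jane Doe. '
          'Later, "Costs are rising," said John Smith on Tuesday.',
}
index = build_speaker_index(docs)
print(index["Jane Doe"])
```

A fuller version would also index quotation subjects, which requires entity recognition inside the quoted text rather than pattern matching on the attribution.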