Abstract:
The present disclosure relates to a method and an apparatus for mining synonymous phrases. The method comprises: obtaining, according to a parallel text corpus, a first phrase-alignment relationship from phrases of a current language to phrases of an intermediate language, and a second phrase-alignment relationship from the phrases of the intermediate language to the phrases of the current language; obtaining, for a target phrase of the current language, a first set of aligned phrases of the intermediate language that are aligned with the target phrase based on the first phrase-alignment relationship; obtaining a second set of aligned phrases of the current language that are aligned with selected phrase(s) in the first set of aligned phrases based on the second phrase-alignment relationship; and obtaining synonymous phrases for the target phrase from the second set of aligned phrases.
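The two-step lookup through the intermediate language can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the alignment relationships are assumed to be dictionaries of alignment scores, the "selected phrase(s)" are assumed to be the top-scoring intermediate phrases, and candidates are ranked by the summed product of forward and backward scores. The function name `mine_synonyms`, the parameter `top_k`, and the toy data are all hypothetical.

```python
from collections import defaultdict

def mine_synonyms(forward, backward, target, top_k=2):
    """Rank candidate synonyms of `target` via an intermediate language.

    forward:  current-language phrase -> {intermediate phrase: score}
    backward: intermediate phrase -> {current-language phrase: score}
    Both score tables are assumed inputs, e.g. derived from a parallel corpus.
    """
    # First set: intermediate-language phrases aligned with the target.
    first_set = forward.get(target, {})
    # Assumption: "selected" phrases are the top_k by forward score.
    selected = sorted(first_set, key=first_set.get, reverse=True)[:top_k]

    # Second set: current-language phrases aligned with the selected pivots.
    scores = defaultdict(float)
    for pivot in selected:
        for candidate, back_score in backward.get(pivot, {}).items():
            if candidate != target:  # the target is not its own synonym
                scores[candidate] += first_set[pivot] * back_score
    return sorted(scores, key=scores.get, reverse=True)

# Toy alignment tables (made-up scores, English pivoting through German).
forward = {"cell phone": {"Handy": 0.7, "Mobiltelefon": 0.3}}
backward = {
    "Handy": {"cell phone": 0.5, "mobile phone": 0.4, "handset": 0.1},
    "Mobiltelefon": {"cell phone": 0.4, "mobile phone": 0.6},
}
synonyms = mine_synonyms(forward, backward, "cell phone")
```

With these toy scores, "mobile phone" ranks ahead of "handset" because both pivots point back to it.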
Abstract:
Acquiring dynamic data is disclosed including extracting a search term from a search request string that is received, looking up the search term in a threshold value dictionary to acquire a dynamic threshold score corresponding to the search term, using the search term as a query condition and the dynamic threshold score corresponding to the search term as a filter condition to acquire, in an index data table, one or more corresponding pieces of index information, acquiring data information corresponding to the search term based on the index information in the index data table, and sending the data information to be displayed in a page of a website. The dynamic threshold score varies based on a characteristic factor.
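The lookup-then-filter flow above might look like the sketch below. Everything concrete here is an assumption made for illustration: the query parameter name `q`, the in-memory dictionaries standing in for the threshold value dictionary and the index data table, the default threshold, and the rule that an entry passes when its score is at least the threshold.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical threshold value dictionary: search term -> dynamic threshold score.
THRESHOLDS = {"phone": 0.8, "rare widget": 0.3}
DEFAULT_THRESHOLD = 0.5  # assumed fallback for terms missing from the dictionary

# Hypothetical index data table: search term -> [(data id, score)].
INDEX = {
    "phone": [("d1", 0.9), ("d2", 0.6)],
    "rare widget": [("d3", 0.4)],
}
DATA = {"d1": "Phone A", "d2": "Phone B", "d3": "Widget"}

def extract_search_term(request_string):
    """Pull the search term out of the request string (parameter name assumed)."""
    params = parse_qs(urlsplit(request_string).query)
    return params.get("q", [""])[0].strip().lower()

def acquire_data(request_string):
    term = extract_search_term(request_string)
    threshold = THRESHOLDS.get(term, DEFAULT_THRESHOLD)
    # Term is the query condition; the dynamic threshold is the filter condition.
    index_info = [data_id for data_id, score in INDEX.get(term, []) if score >= threshold]
    return [DATA[data_id] for data_id in index_info]

popular = acquire_data("search?q=phone")
rare = acquire_data("search?q=rare+widget")
```

A frequently searched term gets a stricter threshold and returns fewer entries, while a rare term with a looser threshold still returns its one low-scoring entry.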
Abstract:
Searching information includes: receiving current query data from a client; extracting generic attribute features of the current query data, wherein the generic attribute features are used for calculating a plurality of confidence degrees of the current query data that correspond to a plurality of categories, each of the confidence degrees indicating a degree of confidence that the current query data belongs to a respective one of the plurality of categories; determining the plurality of confidence degrees of the current query data based at least in part on the generic attribute features; selecting a category based at least in part on the plurality of confidence degrees, the selected category being one of the plurality of categories and having a confidence degree higher than a confidence degree of another category; searching in the selected category for a search result that corresponds to the current query data; and returning the search result.
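A rough sketch of the confidence-based category selection follows. The abstract does not say how the generic attribute features or the confidence degrees are computed, so this example assumes the features are simply query tokens, that each category has a hypothetical table of token weights, and that confidences are the normalized weight sums. All names and data are illustrative.

```python
def category_confidences(query, category_terms):
    """Return a confidence degree per category (normalized token-weight sums)."""
    tokens = query.lower().split()
    raw = {
        category: sum(weights.get(token, 0.0) for token in tokens)
        for category, weights in category_terms.items()
    }
    total = sum(raw.values())
    if total == 0:
        # No evidence for any category: fall back to uniform confidence.
        return {category: 1.0 / len(raw) for category in raw}
    return {category: value / total for category, value in raw.items()}

def search(query, category_terms, catalog):
    confidences = category_confidences(query, category_terms)
    selected = max(confidences, key=confidences.get)
    tokens = set(query.lower().split())
    # Search only inside the selected category.
    results = [item for item in catalog[selected] if tokens & set(item.lower().split())]
    return selected, results

# Hypothetical per-category token weights and catalog.
category_terms = {
    "footwear": {"shoes": 2.0, "running": 1.0},
    "electronics": {"phone": 2.0, "running": 0.5},
}
catalog = {
    "footwear": ["red running shoes size 9", "leather boots"],
    "electronics": ["running tracker phone app"],
}
selected, results = search("red running shoes", category_terms, catalog)
```

Restricting the search to the selected category keeps the electronics item out of the results even though it shares the token "running" with the query.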
Abstract:
Ordering search results may include: obtaining an exposed log file from a log system; computing a Bayesian posterior probability for relevancy between the log file and a search request; computing an expected value of that relevancy based on the Bayesian posterior probability; and storing, in a search data structure, the search request and an identifier of the log file as a key and the expected value of the relevancy as the corresponding value. In response to receiving a search request submitted by a user, expected values of relevancy between the submitted search request and the log files relevant to it are found in the search data structure, and the found log files are ordered in descending order of the expected values.
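One way to picture the posterior-based ordering is the sketch below. The abstract does not specify the probability model, so this example assumes a Beta-Binomial model: each exposure record says whether the exposed entry was clicked, the prior is Beta(1, 1), and the expected relevancy is the posterior mean. The record layout and all names are hypothetical.

```python
from collections import defaultdict

def build_relevancy_table(exposure_log, prior_a=1.0, prior_b=1.0):
    """Map (search request, file id) -> expected relevancy (posterior mean).

    exposure_log: iterable of (request, file_id, clicked) records (assumed layout).
    Assumes a Beta(prior_a, prior_b) prior over the click probability.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [exposures, clicks]
    for request, file_id, clicked in exposure_log:
        counts[(request, file_id)][0] += 1
        counts[(request, file_id)][1] += int(clicked)
    return {
        key: (clicks + prior_a) / (exposures + prior_a + prior_b)
        for key, (exposures, clicks) in counts.items()
    }

def order_results(table, request):
    """Return file ids for `request`, highest expected relevancy first."""
    hits = [(file_id, value) for (req, file_id), value in table.items() if req == request]
    return [file_id for file_id, _ in sorted(hits, key=lambda hit: hit[1], reverse=True)]

exposure_log = [
    ("shoes", "f1", True), ("shoes", "f1", True), ("shoes", "f1", False),
    ("shoes", "f2", False),
    ("hats", "f3", True),
]
table = build_relevancy_table(exposure_log)
ranking = order_results(table, "shoes")
```

Using the posterior mean rather than the raw click ratio keeps an entry with a single unclicked exposure from being scored as exactly zero.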
Abstract:
Embodiments of the present application relate to a dynamic data acquisition method, a dynamic data acquisition system, and a computer program product for dynamically acquiring data. A dynamic data acquisition method is provided. The method includes extracting a search term from a search request string that is received, looking up the search term in a threshold value dictionary to acquire a dynamic threshold score corresponding to the search term, using the search term as a query condition and the dynamic threshold score corresponding to the search term as a filter condition to acquire, in an index data table, one or more corresponding pieces of index information, acquiring data information corresponding to the search term based on the index information in the index data table, and sending the data information to be displayed in a page of a website. The dynamic threshold score varies based on a characteristic factor.
Abstract:
Candidate promotion keywords are selected. Features of the candidate promotion keywords are extracted. The features include at least one of a search engine feature, an effect feature of non-directed traffic, and a text feature. The features of the candidate promotion keywords are used as input data of a pre-established keyword screening model, and superior promotion keywords are obtained according to a prediction result of the keyword screening model.
Abstract:
Methods and systems for establishing a click-through rate (CTR) estimation model are disclosed. A computing device may extract basic characteristics corresponding to a current language channel associated with a server provider. The computing device may combine the basic characteristics to obtain a combination characteristic. The computing device may further obtain an effective high-order characteristic based on the basic characteristics and the combination characteristic and calculate a weight of the effective high-order characteristic. The computing device may generate the CTR estimation model by applying a CTR equation to the weight corresponding to the effective high-order characteristic. Because the implementations are not limited by human factors, they may achieve high efficiency in establishing CTR estimation models and high accuracy of the resulting models.
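The combine-select-weight pipeline can be illustrated with the sketch below. Several points are assumptions, since the abstract leaves them open: combination characteristics are formed as pairwise crosses of basic characteristics, a characteristic counts as "effective" when it occurs in at least `min_count` samples, the weights are supplied rather than learned, and the "CTR equation" is taken to be the logistic function. All names and values are hypothetical.

```python
import math
from itertools import combinations

def combine(basic):
    """Cross every pair of basic characteristics into a combination characteristic."""
    return [f"{a}&{b}" for a, b in combinations(sorted(basic), 2)]

def effective_features(samples, min_count=2):
    """Keep characteristics seen in at least min_count samples (assumed criterion)."""
    counts = {}
    for basic in samples:
        for feature in list(basic) + combine(basic):
            counts[feature] = counts.get(feature, 0) + 1
    return {feature for feature, count in counts.items() if count >= min_count}

def predict_ctr(basic, weights, bias=0.0):
    """Assumed CTR equation: logistic function of the summed feature weights."""
    features = list(basic) + combine(basic)
    z = bias + sum(weights.get(feature, 0.0) for feature in features)
    return 1.0 / (1.0 + math.exp(-z))

samples = [
    ["lang=en", "pos=top"],
    ["lang=en", "pos=top"],
    ["lang=en", "pos=side"],
]
kept = effective_features(samples)
# Hypothetical weights; a real system would fit them to click logs.
weights = {"lang=en": 0.2, "pos=top": 0.5, "lang=en&pos=top": 0.3}
ctr_top = predict_ctr(["lang=en", "pos=top"], weights, bias=-2.0)
ctr_side = predict_ctr(["lang=en", "pos=side"], weights, bias=-2.0)
```

Characteristics without a weight contribute nothing to the sum, so dropping ineffective crosses simply means leaving them out of the weight table.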
Abstract:
A method and an apparatus of matching an object to be displayed are disclosed. The method includes obtaining a plurality of search keywords and released product information and grouping each of the plurality of search keywords with the released product information to form a plurality of search keyword and released product information pairs, with each search keyword and released product information pair comprising a respective search keyword and the released product information; determining and matching a plurality of features for the plurality of search keyword and released product information pairs according to a constructed first decision tree; and determining respective correlation classes of the plurality of search keyword and released product information pairs based at least in part on a result of determining and matching of the plurality of features. The disclosed method and apparatus are able to accurately and conveniently determine a matching degree between a search keyword and released product information.
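The pairing and tree-based classification can be sketched as follows. The features, thresholds, tree shape, and class labels below are invented for illustration; the abstract only states that features of each pair are determined and matched according to a constructed decision tree, without saying which features or how the tree is built.

```python
def pair_features(keyword, product):
    """Compute illustrative features for one (search keyword, product) pair."""
    kw_tokens = set(keyword.lower().split())
    title_tokens = set(product["title"].lower().split())
    category_tokens = set(product["category"].lower().split())
    overlap = len(kw_tokens & title_tokens) / len(kw_tokens) if kw_tokens else 0.0
    return {
        "title_overlap": overlap,
        "category_match": 1.0 if kw_tokens & category_tokens else 0.0,
    }

# Hand-built tree: (feature, threshold, subtree if below, subtree if at or above).
# Leaves are correlation class labels. The structure is an assumption.
TREE = (
    "title_overlap", 0.5,
    ("category_match", 0.5, "irrelevant", "partially relevant"),
    "relevant",
)

def classify(features, node=TREE):
    """Walk the tree until a leaf (a string label) is reached."""
    while isinstance(node, tuple):
        feature, threshold, below, at_or_above = node
        node = at_or_above if features[feature] >= threshold else below
    return node

product = {"title": "Wireless Bluetooth Headphones", "category": "audio"}
keywords = ["bluetooth headphones", "audio cable", "garden hose"]
# Group each search keyword with the released product information.
classes = {kw: classify(pair_features(kw, product)) for kw in keywords}
```

Each keyword is paired with the same released product information, and the leaf reached in the tree gives that pair's correlation class.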