摘要:
A method may include obtaining, based on a content of a search query, one or more documents in a first language; identifying one or more documents in a second language that contain an anchor that links to the one or more documents in the first language, the second language being different than the first language; and translating one or more terms of the search query into the second language using content included in the one or more documents in the second language.
摘要:
Near-duplicate documents may be identified by (a) accepting a set of documents, (b) processing the set of documents to determine a first set of near-duplicate documents using a first document similarity technique, and (c) processing the first set of near duplicate documents to determine a second set of near-duplicate documents using a second document similarity technique. The first document similarity technique might be token order dependent, and the second document similarity technique might be order independent. The first document similarity technique might be token frequency independent, and the second document similarity technique might be frequency dependent. The first document similarity technique might determine whether two documents are near-duplicates using representations based on a subset of the words or tokens of the documents, and the second document similarity technique might determine whether two documents are near-duplicates using representations based on all of the words or tokens of the documents. The first document similarity technique might use set intersection to determine whether or not documents are near-duplicates, and the second document similarity technique might use random projections to determine whether or not documents are near-duplicates.
摘要:
A search query for a search engine may be improved by incorporating alternate terms into the search query that are semantically similar to terms of the search query, taking into account information derived from the search query. An initial set of alternate terms that may be semantically similar to the original terms in the search query is generated. The initial set of alternate terms may be compared to information derived from the original search query. One example of such information is a set of documents retrieved in response to a search performed using the initial search query. One or more of the alternate terms may be added to the original search query based on their relationship to the information derived from the original search query.
摘要:
A system limits search results based on context information. The system obtains the context information and a search query, and obtains a set of references to documents in response to the search query. The system then filters the set of references based on the context information and presents the filtered set of references to a user.
摘要:
A system performs cross-language query translations. The system receives a search query that includes terms in a first language and determines possible translations of the terms of the search query into a second language. The system also locates documents for use as parallel corpora to aid in the translation by: (1) locating documents in the first language that contain references that match the terms of the search query and identify documents in the second language; (2) locating documents in the first language that contain references that match the terms of the query and refer to other documents in the first language and identify documents in the second language that contain references to the other documents; or (3) locating documents in the first language that match the terms of the query and identify documents in the second language that contain references to the documents in the first language. The system may use the second language documents as parallel corpora to disambiguate among the possible translations of the terms of the search query and identify one of the possible translations as a likely translation of the search query into the second language.
摘要:
In a computer system, an apparatus is configured to collect performance data of a computer system including a plurality of processors for concurrently executing instructions of a program. A plurality of performance counters are coupled to each processor. The performance counters store performance data generated by each processor while executing the instructions. An interrupt handler executes on each processors, the interrupt handler samples the performance data of the processor in response to interrupts. A first memory includes a hash table associated with each interrupt handler, the hash table stores the performance data sampled by the interrupt handler executing on the processor. A second memory includes an overflow buffer, the overflow buffer stores the performance data while portions of the hash tables are active or full. A third memory includes a user buffer, and means are provided for periodically flushing the performance data from the hash tables and the overflow to the user buffer.
摘要:
A computer-implemented method for assigning computing tasks to computers in a group is disclosed. The method includes determining, for each computer in a group of computers, an ability of the computer to execute tasks expected to be received by the group of computers; generating an ability indicator for each computer based on the ability determination for the computer; and assigning an incoming computing task to one of the computers using the ability indicator.
摘要:
A system facilitates a search by a user. The system detects selection of one or more words in a document currently accessed by the user, generates a search query using the selected word(s), and retrieves a document based on the search query. When the document includes one or more links corresponding to a linked document, the system analyzes each of the links, prefetches the linked documents corresponding to a number of the links, and presents the document to the user. The system receives selection of one of the links and retrieves the linked document corresponding to the selected link. The system identifies one or more pieces of information in the retrieved document, determines a link to a related document for each of the identified pieces of information, and provides the determined links with the related document to the user.
摘要:
A system receives a voice search query from a user, derives recognition hypotheses from the voice search query, and determines scores associated with the recognition hypotheses, the scores being based on a comparison of the recognition hypotheses to previously received search queries. The system discards at least one of the recognition hypotheses that is associated with a first score that is less than a threshold value, and constructs a first query using at least one non-discarded recognition hypothesis, where the at least one first non-discarded recognition hypothesis is associated with a second score that at least meets the threshold value. The system forwards the first query to a search system, receives first results associated with the first query, and provides the first results to the user.
摘要:
Methods and apparatus consistent with the invention provide improved organization of documents responsive to a search query. In one embodiment, a search query is received and a list of responsive documents is identified. The responsive documents are organized based in whole or in part on usage statistics.