Abstract:
Embodiments are directed towards a Modified Sequitur algorithm (MSA) using pipelining and indexed arrays to identify trending topics within a plurality of documents having user generated content (UGC). The documents are parallelized and distributed across a plurality of network devices, which place at least some of the received documents into a buffer. The MSA may then be applied to the documents within the buffer to identify n-grams or phrases within the documents' contents. The identified phrases are further analyzed to remove extraneous co-occurrences of phrases and/or words based on a part-of-speech analysis. A weighting of the remaining phrases is used to identify trending topic phrases. Links to content in the plurality of UGC documents that is associated with the trending topic phrases may then be displayed to a client device.
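The buffer-then-extract step can be illustrated with a much simpler stand-in. The sketch below is not the Modified Sequitur algorithm; it only counts fixed-length n-grams that recur across documents in a buffer. The function names, the whitespace tokenizer, and the min_docs threshold are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def repeated_phrases(buffer, n=2, min_docs=2):
    """Return (phrase, document_count) pairs for n-grams seen in >= min_docs documents.

    Assumption: plain n-gram counting stands in for the MSA, and the
    document count stands in for the (unspecified) phrase weighting.
    """
    doc_freq = Counter()
    for doc in buffer:
        tokens = doc.lower().split()
        # Count each n-gram at most once per document.
        doc_freq.update(set(ngrams(tokens, n)))
    hits = [(" ".join(gram), count) for gram, count in doc_freq.items() if count >= min_docs]
    # Highest count first, ties broken alphabetically for a stable order.
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


buffer = ["solar eclipse tonight", "watch the solar eclipse", "rain tonight"]
print(repeated_phrases(buffer))
```

A real pipeline would additionally filter the phrases by part of speech and apply a trend-aware weighting, neither of which is modeled here.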
Abstract:
Embodiments are directed towards modifying a distribution of writers as either a push writer or a pull writer based on a cost model that decides for a given content reader whether it is more effective for the writer to be a pull writer or a push writer. A cache is maintained for each content reader for caching content items pushed by a push writer in the content writer's push list of writers when the content is generated. At query time, content items are pulled by the content reader based on writers in a content reader's pull list. One embodiment of the cost model employs data about a previous number of requests for content items for a given writer for a number of previous blended display results of content items. When a writer is determined to be popular, mechanisms are proposed for pushing content items to a plurality of content readers.
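The push/pull split can be sketched as follows. The cost rule used here (push when the writer's update rate, scaled by a push cost, does not exceed the reader's query rate, scaled by a pull cost) is an assumed illustration, not the cost model of the embodiments; ContentReader, assign_writer, on_publish, and on_query are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class ContentReader:
    query_rate: float                               # assumed unit: feed requests per hour
    push_list: set = field(default_factory=set)     # writers whose items are pushed to this reader
    pull_list: set = field(default_factory=set)     # writers queried at read time
    cache: list = field(default_factory=list)       # per-reader cache of pushed items


def assign_writer(reader, writer_id, writer_rate, push_cost=1.0, pull_cost=1.0):
    """Place a writer on the reader's push or pull list (assumed cost rule)."""
    reader.push_list.discard(writer_id)
    reader.pull_list.discard(writer_id)
    # Pushing costs one write per new item; pulling costs one fetch per query.
    if writer_rate * push_cost <= reader.query_rate * pull_cost:
        reader.push_list.add(writer_id)
        return "push"
    reader.pull_list.add(writer_id)
    return "pull"


def on_publish(reader, writer_id, item):
    """At generation time, cache the item only if the writer is a push writer."""
    if writer_id in reader.push_list:
        reader.cache.append((writer_id, item))


def on_query(reader, fetch_from_writer):
    """At query time, blend cached (pushed) items with items pulled from pull writers."""
    pulled = [(w, item) for w in sorted(reader.pull_list) for item in fetch_from_writer(w)]
    return reader.cache + pulled
```

For example, a reader who queries ten times an hour would, under this assumed rule, treat a writer posting twice an hour as a push writer and a writer posting fifty times an hour as a pull writer.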
Abstract:
A query-centric system and process for distributing reverse indices for a distributed content system. Relevance ranking techniques are used in organizing distributed system indices. Query-centric configuration subprocesses (1) analyze query data, partitioning terms for reverse index server(s) (RIS), (2) distribute each partitioned data set by generally localizing search terms for the RIS that have some query-centric correlation, and (3) generate and maintain a map for the partitioned reverse index system terms by mapping the terms for the reverse index to a plurality of different index server nodes. An indexing subprocess element builds distributed reverse indices from content host indices. Routines of the query execution use the map derived in the configuration to more efficiently return more relevant search results to the searcher.
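One way to picture the configuration subprocesses is to group terms that appear together in logged queries and map each group to a server. The sketch below assumes co-occurrence in a query is the "query-centric correlation" and uses a union-find structure with round-robin assignment; both choices, and the name partition_terms, are illustrative assumptions.

```python
def partition_terms(queries, num_servers):
    """Map each query term to a server index so co-queried terms share a server."""
    parent = {}

    def find(term):
        # Union-find lookup with path halving.
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]
            term = parent[term]
        return term

    for query in queries:
        terms = query.lower().split()
        for term in terms:
            find(term)                      # register every term
        for term in terms[1:]:
            parent[find(term)] = find(terms[0])   # merge with the first term's group

    groups = {}
    for term in parent:
        groups.setdefault(find(term), []).append(term)

    # Assumption: groups are spread round-robin over the index servers.
    term_map = {}
    for index, root in enumerate(sorted(groups)):
        for term in groups[root]:
            term_map[term] = index % num_servers
    return term_map


term_map = partition_terms(["red shoes", "shoes sale", "weather today"], num_servers=2)
print(term_map)
```

A query for "red shoes" can then be answered by a single index server, since both terms map to the same node.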
Abstract:
The present invention is directed towards systems and methods for organization of bookmarks. The method according to one embodiment comprises retrieving one or more bookmarks associated with one or more content items, a given bookmark generated by a user of a client device, and identifying one or more tags associated with one or more uniform resource locators corresponding to the one or more bookmarks. A bookmark folder hierarchy is created through use of a clustering algorithm on the basis of the one or more tags associated with the one or more uniform resource locators corresponding to the one or more bookmarks.
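The abstract does not name the clustering algorithm. As a minimal stand-in, the sketch below builds folders by repeatedly splitting on the most frequent remaining tag; build_hierarchy, the "_urls" key, and the max_depth limit are hypothetical choices for illustration only.

```python
from collections import Counter


def build_hierarchy(bookmarks, used=frozenset(), max_depth=2):
    """Build a nested folder dict from {url: set_of_tags}.

    Each folder is a dict whose "_urls" entry lists bookmarks filed directly
    in it; every other key is a tag naming a subfolder.
    Assumption: splitting on the most frequent tag stands in for clustering.
    """
    tree = {"_urls": []}
    remaining = dict(bookmarks)
    while remaining:
        counts = Counter(tag for tags in remaining.values() for tag in tags - used)
        if max_depth == 0 or not counts:
            # No further split possible: file what is left in this folder.
            tree["_urls"].extend(sorted(remaining))
            break
        # Most frequent tag wins; ties are broken alphabetically.
        tag = min(counts, key=lambda t: (-counts[t], t))
        members = {url: tags for url, tags in remaining.items() if tag in tags}
        tree[tag] = build_hierarchy(members, used | {tag}, max_depth - 1)
        for url in members:
            del remaining[url]
    return tree


bookmarks = {"a": {"python", "web"}, "b": {"python"}, "c": {"news"}}
print(build_hierarchy(bookmarks, max_depth=1))
```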
Abstract:
In a method for increasing peer privacy, a path for information is formed from a provider to a requestor through a plurality of peers in response to a received request for the information. Each peer of the plurality of peers receives a respective set-up message comprising a predetermined label and an identity of a next peer for the information. The information is transferred over the path in a message, where the message comprises a message label configured to determine a next peer according to the path in response to the message label matching the previously received predetermined label.
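The label-matching step can be sketched as a small forwarding table per peer. The sketch assumes a single label for the whole path and marks the requestor with a next peer of None; the Peer class and the form_path and transfer helpers are hypothetical names.

```python
class Peer:
    def __init__(self, name):
        self.name = name
        self.next_by_label = {}   # label -> identity of the next peer (None = deliver here)
        self.inbox = []

    def set_up(self, label, next_peer):
        """Handle a set-up message: remember the label and the next peer."""
        self.next_by_label[label] = next_peer


def form_path(peers, label, path):
    """Send each peer on the path its own set-up message (assumed: one label per path)."""
    for current, nxt in zip(path, path[1:]):
        peers[current].set_up(label, nxt)
    peers[path[-1]].set_up(label, None)   # the last peer is the requestor


def transfer(peers, start, label, payload):
    """Forward the payload hop by hop while the message label matches a stored label."""
    hops, current = [start], start
    while label in peers[current].next_by_label:
        nxt = peers[current].next_by_label[label]
        if nxt is None:
            peers[current].inbox.append(payload)
            return hops, True
        hops.append(nxt)
        current = nxt
    return hops, False   # label did not match: the message is dropped


peers = {name: Peer(name) for name in "ABCD"}
form_path(peers, "L7", ["A", "B", "C", "D"])
print(transfer(peers, "A", "L7", "data"))
```

Each peer knows only the label and its immediate next peer, so no intermediate peer needs to know both the provider and the requestor.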
Abstract:
A method and a computer-readable medium are provided which perform screen scraping via grammar induction. The computer-readable medium stores instructions of the method, the instructions directing a computer processor to intercept display information transmitted to a computer-implemented display device representing information stored in a data source; induce a grammar via statistical analysis of the intercepted display information; provide the grammar to a parser-generator to generate a parser corresponding to the induced grammar; and perform screen scraping using the generated parser.
Abstract:
A method for sharing content with a user includes receiving from a user a first set of keywords for annotating an annotated user; receiving from the user a second set of keywords that designate whether annotated content annotated by at least one keyword included in the second set of keywords may be shared with the annotated user; storing in a data store a first association of the first set of keywords with the annotated user, and a second association of the second set of keywords with the annotated user; receiving a keyword selection for a select keyword and an identifier for the annotated user; and displaying on the client system content annotated by the select keyword if the annotated user is annotated by at least one keyword in the first set of keywords, and if the select keyword is included in the second set of keywords.
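The two display conditions can be read as a simple predicate over the stored associations. In the sketch below, UserAnnotation, shared_content, and the dictionary-based data store and content index are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class UserAnnotation:
    describing: set = field(default_factory=set)   # first set: keywords annotating the user
    shareable: set = field(default_factory=set)    # second set: keywords whose content may be shared


def shared_content(store, content_index, annotated_user, select_keyword):
    """Return content annotated by select_keyword that may be shown for annotated_user.

    store maps a user identifier to a UserAnnotation; content_index maps a
    content identifier to its set of keywords (both hypothetical structures).
    """
    annotation = store.get(annotated_user)
    if annotation is None or not annotation.describing:
        return []   # the user is not annotated by any keyword in the first set
    if select_keyword not in annotation.shareable:
        return []   # the selected keyword is not in the second set
    return sorted(item for item, keywords in content_index.items() if select_keyword in keywords)


store = {"bob": UserAnnotation(describing={"friend"}, shareable={"hiking"})}
content = {"photo1": {"hiking", "2009"}, "memo": {"work"}}
print(shared_content(store, content, "bob", "hiking"))
```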
Abstract:
A message is received from a first wireless node in a first wireless community. The message is for a second wireless node in a second wireless community. Location information for the second wireless node is determined using a distributed hash table (DHT) overlay network. The message is routed to the second wireless community using the location information.
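The lookup step can be sketched with a toy in-process DHT. The sketch assumes a consistent-hashing ring keyed by SHA-1 and stores the destination's community name as its location information; TinyDHT, route, and the node and community names are hypothetical.

```python
import bisect
import hashlib


def key_hash(key):
    """Hash a string key onto the ring (SHA-1 is an assumed choice)."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


class TinyDHT:
    """Toy DHT: each key is stored on the first overlay node clockwise from its hash."""

    def __init__(self, overlay_nodes):
        self.ring = sorted((key_hash(node), node) for node in overlay_nodes)
        self.storage = {node: {} for node in overlay_nodes}

    def _owner(self, key):
        positions = [position for position, _ in self.ring]
        index = bisect.bisect_left(positions, key_hash(key)) % len(self.ring)
        return self.ring[index][1]

    def put(self, key, value):
        self.storage[self._owner(key)][key] = value

    def get(self, key):
        return self.storage[self._owner(key)].get(key)


def route(dht, message, destination_node):
    """Look up the destination's community in the DHT and address the message to it."""
    community = dht.get(destination_node)
    if community is None:
        raise LookupError(f"no location information for {destination_node}")
    return community, message


dht = TinyDHT(["gateway-1", "gateway-2", "gateway-3"])
dht.put("node-b", "community-2")
print(route(dht, "hello", "node-b"))
```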
Abstract:
A data model represents semantic information associated with objects stored in a file system. The data model includes a first object identifier, a second object identifier and a relation identifier. The first object identifier identifies a first object stored in the file system. The second object identifier identifies a second object stored in the file system, wherein the second object is related to the first object. The relation identifier identifies a relationship between the first object and the second object.
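The three identifiers map naturally onto a small record type. The sketch below is one possible representation; the Relation class, the related_objects helper, and the example paths and relation names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    first_object_id: str    # identifies the first object stored in the file system
    second_object_id: str   # identifies the related second object
    relation_id: str        # identifies the relationship between the two objects


def related_objects(relations, object_id, relation_id=None):
    """Return identifiers of objects related to object_id, optionally filtered by relation."""
    return [
        r.second_object_id
        for r in relations
        if r.first_object_id == object_id
        and (relation_id is None or r.relation_id == relation_id)
    ]


relations = [
    Relation("/docs/report.txt", "/docs/figures.png", "illustrated-by"),
    Relation("/docs/report.txt", "/docs/draft.txt", "derived-from"),
]
print(related_objects(relations, "/docs/report.txt", "derived-from"))
```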
Abstract:
In a method for creating an expressway for overlay routing, an existing peer-to-peer network is organized into a plurality of zones. The plurality of zones is organized into a plurality of levels. Neighboring zones are identified for each zone of the plurality of zones. One or more representatives are identified for each neighboring zone. A routing table is created based on the plurality of zones, the neighboring zones, the one or more representatives, and the plurality of levels.