Abstract:
Systems, devices and methods are described herein for segmentation of content, and more specifically for segmentation of content in a content management system. In one aspect, a method may include receiving content associated with speech, text, or closed captioning data. The speech, the text, or the closed captioning data may be analyzed to derive at least one of a topic, subject, or event for at least a portion of the content. The content may be divided into two or more content segments based on the analyzing. At least one of the topic, the subject, or the event may be associated with at least one of the two or more content segments based on the analyzing. At least one of the two or more content segments may then be published such that each of the two or more content segments is individually accessible.
Abstract:
A system includes storage of data of a hierarchy, where each node of the hierarchy is represented by a row, and each row includes a level of its respective node, a pointer to a lower bound entry of an order index structure associated with the hierarchy, and a pointer to an upper bound entry of the order index structure associated with the hierarchy, reception of a pointer l, and determination of an entry e of the order index structure to which the received pointer l points.
Abstract:
The present disclosure is directed to providing dynamic indexer discovery. An index manager, which may also be known as a cluster master, is configured to track the statuses and capabilities of indexers and provide the statuses and capabilities obtained from the indexers to data collectors, such as forwarders. The data collectors may use the statuses and capabilities associated with the indexers to load balance transmission of data to the indexers. Dynamic indexer discovery may eliminate the need to manually reconfigure data collectors when the status of an indexer changes because the information may be obtained from the index manager without the need to reinitialize the data collectors.
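The discovery flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the class names `IndexManager` and `Forwarder`, the `"up"`/`"down"` status strings, and the numeric `capacity` used as a load-balancing weight are all hypothetical.

```python
import random


class IndexManager:
    """Hypothetical cluster master that tracks indexer status and capacity."""

    def __init__(self):
        self._indexers = {}  # name -> {"status": str, "capacity": int}

    def report(self, name, status, capacity):
        # Called whenever an indexer's status or capability changes.
        self._indexers[name] = {"status": status, "capacity": capacity}

    def available(self):
        # Only indexers currently marked "up" are offered to data collectors.
        return {
            name: info["capacity"]
            for name, info in self._indexers.items()
            if info["status"] == "up"
        }


class Forwarder:
    """Hypothetical data collector that asks the manager before each send."""

    def __init__(self, manager, rng=None):
        self.manager = manager
        self.rng = rng or random.Random()

    def pick_indexer(self):
        # Query the manager on every call, so no reconfiguration or restart
        # is needed when an indexer's status changes.
        available = self.manager.available()
        if not available:
            raise RuntimeError("no indexers available")
        names = list(available)
        weights = [available[name] for name in names]  # assumed to be > 0
        return self.rng.choices(names, weights=weights, k=1)[0]
```

Because the forwarder consults the manager at send time, marking an indexer as down removes it from rotation without touching the forwarder's configuration.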
Abstract:
A method, system, and computer program product to manage a database is disclosed. The method, system, and computer program product may include structuring the database to have a first table having an index and a second table. A first key of the first table may be related to a second key of the second table. The method, system, and computer program product may include creating an entry locator in the index. The method, system, and computer program product may include maintaining an association between the second key of the second table and the entry locator of the index.
Abstract:
A computer-implemented technique can include receiving, at a server, labeled training data including a plurality of groups of words, each group of words having a predicate word, each word having generic word embeddings. The technique can include extracting, at the server, the plurality of groups of words in a syntactic context of their predicate words. The technique can include concatenating, at the server, the generic word embeddings to create a high dimensional vector space representing features for each word. The technique can include obtaining, at the server, a model having a learned mapping from the high dimensional vector space to a low dimensional vector space and learned embeddings for each possible semantic frame in the low dimensional vector space. The technique can also include outputting, by the server, the model for storage, the model being configured to identify a specific semantic frame for an input.
Abstract:
Systems, methods, and other embodiments associated with real-time text indexing are described. One example method includes receiving a document for indexing in a search system that includes a mature index and indexing the received document in a staging index. The staging index may be stored in direct access memory associated with query processing that does not degrade query performance even when postings become fragmented. The staging index and the mature text index are accessed to process queries on the search system. The example method may also include periodically merging the staging index into the mature index based on query feedback.
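The two-tier arrangement described above can be illustrated with a small in-memory sketch. The class name `TwoTierIndex`, the whitespace tokenizer, and the explicit `merge()` call are assumptions made for illustration; the abstract ties merging to query feedback, which this sketch does not model.

```python
from collections import defaultdict


class TwoTierIndex:
    """Hypothetical inverted index with a staging tier and a mature tier."""

    def __init__(self):
        self.mature = defaultdict(set)   # term -> document ids (merged)
        self.staging = defaultdict(set)  # term -> document ids (recently added)

    def add(self, doc_id, text):
        # New documents go only into the staging index, so they become
        # searchable without rewriting the mature index.
        for term in text.lower().split():
            self.staging[term].add(doc_id)

    def search(self, term):
        # Queries consult both tiers and return the union of their postings.
        term = term.lower()
        return self.mature.get(term, set()) | self.staging.get(term, set())

    def merge(self):
        # Fold staged postings into the mature index, then empty staging.
        for term, postings in self.staging.items():
            self.mature[term] |= postings
        self.staging.clear()
```

A document is searchable immediately after `add()`, and `search()` returns the same results before and after `merge()`.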
Abstract:
In one embodiment, one or more computing devices receive, from a client device of a first user, a query from the first user. The computing devices search a social graph to identify one or more nodes of the social graph that are relevant to the query. The computing devices obtain a static rank for each identified node. The static rank is based at least in part on a number of edges of a particular edge type that are connected to the node in the graph or on attributes of edges connected to the node in the graph. The computing devices send a search-results page responsive to the received query to the client device of the first user for display. The search-results page includes references to one or more nodes having a static rank greater than a threshold rank.
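A rough sketch of the ranking step follows. It assumes the simplest reading of the abstract, where a node's static rank is just the count of edges of one type attached to it; the function names, the `(source, target, edge_type)` tuple layout, and the substring match standing in for relevance are hypothetical.

```python
from collections import Counter


def static_ranks(edges, edge_type):
    """Count, per node, the edges of the given type connected to it."""
    ranks = Counter()
    for source, target, etype in edges:
        if etype == edge_type:
            ranks[source] += 1
            ranks[target] += 1
    return ranks


def search(nodes, edges, query, edge_type, threshold):
    """Return matching nodes whose static rank exceeds the threshold."""
    ranks = static_ranks(edges, edge_type)
    # Relevance is approximated by a case-insensitive substring match.
    matches = [node for node in nodes if query.lower() in node.lower()]
    kept = [node for node in matches if ranks[node] > threshold]
    # Highest static rank first.
    return sorted(kept, key=lambda node: -ranks[node])
```

With two "like" edges on one cafe node and one on another, a threshold of 1 keeps only the first cafe.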
Abstract:
Techniques are described herein for creating an algorithm for batch mode processing against big data. The techniques involve receiving one or more user commands from a set number of commands that correspond one-to-one with a set number of low-level database operations. In a preferred embodiment, the set of database operations includes only FILTERS, SORTS, AGGREGATES, and JOINS. In the algorithm formation process, database operations are performed on a sample population of records. The user drills down to a set of useful records by performing database operations against the results of the previous database operations. While the database cluster receives operations, the system tracks the operations in a dependency graph. The chains selected within the dependency graph indicate which operations are used to create the algorithm. To generate the algorithm, the database cluster reverse engineers the logic for performing those operations against big data.
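The dependency-graph idea above can be sketched in a few lines. This is an illustrative simplification, assuming each operation depends on at most one earlier operation and is represented as a Python callable over a list of records; `OperationGraph` and `run_chain` are hypothetical names, and the sketch replays the selected chain rather than reverse engineering any cluster-specific logic.

```python
class OperationGraph:
    """Hypothetical record of exploratory operations and their parents."""

    def __init__(self):
        self._nodes = {}  # op_id -> (operation, parent_id or None)

    def record(self, op_id, operation, parent=None):
        # Track each operation together with the operation whose results
        # it was applied to.
        if parent is not None and parent not in self._nodes:
            raise KeyError(f"unknown parent operation: {parent}")
        self._nodes[op_id] = (operation, parent)

    def chain(self, op_id):
        # Walk from the selected operation back to the root, then reverse
        # so the operations come out in execution order.
        operations = []
        while op_id is not None:
            operation, op_id = self._nodes[op_id]
            operations.append(operation)
        return operations[::-1]


def run_chain(records, operations):
    """Replay a selected chain of operations against a full data set."""
    for operation in operations:
        records = operation(records)
    return records
```

Operations on abandoned branches stay in the graph but are left out of the replayed chain, which contains only the steps leading to the selected result.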
Abstract:
A multi-user search system with methodology for personalized search query autocomplete. In one embodiment, for example, a method for personalized search query autocomplete includes receiving, from an end-user computing device of an authenticated user, a completion search query including a completion token; determining an identifier of an authorized document namespace the authenticated user is permitted to access; generating an index key including the authorized document namespace identifier as a prefix and the completion token as a suffix; accessing an index dictionary with the index key to identify and iterate over a plurality of prefixed index tokens until a stop condition is reached, each of the plurality of prefixed index tokens including the authorized document namespace identifier as a prefix and a respective index token as a suffix, the completion token being a prefix of or matching the respective index token; and for each prefixed index token of the plurality of prefixed index tokens, determining whether any documents identified in a postings list associated with the prefixed index token satisfy the completion query, and returning filenames of any such documents satisfying the completion query in an answer to the completion query.
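The namespace-prefixed lookup can be sketched with a sorted key list and a binary search. The details here are assumptions for illustration: the `":"` separator (namespace identifiers are assumed not to contain it), the `limit` stop condition, and a `postings` mapping that already holds filenames, which skips the per-document check against the completion query.

```python
import bisect


def complete(index_keys, postings, namespace, token, limit=10):
    """Return filenames for index tokens in `namespace` that start with `token`.

    `index_keys` must be a sorted list of "<namespace>:<index token>" strings.
    """
    prefix = f"{namespace}:{token}"
    # Jump to the first key that could match, then scan forward.
    position = bisect.bisect_left(index_keys, prefix)
    results = []
    while position < len(index_keys) and len(results) < limit:
        key = index_keys[position]
        if not key.startswith(prefix):
            break  # left the matching range, so no later key can match
        results.extend(postings[key])
        position += 1
    return results[:limit]
```

Because every key carries the namespace identifier as its prefix, the scan never visits tokens from namespaces the user is not authorized to access.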
Abstract:
Methods and systems for utilizing a database are disclosed. The methods and systems determine a key representative of a storage location of first RDF data in a NoSQL database. In addition, the methods and systems read the first RDF data in the NoSQL database using the key. The methods and systems also write second RDF data derived from the first RDF data into a second database stored in memory. The methods and systems may also modify the second RDF data, and write third RDF data derived from the modified second RDF data into the NoSQL database.