Abstract:
A combined directed graph is created having a corresponding node for each node in a first directed graph lacking a corresponding node in a second directed graph, each node in the second graph lacking a corresponding node in the first graph, and each node in the first graph having a corresponding node in the second graph. A corresponding directed arc is created in the combined directed graph for each arc in the first graph lacking a corresponding arc in the second directed graph, each arc in the second graph lacking a corresponding arc in the first graph, and each arc in the first graph having a corresponding arc in the second graph. A recommendation is output for a user to interact with a recommended object based on an object interaction and a conditional probability, in the combined graph, which corresponds to the recommended object and the object interaction.
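The combination and recommendation steps can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that each arc carries an interaction count, that counts of corresponding arcs are summed, and that the conditional probability is an arc's share of its source node's outgoing count. The names `combine` and `recommend` are hypothetical, and nodes are implied by the arcs, so isolated nodes are not represented.

```python
from collections import defaultdict

def combine(first, second):
    """Union two directed graphs given as {(src, dst): count} arc maps."""
    combined = defaultdict(int)
    for graph in (first, second):
        for arc, count in graph.items():
            combined[arc] += count  # assumption: corresponding arcs sum their counts
    return dict(combined)

def recommend(combined, interacted):
    """Return (object, P(object | interacted)) for the likeliest arc, or None."""
    out = {dst: c for (src, dst), c in combined.items() if src == interacted}
    total = sum(out.values())
    if total == 0:
        return None
    best = max(out, key=out.get)
    return best, out[best] / total

first = {("A", "B"): 3, ("A", "C"): 1}
second = {("A", "C"): 4, ("D", "A"): 2}
combined = combine(first, second)
recommendation = recommend(combined, "A")  # ("C", 0.625)
```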
Abstract:
A system and method for evaluating claims from sources to update database records. A trust score is developed for each source. If a source submits a claim, the trust score for that source and the value of the claim are evaluated against prior conflicting claims. If the current claim is deemed the most likely, then it is adopted as provisional “truth”. If not, the current claim is rejected.
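One way to picture the evaluation is the sketch below. It assumes that the support for a value is the sum of the trust scores of the sources that claimed it, and that a claim is adopted only if its support exceeds that of every conflicting value. The abstract does not specify the scoring rule, so both choices and the name `evaluate_claim` are illustrative assumptions.

```python
def evaluate_claim(trust, prior_claims, source, value):
    """Return True if `value` should be adopted as the provisional truth."""
    support = {}
    for prior_source, prior_value in prior_claims:
        support[prior_value] = support.get(prior_value, 0.0) + trust[prior_source]
    support[value] = support.get(value, 0.0) + trust[source]
    # Assumption: the claim must beat every conflicting value to be adopted.
    return all(support[value] > s for v, s in support.items() if v != value)

trust = {"alice": 0.9, "bob": 0.4, "carol": 0.3}
prior = [("bob", "555-0100"), ("carol", "555-0100")]
accepted = evaluate_claim(trust, prior, "alice", "555-0199")  # True
rejected = evaluate_claim(trust, prior, "carol", "555-0142")  # False
```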
Abstract:
The technology disclosed relates to improving parallel functional processing using abstractions and methods defined based on category theory. In particular, the technology disclosed provides a range of useful categorical functions for processing large data sets in parallel. These categorical functions manage all phases of distributed computing, including dividing a data set into subsets of approximately equal size and combining the results of the subset calculations into a final result, while hiding many of the low-level programming details. These categorical functions are well-ordered and have a sophisticated type system with type inference, which allows maps to be generated and reduced succinctly using concise, expressive programs that can significantly streamline the software development process.
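The split, compute, and combine phases can be sketched with a monoid, that is, an associative `combine` operation with an `identity` element, which is what makes the partial results safe to merge in any grouping. The function names and the use of a thread pool are assumptions for illustration and are not taken from the disclosure.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def split(data, parts):
    """Divide data into `parts` chunks whose sizes differ by at most one."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks

def map_reduce(data, mapper, combine, identity, parts=4):
    """Fold each chunk in a worker, then combine the partial results."""
    def fold(chunk):
        return reduce(combine, map(mapper, chunk), identity)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(fold, split(data, parts)))
    return reduce(combine, partials, identity)

sizes = [len(c) for c in split(list(range(10)), 4)]  # [3, 3, 2, 2]
total = map_reduce(list(range(10)), lambda x: x * x, lambda a, b: a + b, 0)  # 285
```

Because the caller supplies only `mapper`, `combine`, and `identity`, the chunking and scheduling details stay hidden, which is the kind of abstraction the abstract describes.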
Abstract:
Methods and systems are provided for evaluating standing queries against updated contact entries configured as a stream of facts. The method includes resolving the standing queries into an array of rules, each rule having a first and a second condition; sorting one of the facts into a first property and a second property; comparing the first property of the fact to the first condition of each rule in the array of rules to produce a first subset of matching rules; comparing the second property of the fact to the second condition of each rule in the first subset of rules to produce a second subset of matching rules; and reporting at least one of the second subset of rules to an author of the matching rule. The method further includes populating a first hash with indicia of the first subset, and populating a second hash with the second subset.
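The two-stage filtering can be sketched as follows, assuming that each condition is a simple equality test and that the two hashes are dictionaries keyed by rule index. The `Rule` type, the `match` function, and the sample data are hypothetical.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    author: str
    first: str   # condition on the fact's first property
    second: str  # condition on the fact's second property

def match(rules, fact):
    """Return the authors to notify for one (first, second) fact."""
    first_prop, second_prop = fact
    # First hash: indices of the rules whose first condition matches.
    first_hash = {i: r for i, r in enumerate(rules) if r.first == first_prop}
    # Second hash: rules from the first subset that also match the second condition.
    second_hash = {i: r for i, r in first_hash.items() if r.second == second_prop}
    return [r.author for r in second_hash.values()]

rules = [
    Rule("ann", "title", "CEO"),
    Rule("ben", "title", "CTO"),
    Rule("cy", "company", "Acme"),
]
notified = match(rules, ("title", "CEO"))  # ["ann"]
```

Checking the second condition only against the first subset avoids re-scanning the full rule array for every fact.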
Abstract:
Processing user-submitted updates based on user reliability scores is described. A system calculates an update score, for an update submitted by a user, based on a similarity of a field value provided by the update to corresponding field values in identified records. The system calculates a user score based on update scores, including the update score, calculated for corresponding updates submitted by the user. The system processes the update based on the user score.
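A minimal sketch of the scoring follows. It assumes string similarity from Python's `difflib.SequenceMatcher`, takes the best match among the identified records as the update score, uses the mean of a user's update scores as the user score, and applies a hypothetical threshold of 0.7. None of these choices are specified in the abstract.

```python
from difflib import SequenceMatcher

def update_score(value, record_values):
    """Best similarity (0.0 to 1.0) between the submitted value and existing values."""
    return max(
        (SequenceMatcher(None, value.lower(), v.lower()).ratio() for v in record_values),
        default=0.0,
    )

def user_score(update_scores):
    """Assumption: a user's score is the mean of that user's update scores."""
    return sum(update_scores) / len(update_scores)

def process(update_scores, threshold=0.7):
    """Apply the update for reliable users, otherwise route it to review."""
    return "apply" if user_score(update_scores) >= threshold else "review"

score = update_score("Acme Corp", ["ACME Corp", "Acme Corporation"])  # 1.0
decision = process([score, 0.9])
```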
Abstract:
Systems and methods for processing user-submitted updates based on user reliability scores. An update score is determined for an update submitted by a user based on a similarity of a field value provided by the update to corresponding field values in identified records. A user score is determined based on update scores, including the update score, determined for corresponding updates submitted by the user. The update is then processed based on the user score.
Abstract:
A system receives an association of a first item with a first system user, generates a first hash value by applying a first hash function associated with the first system user to a first item identifier associated with the first item, and sets a bit corresponding to the first hash value in an array. The system receives an association of a second item with a second system user, generates a second hash value by applying a second hash function associated with the second system user to a second item identifier associated with the second item, and sets a bit corresponding to the second hash value in the array. The system receives a request to determine whether a third item is associated with the first system user, generates a third hash value by applying the first hash function to a third item identifier associated with the third item, and outputs a message that the third item is not associated with the first system user if a bit corresponding to the third hash value is not set in the array.
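The shared array behaves like a Bloom-filter-style membership test. In the sketch below, salting a SHA-256 digest with the user identifier stands in for a per-user hash function. That substitution, the array size, and the class name `SharedBitArray` are assumptions.

```python
import hashlib

class SharedBitArray:
    def __init__(self, size=1024):
        self.size = size
        self.bits = bytearray(size)  # one byte per bit, for clarity

    def _index(self, user_id, item_id):
        # Salting with the user id stands in for a per-user hash function.
        digest = hashlib.sha256(f"{user_id}:{item_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.size

    def associate(self, user_id, item_id):
        self.bits[self._index(user_id, item_id)] = 1

    def maybe_associated(self, user_id, item_id):
        """False means definitely not associated; True means possibly associated."""
        return self.bits[self._index(user_id, item_id)] == 1

array = SharedBitArray()
before = array.maybe_associated("user-1", "item-a")  # False: no bit is set yet
array.associate("user-1", "item-a")
array.associate("user-2", "item-b")
after = array.maybe_associated("user-1", "item-a")   # True
```

An unset bit proves that the association was never recorded, while a set bit may be a collision with another user's item, so only the negative answer is definitive.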
Abstract:
The technology disclosed describes systems and methods for generating feature vectors from resource description framework (RDF) graphs. Machine learning tasks frequently operate on vectors of features. Available systems for parsing multiple documents often generate RDF graphs. Once a set of interesting features to be considered has been established, the disclosed technology describes systems and methods for generating feature vectors from the RDF graphs for the documents. In one example setting, a machine learning system can use generated feature vectors to determine how interesting a news article might be, or to learn information-of-interest about a specific subject reported in multiple articles. In another example setting, viable interview candidates for a particular job opening can be identified using feature vectors generated from a resume database, using the disclosed systems and methods for generating feature vectors from RDF graphs.
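As a rough illustration, an RDF graph can be treated as a set of (subject, predicate, object) triples, with each feature of interest being a (predicate, object) pair. The sketch assumes binary features and plain tuples rather than a real RDF library. The sample triples and the name `feature_vector` are hypothetical.

```python
def feature_vector(triples, subject, features):
    """Binary vector: 1.0 where (subject, predicate, object) is in the graph."""
    present = {(p, o) for s, p, o in triples if s == subject}
    return [1.0 if f in present else 0.0 for f in features]

triples = [
    ("doc1", "mentions", "Acme"),
    ("doc1", "topic", "merger"),
    ("doc2", "mentions", "Acme"),
]
features = [("mentions", "Acme"), ("topic", "merger"), ("topic", "lawsuit")]
vectors = {d: feature_vector(triples, d, features) for d in ("doc1", "doc2")}
# vectors["doc1"] == [1.0, 1.0, 0.0]; vectors["doc2"] == [1.0, 0.0, 0.0]
```

Fixing the feature list up front gives every document a vector of the same length and ordering, which is what downstream learning code typically expects.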
Abstract:
Systems and methods are provided for controlling access to data of heterogeneous origin. A system creates combined access rights from access rights and other access rights for combined data that includes data and other data. The system receives a request to access data that is part of the combined data. The system determines whether to provide access to at least part of the data based on access rights that are part of the combined access rights. The system provides access to at least part of the data in response to a determination to provide access to at least part of the data based on the access rights that are part of the combined access rights.
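The idea can be sketched by recording, for every field of the combined data, the access rights of the source it came from. The sketch assumes that access rights are sets of user names and that data are flat dictionaries. The abstract does not define either representation, and the function names are hypothetical.

```python
def combine(data, rights, other_data, other_rights):
    """Merge two record dicts, remembering which rights govern each field."""
    combined_data = {**data, **other_data}
    combined_rights = {field: rights for field in data}
    combined_rights.update({field: other_rights for field in other_data})
    return combined_data, combined_rights

def read(combined_data, combined_rights, user, field):
    """Return the field value if the user holds the governing right, else None."""
    if user in combined_rights.get(field, set()):
        return combined_data[field]
    return None

data, rights = combine(
    {"name": "Acme"}, {"ann", "ben"},
    {"revenue": 10}, {"ann"},
)
ben_name = read(data, rights, "ben", "name")        # "Acme"
ben_revenue = read(data, rights, "ben", "revenue")  # None: ben lacks the other source's right
```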
Abstract:
The technology disclosed relates to automatic generation of tuples from a record set for outlier analysis. Applying this new technology, a user need not specify which 1-tuples to combine into n-tuples. The tuples are generated from structured records organized into features (which could also be fields, objects, or attributes). Tuples are generated from combinations of feature values in the records. Thresholding is applied to manage the number of tuples generated. The technology disclosed further relates to indexing and searching high dimensional tuple spaces in a computer-implemented system.
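Tuple generation with thresholding can be sketched as below. As a simplification, the sketch counts every combination of feature values up to a maximum size and then discards those below a count threshold. Whether the disclosed method prunes before or after combining is not stated in the abstract, and the parameter names are assumptions.

```python
from collections import Counter
from itertools import combinations

def generate_tuples(records, max_n=2, threshold=2):
    """Count feature-value combinations up to size max_n; keep the frequent ones."""
    counts = Counter()
    for record in records:
        items = sorted(record.items())  # stable order so equal tuples compare equal
        for n in range(1, max_n + 1):
            counts.update(combinations(items, n))
    return {t: c for t, c in counts.items() if c >= threshold}

records = [
    {"city": "SF", "title": "CEO"},
    {"city": "SF", "title": "CEO"},
    {"city": "NY", "title": "CEO"},
]
frequent = generate_tuples(records)
# Keeps (city=SF), (title=CEO), and (city=SF, title=CEO); drops the NY tuples.
```

Tuples that fall below the threshold, such as those involving `NY` here, are the kind of rare combinations an outlier analysis might then examine separately.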