JACCARD SIMILARITY ESTIMATION OF WEIGHTED SAMPLES: SCALING AND RANDOMIZED ROUNDING SAMPLE SELECTION WITH CIRCULAR SMEARING

    Publication No.: US20200019814A1

    Publication Date: 2020-01-16

    Application No.: US16579706

    Application Date: 2019-09-23

    Inventor: Mark Manasse

    Abstract: The disclosed systems and methods include pre-calculation, per object, of object feature bin values, for identifying close matches between objects, such as text documents, that have numerous weighted features, such as specific-length word sequences. Predetermined feature weights get scaled with two or more selected adjacent scaling factors, and randomly rounded. The expanded set of weighted features of an object gets min-hashed into a predetermined number of feature bins. For each feature that qualifies to be inserted by min-hashing into a particular feature bin, and across successive feature bins, the expanded set of weighted features gets min-hashed and circularly smeared into the predetermined number of feature bins. Completed pre-calculated sets of feature bin values for each scaling of the object, together with the scaling factor, are stored for use in comparing sampled features of the object with sampled features of other objects by calculating an estimated Jaccard similarity index.
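
The general idea of scaling, randomized rounding, and min-hashing into bins can be illustrated with a minimal Python sketch. It is a simplification and not the patented method: it uses a single scaling factor rather than two or more adjacent ones, it computes one independent minimum per bin, and it omits circular smearing entirely. The helper names (`h`, `expand`, `sketch`, `estimate_jaccard`), the BLAKE2b hash, and the sample weights are all assumptions made for illustration.

```python
import hashlib


def h(*parts):
    """Deterministic 64-bit hash of the given parts (hash choice is an assumption)."""
    data = "\x1f".join(str(p) for p in parts).encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def expand(weights, scale):
    """Scale each weight, round it randomly to an integer, and emit that many replicas."""
    expanded = set()
    for feature, weight in weights.items():
        scaled = weight * scale
        count = int(scaled)
        frac = scaled - count
        # Round up with probability `frac`. The coin flip is keyed on the
        # feature so that every object rounds the same feature consistently.
        if h("round", feature) / 2**64 < frac:
            count += 1
        expanded.update((feature, i) for i in range(count))
    return expanded


def sketch(weights, scale, bins=128):
    """One min-hash value per bin over the expanded feature set."""
    items = expand(weights, scale)
    return [min((h(b, f, i) for f, i in items), default=None) for b in range(bins)]


def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of bins whose min-hash values agree."""
    matches = sum(x == y for x, y in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


# Hypothetical documents: word-sequence features with predetermined weights.
doc_a = {"the quick brown": 1.5, "quick brown fox": 2.0, "brown fox jumps": 0.7}
doc_b = {"the quick brown": 1.5, "quick brown fox": 1.0, "fox jumps over": 0.9}
doc_c = {"lorem ipsum dolor": 1.0, "ipsum dolor sit": 2.0}

similar = estimate_jaccard(sketch(doc_a, 2), sketch(doc_b, 2))
identical = estimate_jaccard(sketch(doc_a, 2), sketch(doc_a, 2))
disjoint = estimate_jaccard(sketch(doc_a, 2), sketch(doc_c, 2))
```

Because the sketches are computed per object ahead of time, comparing two objects only requires comparing their stored bin values, which is the property the abstract relies on.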

    IDENTIFYING RECURRING SEQUENCES OF USER INTERACTIONS WITH AN APPLICATION

    Publication No.: US20200019544A1

    Publication Date: 2020-01-16

    Application No.: US16580846

    Application Date: 2019-09-24

    Inventor: Sönke Rohde

    Abstract: Disclosed are database systems, computing devices, methods, and computer program products for identifying recurring sequences of user interactions with an application. In some implementations, a server of a database system provides a user interface of the application for display at a computing device. The database system stores data objects identifying a first plurality of user interactions with the application. The server receives information representing a second plurality of user interactions with the application. The server updates the database system to further identify the second user interactions. The server identifies a recurring sequence of user interactions from the first and second user interactions as resulting in a first target state of the application. The server updates the database system to associate the recurring sequence of user interactions with the first target state of the application.

    SYSTEM AND METHOD TO NAVIGATE 3D DATA ON MOBILE AND DESKTOP

    Publication No.: US20200019296A1

    Publication Date: 2020-01-16

    Application No.: US16579695

    Application Date: 2019-09-23

    Abstract: Disclosed is a system and method to navigate high-dimensional data in order to enhance the analytical capabilities of a data consumer. The technology disclosed can use a stereoscopic 3D viewer with a smartphone, a pair of 3D glasses with desktop, or a projected 3D image on a table. The solution provides a novel and accessible way of navigating high-dimensional data that has been organized into groups of two or three dimensions. Navigation is possible in all 3 dimensions (x, y, z) to explore the full potential of underlying data and elicit powerful insights to be acted on through informed decisions.

    Client fingerprinting for information system security

    Publication No.: US10536439B2

    Publication Date: 2020-01-14

    Application No.: US15589220

    Application Date: 2017-05-08

    Abstract: Client fingerprints can be used to detect and defend against malware and hacking into information systems more effectively than using IP addresses. A unique client fingerprint can be based on data found in the client's SSL client hello packet. SSL version, cipher suites, and other fields of the packet can be utilized, preferably utilizing individual field values in the order in which they appear in the packet. The ordered values are converted to decimal values, separated by delimiters, and concatenated to form an identifier string. The identifier string may be mapped, preferably by a hash function, to form the client fingerprint. The client fingerprint may be logged, and whitelists and blacklists may be formed using client fingerprints so formed.
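
The fingerprint construction described in this abstract can be sketched in a few lines of Python. This is a hedged illustration only: the function name `client_fingerprint`, the choice of `-` and `,` as delimiters, the use of SHA-256 as the hash function, and the sample field values are all assumptions, since the abstract does not specify them.

```python
import hashlib


def client_fingerprint(fields, value_delim="-", field_delim=","):
    """Build an identifier string from ordered client hello fields and hash it.

    `fields` is a list of fields in packet order; each field is a list of
    integer values in the order they appear in the packet.
    """
    # Decimal values within a field are joined by one delimiter,
    # and the fields themselves by another (both delimiters are assumptions).
    parts = [value_delim.join(str(v) for v in values) for values in fields]
    identifier = field_delim.join(parts)
    # Map the identifier string to a fixed-length fingerprint.
    digest = hashlib.sha256(identifier.encode("ascii")).hexdigest()
    return identifier, digest


# Illustrative values only, not taken from a real capture.
hello_fields = [
    [771],                # protocol version (0x0303)
    [4865, 4866, 49195],  # cipher suites, in offered order
    [0, 10, 11],          # other fields, e.g. extension types
]
identifier, fingerprint = client_fingerprint(hello_fields)

# Reordering the cipher suites yields a different fingerprint, which is why
# the abstract prefers keeping values in packet order.
_, reordered = client_fingerprint([[771], [49195, 4866, 4865], [0, 10, 11]])
```

A fingerprint produced this way can be logged or checked against a whitelist or blacklist with a plain set lookup.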

    DISTRIBUTED DATA PROCESSING IN MULTI-TENANT ENVIRONMENTS

    Publication No.: US20200004858A1

    Publication Date: 2020-01-02

    Application No.: US16024264

    Application Date: 2018-06-29

    Abstract: Methods, systems, and devices for data processing within a distributed data system are described. In a multi-tenant distributed data system, a provider may supply executable code for processing data using declarative processing instructions received from a tenant. For example, a tenant may provide tenant-specific processing instructions for a requested set of data. The processing instructions may indicate input information (e.g., a data structure, tenant-specific fields, etc.), transformation information (e.g., from a set of pre-defined transformations), and output information. The provider-supplied code may use the tenant-specific processing instructions to process and generate the requested set of data, where the code may be executed by multiple nodes within the system. As such, the code executed by multiple nodes may utilize the input information, transformation information, and output information from the tenant-specific processing instructions to generate the requested data and provide the data to the tenant.

    MULTI-MASTER DATA REPLICATION IN A DISTRIBUTED MULTI-TENANT SYSTEM

    Publication No.: US20200004734A1

    Publication Date: 2020-01-02

    Application No.: US16566613

    Application Date: 2019-09-10

    Abstract: A multi-master replication system is disclosed. The multi-master replication system allows a large set of peer instances to collaboratively replicate data to each other. According to an example, a change detection thread running on a first server associated with a first instance of multiple instances of a replicated database monitors for changes to any of multiple records within one or more shared tables of the replicated database. Responsive to detection of a change to a record, an item is stored by the change detection thread onto a queue containing information regarding the change. Groups of changes are packaged into multiple chunks, in which each chunk (i) corresponds to a discrete unit of progress for both change detection and transport; (ii) is associated with multiple changed records; (iii) contains metadata about the multiple changed records; and (iv) does not contain data from the one or more shared tables.

    USER DEVICE VALIDATION AT AN APPLICATION SERVER

    Publication No.: US20190394042A1

    Publication Date: 2019-12-26

    Application No.: US16015768

    Application Date: 2018-06-22

    Inventor: Prasad Peddada

    Abstract: Methods, systems, and devices for validation at an application server are described. The application server may validate a user device utilizing a public-private key pair, and may refrain from establishing a database connection until the user device is validated. For example, the application server may transmit a private key and a public key identifier to the user device. When the application server receives a session establishment message that is based on a private key and that contains the public key identifier, the application server may determine the public key of the public-private key pair based on the identifier. The application server may validate that the session establishment message is received from the user device based on the private key and the determined public key. Based on this validation procedure, the application server may establish a database connection with a database, granting the validated user device access to requested data.

    Systems, methods, and apparatuses for implementing a stateless, deterministic scheduler and work discovery system with interruption recovery

    Publication No.: US10514951B2

    Publication Date: 2019-12-24

    Application No.: US15587161

    Application Date: 2017-05-04

    Abstract: In accordance with disclosed embodiments, there are provided systems, methods, and apparatuses for implementing a stateless, deterministic scheduler and work discovery system with interruption recovery. For instance, according to one embodiment, there is disclosed a system to implement a stateless scheduler service, in which the system includes: a processor and a memory to execute instructions at the system; a compute resource discovery engine to identify one or more computing resources available to execute workload tasks; a workload discovery engine to identify a plurality of workload tasks to be scheduled for execution; a local cache to store information on behalf of the compute resource discovery engine and the workload discovery engine; a scheduler to request information from the local cache specifying the one or more computing resources available to execute workload tasks and the plurality of workload tasks to be scheduled for execution; and further in which the scheduler is to schedule at least a portion of the plurality of workload tasks for execution via the one or more computing resources based on the information requested. Other related embodiments are disclosed.
