INVALID TRAFFIC DETECTION USING EXPLAINABLE UNSUPERVISED GRAPH ML

    Publication number: US20230199026A1

    Publication date: 2023-06-22

    Application number: US17558342

    Filing date: 2021-12-21

    CPC classification number: H04L63/1483 H04L63/1425 G06N20/00

    Abstract: Herein are graph machine learning explainability (MLX) techniques for invalid traffic detection. In an embodiment, a computer generates a graph that contains: a) domain vertices that represent network domains that received requests and b) address vertices that respectively represent network addresses from which the requests originated. Based on the graph, domain embeddings are generated that respectively encode the domain vertices. Based on the domain embeddings, multidomain embeddings are generated that respectively encode the network addresses. The multidomain embeddings are organized into multiple clusters, and a particular cluster is detected as suspicious. In an embodiment, a graph model trained without supervision generates the multidomain embeddings. Based on the clusters of multidomain embeddings, feature importances are learned through unsupervised training. Based on the feature importances, an explanation is automatically generated for why an object is or is not suspicious. The explained object may be a cluster or other batch of network addresses, or a single network address.
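The pipeline in this abstract (graph, domain embeddings, multidomain embeddings, clusters, suspicious cluster) can be sketched roughly as follows. This is a minimal illustration, not the patented method: the co-visit-count embedding, the greedy cosine clustering, the `flag_suspicious` heuristic, and every name and sample address below are assumptions standing in for the trained graph model and detection logic, which the abstract does not specify.

```python
from collections import defaultdict
from math import sqrt


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def build_graph(requests):
    """Bipartite graph: each address vertex maps to the domain vertices it requested."""
    graph = defaultdict(set)
    for address, domain in requests:
        graph[address].add(domain)
    return graph


def domain_embeddings(graph):
    """Assumed stand-in for a graph model: normalized co-visit counts per domain."""
    domains = sorted({d for visited in graph.values() for d in visited})
    index = {d: i for i, d in enumerate(domains)}
    vecs = {d: [0.0] * len(domains) for d in domains}
    for visited in graph.values():
        for a in visited:
            for b in visited:
                vecs[a][index[b]] += 1.0
    return {d: normalize(v) for d, v in vecs.items()}


def multidomain_embeddings(graph, dom_emb):
    """Encode each address as the normalized mean of its domains' embeddings."""
    out = {}
    for address, visited in graph.items():
        vecs = [dom_emb[d] for d in sorted(visited)]
        out[address] = normalize([sum(col) / len(vecs) for col in zip(*vecs)])
    return out


def cluster(embeddings, threshold=0.9):
    """Greedy clustering: join the first cluster whose seed has cosine >= threshold."""
    clusters = []  # list of (seed_vector, member_addresses)
    for address in sorted(embeddings):
        vec = embeddings[address]
        for seed, members in clusters:
            if sum(a * b for a, b in zip(seed, vec)) >= threshold:
                members.append(address)
                break
        else:
            clusters.append((vec, [address]))
    return [members for _, members in clusters]


def flag_suspicious(clusters, graph, min_size=3):
    """Hypothetical heuristic: large clusters whose members hit identical domain sets."""
    return [c for c in clusters
            if len(c) >= min_size and len({frozenset(graph[a]) for a in c}) == 1]


# Made-up sample traffic: three addresses with identical behaviour, two ordinary ones.
requests = [
    ("10.0.0.1", "ads-a.example"), ("10.0.0.1", "ads-b.example"),
    ("10.0.0.2", "ads-a.example"), ("10.0.0.2", "ads-b.example"),
    ("10.0.0.3", "ads-a.example"), ("10.0.0.3", "ads-b.example"),
    ("192.0.2.1", "news.example"), ("192.0.2.1", "shop.example"),
    ("192.0.2.2", "news.example"),
]
graph = build_graph(requests)
clusters = cluster(multidomain_embeddings(graph, domain_embeddings(graph)))
suspicious = flag_suspicious(clusters, graph)
```

The explainability step (feature importances derived from the clusters) is left out here because the abstract gives no detail on how those importances are trained.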

    Dynamic asynchronous traversals for distributed graph queries

    Publication number: US11675785B2

    Publication date: 2023-06-13

    Application number: US16778668

    Filing date: 2020-01-31

    CPC classification number: G06F16/24526 G06F16/2471

    Abstract: Techniques are described for enabling in-memory execution of graph data queries of any size by utilizing both depth first search (DFS) principles and breadth first search (BFS) principles to control the amount of memory used during query execution. Specifically, threads implementing a graph DBMS switch between a BFS mode of data traversal and a DFS mode of data traversal. For example, when a thread detects that there are fewer than a configurable threshold number of intermediate results in memory, the thread enters BFS mode to increase the number of intermediate results in memory. When the thread detects that there are at least the configurable threshold number of intermediate results in memory, the thread enters DFS mode to produce final results, which generally moves the intermediate results currently available in memory toward final query results, thereby reducing the number of intermediate results in memory.
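The mode switch described above can be illustrated with a single-threaded sketch that enumerates fixed-length paths. It is a simplification under stated assumptions: the function name `hybrid_paths`, the adjacency-dict graph, and the choice of "shallowest partial path" for BFS mode and "deepest partial path" for DFS mode are illustrative, and the distributed, multi-threaded aspects of the patent are not modelled.

```python
def hybrid_paths(adj, starts, hops, threshold):
    """Enumerate every path of `hops` edges, switching between BFS and DFS modes.

    `pending` holds the intermediate results (partial paths) kept in memory.
    Returns the final results and the peak size of `pending`.
    """
    if hops < 1:
        raise ValueError("hops must be at least 1")
    pending = [(s,) for s in starts]
    results = []
    peak = 0
    while pending:
        peak = max(peak, len(pending))
        if len(pending) < threshold:
            # BFS mode: expand the shallowest partial path to grow the buffer.
            i = min(range(len(pending)), key=lambda k: len(pending[k]))
        else:
            # DFS mode: expand the deepest partial path to drain toward final results.
            i = max(range(len(pending)), key=lambda k: len(pending[k]))
        path = pending.pop(i)
        for nxt in adj.get(path[-1], ()):
            extended = path + (nxt,)
            if len(extended) == hops + 1:
                results.append(extended)   # complete match: a final result
            else:
                pending.append(extended)   # still an intermediate result
    return results, peak


# Small made-up graph; find all 3-hop paths starting at vertex 0.
adj = {0: [1, 2], 1: [3, 4], 2: [4], 3: [5], 4: [5]}
paths, peak = hybrid_paths(adj, starts=[0], hops=3, threshold=2)
```

Whatever the threshold, the same set of paths is produced; only the order of expansion and the number of partial paths held at once change, which is the memory-control knob the abstract refers to.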

    Method for generic vectorized d-heaps

    Publication number: US11379232B2

    Publication date: 2022-07-05

    Application number: US16399226

    Filing date: 2019-04-30

    Abstract: Techniques are provided for obtaining generic vectorized d-heaps for any data type for which horizontal aggregation SIMD instructions are not available, including primitive as well as complex data types. A generic vectorized d-heap comprises a prefix heap and a plurality of suffix heaps. Each suffix heap of the plurality of suffix heaps comprises a d-heap. A plurality of key values stored in the heap are split into key prefix values and key suffix values. Key prefix values are stored in the prefix heap and key suffix values are stored in the plurality of suffix heaps. Each entry in the prefix heap includes a key prefix value of the plurality of key values and a reference to the suffix heap of the plurality of suffix heaps that includes all key suffix values of the plurality of key values that share the respective key prefix value.
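The prefix-heap and suffix-heap layout can be sketched as below. This shows only the data-structure layout, under several assumptions: keys are non-negative integers split into high bits (prefix) and low 16 bits (suffix), Python's `heapq` binary heap stands in for a d-heap, and no SIMD vectorization is involved. The class name `PrefixSuffixHeap` and the auxiliary lookup dict are hypothetical.

```python
import heapq


class PrefixSuffixHeap:
    """Min-heap of integer keys stored as one prefix heap plus per-prefix suffix heaps."""

    def __init__(self, suffix_bits=16):
        self.suffix_bits = suffix_bits
        self.mask = (1 << suffix_bits) - 1
        self.prefix_heap = []   # entries: (key prefix, reference to its suffix heap)
        self.by_prefix = {}     # helper index: key prefix -> suffix heap

    def push(self, key):
        prefix, suffix = key >> self.suffix_bits, key & self.mask
        heap = self.by_prefix.get(prefix)
        if heap is None:
            # First key with this prefix: create its suffix heap and register it.
            heap = self.by_prefix[prefix] = []
            heapq.heappush(self.prefix_heap, (prefix, heap))
        heapq.heappush(heap, suffix)

    def pop(self):
        """Remove and return the smallest key (raises IndexError when empty)."""
        prefix, heap = self.prefix_heap[0]
        suffix = heapq.heappop(heap)
        if not heap:
            # Last suffix for this prefix: drop the prefix entry as well.
            heapq.heappop(self.prefix_heap)
            del self.by_prefix[prefix]
        return (prefix << self.suffix_bits) | suffix

    def __len__(self):
        return sum(len(h) for h in self.by_prefix.values())


h = PrefixSuffixHeap()
for key in [0x00030005, 0x00010009, 0x00010002, 0x00020000]:
    h.push(key)
smallest = h.pop()  # smallest prefix is 0x0001, and its smallest suffix is 0x0002
```

Because prefixes in the prefix heap are distinct, comparing `(prefix, heap)` tuples never falls through to comparing the suffix lists themselves.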

    Method for fast and consistent invocation of concurrently modifiable user-defined functions

    Publication number: US10990594B2

    Publication date: 2021-04-27

    Application number: US15971664

    Filing date: 2018-05-04

    Abstract: Database techniques are provided that use state machines to manage polyglot subroutine bindings for database commands. In an embodiment, a computer receives a database command that contains call sites (CSs). Each CS is associated with a user defined logic (UDL). The computer associates an initial operational state with each of the CSs. During a first invocation of a particular CS, the CS becomes initialized and transitions to an optimized state that is configured for streamlined invocation of the UDL. The UDL is invoked to contribute data to a partial result for the database command. Eventually, command execution stalls, which causes the CS to transition to an unready state and release shared resources. Later, execution resumes, and during another invocation of the CS, resources are reacquired and the CS is made ready and transitioned back to the optimized state. The CS may then be invoked repeatedly while remaining in the optimized state.
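The call-site life cycle described above can be modelled as a small state machine. This is a hedged sketch only: the state names, the `CallSite` class, the `stall` method, and the placeholder "resource" object are assumptions for illustration, and nothing here reflects how a real database binds polyglot user-defined logic.

```python
from enum import Enum, auto


class State(Enum):
    INITIAL = auto()    # associated with the call site before its first invocation
    OPTIMIZED = auto()  # resources held, streamlined invocation of the UDL
    UNREADY = auto()    # execution stalled, shared resources released


class CallSite:
    def __init__(self, udl):
        self.udl = udl            # the user defined logic, here any Python callable
        self.state = State.INITIAL
        self.resources = None
        self.acquisitions = 0     # counts how often resources were (re)acquired

    def _acquire(self):
        self.resources = object()  # placeholder for shared resources
        self.acquisitions += 1

    def invoke(self, *args):
        """Invoke the UDL, initializing or re-readying the call site if needed."""
        if self.state is not State.OPTIMIZED:
            self._acquire()
            self.state = State.OPTIMIZED
        return self.udl(*args)

    def stall(self):
        """Command execution stalled: release resources and become unready."""
        if self.state is State.OPTIMIZED:
            self.resources = None
            self.state = State.UNREADY


site = CallSite(lambda x: x * 2)
first = site.invoke(3)    # INITIAL -> OPTIMIZED, resources acquired once
second = site.invoke(4)   # stays OPTIMIZED, no reacquisition
site.stall()              # OPTIMIZED -> UNREADY, resources released
third = site.invoke(5)    # UNREADY -> OPTIMIZED, resources reacquired
```

Only the slow path (`INITIAL` or `UNREADY`) pays for resource acquisition; repeated invocations in the optimized state go straight to the UDL.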
