SYSTEMS AND METHODS FOR DATA COMPRESSION
    1.
    Invention application (status: in force)

    Publication number: US20150032757A1

    Publication date: 2015-01-29

    Application number: US13951433

    Filing date: 2013-07-25

    Applicant: Facebook, Inc.

    CPC classification number: G06F17/30321

    Abstract: Event data comprising an unordered string set may be received. String set dictionary indexes may be assigned for strings of the unordered string set in a string set dictionary. The unordered string set may be sorted to provide a sorted series based on the string set dictionary indexes for the unordered string set. A differential series may be computed from the sorted series. The differential series may be encoded into binary code words. In an embodiment, the event data also may comprise strings. A schema version associated with the strings in a row may be determined. Computing resources may be allocated based on the schema version.
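
    The steps in this abstract (dictionary indexes, sort, differential series, binary code words) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, and the variable-length byte code (LEB128-style varint) is an assumed choice, since the abstract does not say which binary code is used.

```python
def assign_indexes(strings, dictionary):
    """Return the dictionary index of each string, adding unseen strings."""
    return [dictionary.setdefault(s, len(dictionary)) for s in strings]


def delta_encode(sorted_indexes):
    """Differential series: each value minus the previous one (first minus 0)."""
    deltas, prev = [], 0
    for idx in sorted_indexes:
        deltas.append(idx - prev)
        prev = idx
    return deltas


def varint(n):
    """Encode a non-negative integer in 7-bit groups, low group first (assumed code)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more groups follow
        else:
            out.append(byte)
            return bytes(out)


def compress_string_set(strings, dictionary):
    """Sort the set by dictionary index, then delta-encode and varint-encode it."""
    indexes = sorted(assign_indexes(strings, dictionary))
    return b"".join(varint(d) for d in delta_encode(indexes))


def decompress_string_set(data, dictionary):
    """Invert compress_string_set using the same dictionary."""
    reverse = {i: s for s, i in dictionary.items()}
    result, current, value, shift = set(), 0, 0, 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            current += value  # undo the differencing
            result.add(reverse[current])
            value, shift = 0, 0
    return result
```

    Sorting first keeps the differences small, so most of them fit in a single byte under this assumed code.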

    SYSTEMS AND METHODS FOR PRUNING DATA BY SAMPLING

    Publication number: US20170147615A1

    Publication date: 2017-05-25

    Application number: US15396424

    Filing date: 2016-12-31

    Applicant: Facebook, Inc.

    CPC classification number: G06F16/215 G06F16/125 G06F16/21 G06F16/24565

    Abstract: Techniques provided herein allow for management of data. In various embodiments, systems and methods prune and retain data being managed by a data management system, where the managed data can include log data aggregated from one or more servers for analysis purposes. According to some embodiments, pruning can be triggered according to one or more constraints, such as the age of managed data (e.g., retain only 30 days of managed data) or the memory space required to store the managed data (e.g., retain only 100 GB worth of managed data). The constraints that trigger data pruning can be based on a data retention policy. When triggered, pruning can be performed on a fraction of the managed data stored based on the data retention policy (e.g., 3 days of full managed data, 27 days of pruned managed data). The pruning may be performed by sampling, at a desired rate, the managed data.
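
    The retention example in this abstract (3 days of full data, 27 days of pruned data) can be sketched as a single pruning pass. The function name, the parameter names, and the default 10% sample rate are assumptions made for illustration; the abstract specifies only that pruning samples the managed data at a desired rate.

```python
import random
from datetime import datetime, timedelta


def prune(rows, now, full_days=3, max_days=30, sample_rate=0.1, rng=None):
    """Apply one pruning pass to (timestamp, payload) rows.

    Rows newer than full_days are kept in full, rows older than max_days
    are dropped, and rows in between are kept with probability sample_rate.
    """
    rng = rng or random.Random()
    kept = []
    for timestamp, payload in rows:
        age = now - timestamp
        if age > timedelta(days=max_days):
            continue  # past the retention window
        if age <= timedelta(days=full_days) or rng.random() < sample_rate:
            kept.append((timestamp, payload))
    return kept
```

    Running this pass repeatedly over the same rows would sample them again each time, so a real system would presumably need to track which rows have already been sampled; that bookkeeping is omitted here.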

    Systems and methods for data compression
    3.
    Granted patent (status: in force)

    Publication number: US09128968B2

    Publication date: 2015-09-08

    Application number: US13951433

    Filing date: 2013-07-25

    Applicant: Facebook, Inc.

    CPC classification number: G06F17/30321

    Abstract: Event data comprising an unordered string set may be received. String set dictionary indexes may be assigned for strings of the unordered string set in a string set dictionary. The unordered string set may be sorted to provide a sorted series based on the string set dictionary indexes for the unordered string set. A differential series may be computed from the sorted series. The differential series may be encoded into binary code words. In an embodiment, the event data also may comprise strings. A schema version associated with the strings in a row may be determined. Computing resources may be allocated based on the schema version.

    SYSTEMS AND METHODS FOR EFFICIENT DATA INGESTION AND QUERY PROCESSING
    4.
    Invention application (status: in force)

    Publication number: US20150032725A1

    Publication date: 2015-01-29

    Application number: US13951431

    Filing date: 2013-07-25

    Applicant: Facebook, Inc.

    Abstract: A query may be provided to aggregators at hierarchical levels in an in-memory data storage module. The query may be provided to leaf nodes of the in-memory data storage module. The leaf nodes may execute the query, returning results of the query to the aggregators. One or more aggregations may be performed based on the results. In an embodiment, log entries associated with a logged event may be serialized and divided into distributed chunks for storage in the leaf nodes. A leaf node, from the leaf nodes, having storage capacity for a distributed chunk may be identified. The distributed chunk may be stored in the leaf node.
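
    The query fan-out and chunk placement described in this abstract can be sketched with a small tree of aggregators over leaf nodes. All class and function names here are hypothetical, the query is reduced to a row count over a predicate, and capacity is measured in rows; none of these details come from the patent.

```python
class Leaf:
    """In-memory leaf node holding rows up to a fixed capacity (in rows)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.rows = []

    def has_room(self, chunk):
        return len(self.rows) + len(chunk) <= self.capacity

    def store(self, chunk):
        self.rows.extend(chunk)

    def query(self, predicate):
        # The leaf executes the query locally and returns a partial result.
        return sum(1 for row in self.rows if predicate(row))


class Aggregator:
    """Forwards a query to its children and combines their partial results."""

    def __init__(self, children):
        self.children = children

    def query(self, predicate):
        return sum(child.query(predicate) for child in self.children)


def ingest(rows, leaves, chunk_size):
    """Split rows into chunks and store each in the first leaf with capacity."""
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        leaf = next((l for l in leaves if l.has_room(chunk)), None)
        if leaf is None:
            raise RuntimeError("no leaf has capacity for this chunk")
        leaf.store(chunk)
```

    Because an aggregator exposes the same `query` method as a leaf, aggregators can be nested to form the hierarchical levels the abstract mentions.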

    Systems and methods for efficient data ingestion and query processing
    5.
    Granted patent (status: in force)

    Publication number: US09442967B2

    Publication date: 2016-09-13

    Application number: US13951431

    Filing date: 2013-07-25

    Applicant: Facebook, Inc.

    Abstract: A query may be provided to aggregators at hierarchical levels in an in-memory data storage module. The query may be provided to leaf nodes of the in-memory data storage module. The leaf nodes may execute the query, returning results of the query to the aggregators. One or more aggregations may be performed based on the results. In an embodiment, log entries associated with a logged event may be serialized and divided into distributed chunks for storage in the leaf nodes. A leaf node, from the leaf nodes, having storage capacity for a distributed chunk may be identified. The distributed chunk may be stored in the leaf node.

    SYSTEMS AND METHODS FOR PRUNING DATA BY SAMPLING
    6.
    Invention application (status: in force)

    Publication number: US20150032707A1

    Publication date: 2015-01-29

    Application number: US13951435

    Filing date: 2013-07-25

    Applicant: Facebook, Inc.

    Abstract: Techniques provided herein allow for management of data. In various embodiments, systems and methods prune and retain data being managed by a data management system, where the managed data can include log data aggregated from one or more servers for analysis purposes. According to some embodiments, pruning can be triggered according to one or more constraints, such as the age of managed data (e.g., retain only 30 days of managed data) or the memory space required to store the managed data (e.g., retain only 100 GB worth of managed data). The constraints that trigger data pruning can be based on a data retention policy. When triggered, pruning can be performed on a fraction of the managed data stored based on the data retention policy (e.g., 3 days of full managed data, 27 days of pruned managed data). The pruning may be performed by sampling, at a desired rate, the managed data.

    Systems and methods for pruning data by sampling

    Publication number: US09600503B2

    Publication date: 2017-03-21

    Application number: US13951435

    Filing date: 2013-07-25

    Applicant: Facebook, Inc.

    Abstract: Techniques provided herein allow for management of data. In various embodiments, systems and methods prune and retain data being managed by a data management system, where the managed data can include log data aggregated from one or more servers for analysis purposes. According to some embodiments, pruning can be triggered according to one or more constraints, such as the age of managed data (e.g., retain only 30 days of managed data) or the memory space required to store the managed data (e.g., retain only 100 GB worth of managed data). The constraints that trigger data pruning can be based on a data retention policy. When triggered, pruning can be performed on a fraction of the managed data stored based on the data retention policy (e.g., 3 days of full managed data, 27 days of pruned managed data). The pruning may be performed by sampling, at a desired rate, the managed data.

    Systems and methods for detecting missing data in query results
    8.
    Granted patent (status: in force)

    Publication number: US09501521B2

    Publication date: 2016-11-22

    Application number: US13951438

    Filing date: 2013-07-25

    Applicant: Facebook, Inc.

    CPC classification number: G06F17/30424

    Abstract: Techniques provided herein allow for estimating data missing in query results provided in response to queries performed on data managed by a data management system. In the event that one or more leaf nodes are unable or unavailable to process a query, a final query result provided in response to the original query may be missing data that exists on those leaf nodes. A data accounting service monitors what managed data is being stored on the leaf nodes and on what leaf node. The data accounting service can estimate how much data is missing from a final query result when one or more of the leaf nodes are unable or unavailable to process a query.
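
    The data accounting idea in this abstract can be sketched as a per-leaf row count that is consulted when some leaves fail to answer. The class name, method names, and the choice to report the estimate as a fraction of rows are assumptions for illustration only.

```python
class DataAccounting:
    """Tracks how many rows of each table are stored on each leaf node."""

    def __init__(self):
        self.rows_by_leaf = {}  # leaf id -> {table name -> row count}

    def record(self, leaf_id, table, row_count):
        counts = self.rows_by_leaf.setdefault(leaf_id, {})
        counts[table] = counts.get(table, 0) + row_count

    def estimate_missing(self, table, responding_leaves):
        """Estimate the fraction of a table's rows absent from a query result."""
        total = 0
        missing = 0
        for leaf_id, counts in self.rows_by_leaf.items():
            rows = counts.get(table, 0)
            total += rows
            if leaf_id not in responding_leaves:
                missing += rows  # this leaf's rows never reached the result
        return missing / total if total else 0.0
```

    For example, if three leaves hold 600, 300, and 100 rows of a table and only the first two respond, this sketch estimates that 10% of the data is missing from the final result.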

    SYSTEMS AND METHODS FOR DETECTING MISSING DATA IN QUERY RESULTS
    9.
    Invention application (status: in force)

    Publication number: US20150032726A1

    Publication date: 2015-01-29

    Application number: US13951438

    Filing date: 2013-07-25

    Applicant: Facebook, Inc.

    CPC classification number: G06F17/30424

    Abstract: Techniques provided herein allow for estimating data missing in query results provided in response to queries performed on data managed by a data management system. In the event that one or more leaf nodes are unable or unavailable to process a query, a final query result provided in response to the original query may be missing data that exists on those leaf nodes. A data accounting service monitors what managed data is being stored on the leaf nodes and on what leaf node. The data accounting service can estimate how much data is missing from a final query result when one or more of the leaf nodes are unable or unavailable to process a query.
