Data compression for columnar databases into arbitrarily-sized persistent pages

    Publication No.: US12119845B2

    Publication Date: 2024-10-15

    Application No.: US18483824

    Filing Date: 2023-10-10

    Applicant: SAP SE

    Inventor: Ivan Schreter

    IPC Classification: H03M7/30 G06F16/22 H03M7/40

    Abstract: A method for compressing columnar data may include generating, for a data column included in a data chunk, a dictionary enumerating, in a sorted order, a first set of unique values included in the data column. A compression technique for generating a compressed representation of the data column having a fewest quantity of bytes may be identified based at least on the dictionary. The compression technique may include a dictionary compression that applies the dictionary and/or another compression technique. A compressed data chunk may be generated by applying the compression technique to compress the data column included in the data chunk. The compressed data chunk may be stored at a database in a variable-size persistent page whose size is allocated based on the size of the compressed representation of the data column. Related systems and articles of manufacture are also provided.
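
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a column of 32-bit integers, uses an uncompressed "raw" encoding as a stand-in for the unspecified "another compression technique", and all function names (`build_dictionary`, `encode_raw`, `encode_dictionary`, `choose_compression`, `allocate_page`) are hypothetical.

```python
import struct


def build_dictionary(column):
    """Return the unique values of the column in sorted order."""
    return sorted(set(column))


def encode_raw(column):
    """Assumed baseline: every value stored as a 4-byte little-endian int."""
    return struct.pack(f"<{len(column)}i", *column)


def encode_dictionary(column, dictionary):
    """Store the dictionary once, then one fixed-width index per value."""
    index_of = {value: i for i, value in enumerate(dictionary)}
    # Smallest whole number of bytes that can hold the largest index.
    width = max(1, ((len(dictionary) - 1).bit_length() + 7) // 8)
    header = struct.pack("<I", len(dictionary))
    entries = struct.pack(f"<{len(dictionary)}i", *dictionary)
    ids = b"".join(index_of[v].to_bytes(width, "little") for v in column)
    return header + entries + ids


def choose_compression(column):
    """Encode the column each way and keep the result with the fewest bytes."""
    dictionary = build_dictionary(column)
    candidates = {
        "raw": encode_raw(column),
        "dictionary": encode_dictionary(column, dictionary),
    }
    name = min(candidates, key=lambda k: len(candidates[k]))
    return name, candidates[name]


def allocate_page(payload):
    """Assumed page model: a buffer sized to the payload, not a fixed block."""
    return bytearray(payload)


if __name__ == "__main__":
    column = [7, 7, 3, 7, 3, 9, 7, 3] * 100  # 800 values, 3 distinct
    name, payload = choose_compression(column)
    page = allocate_page(payload)
    print(name, len(page))
```

In this sketch a column with few distinct values favors the dictionary encoding, while a column of all-distinct values favors the raw encoding because the dictionary adds overhead without saving anything. The page is then sized from whichever representation was chosen.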