Storage-Side Scanning on Non-Natively Formatted Data
    Invention application
    Status: Under examination (published)

    Publication No.: US20150356158A1

    Publication date: 2015-12-10

    Application No.: US14733691

    Filing date: 2015-06-08

    Abstract: A storage system communicatively coupled to a DBMS performs storage-side scanning of data sources that are not stored in the native database storage format of the DBMS. Data sources for external tables are accessible in a storage system referred to herein as a distributed data access system, e.g. a Hadoop Distributed File System. To execute a query that references an external table, a DBMS first generates an execution plan. The distributed data access system supplies the DBMS with information that specifies each portion of the data source, and specifies which data node to use to access the portion. The DBMS sends a request for each portion to the respective data node, the request requesting that the data node generate rows from data in the portion. The request may specify scanning criteria, specifying one or more columns to project and/or filter on. The request may also specify code modules for the data node to execute to generate rows or records and columns.
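
The flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: every name here (`Portion`, `ScanRequest`, `data_node_scan`, `dbms_execute`, `parse_csv`) is hypothetical, the data nodes are simulated in-process, and comma-separated text stands in for the non-native format. The `parser` field plays the role of the code module that the request may name for turning raw records into columns.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Row = Dict[str, str]


@dataclass
class Portion:
    """One portion of the data source and the data node that serves it."""
    portion_id: int
    node: str          # hypothetical data node name
    records: List[str]  # raw records in a non-native format (CSV text here)


@dataclass
class ScanRequest:
    """What the DBMS sends to a data node for a single portion."""
    portion: Portion
    parser: Callable[[str], Row]            # "code module" that builds a row
    project: List[str]                      # columns to return
    predicate: Optional[Callable[[Row], bool]] = None  # filter criteria


def parse_csv(record: str) -> Row:
    # Assumed record layout for this sketch: id,name,age
    return dict(zip(["id", "name", "age"], record.split(",")))


def data_node_scan(req: ScanRequest) -> List[Row]:
    """Storage-side work: parse raw records, filter, then project."""
    rows = []
    for record in req.portion.records:
        row = req.parser(record)
        if req.predicate is None or req.predicate(row):
            rows.append({col: row[col] for col in req.project})
    return rows


def dbms_execute(portions, parser, project, predicate=None) -> List[Row]:
    """DBMS side: one request per portion, sent to that portion's node."""
    result = []
    for portion in portions:
        request = ScanRequest(portion, parser, project, predicate)
        result.extend(data_node_scan(request))  # stands in for a remote call
    return result


# Portion-to-node mapping, as the distributed data access system would supply.
portions = [
    Portion(0, "node-a", ["1,alice,30", "2,bob,17"]),
    Portion(1, "node-b", ["3,carol,45"]),
]

result = dbms_execute(
    portions,
    parser=parse_csv,
    project=["name"],
    predicate=lambda row: int(row["age"]) >= 18,
)
print(result)
```

Because filtering and projection run inside `data_node_scan`, only the qualifying `name` values travel back to the DBMS, which is the benefit that storage-side scanning is meant to provide.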

