-
1.
Publication No.: US20130152088A1
Publication date: 2013-06-13
Application No.: US13324202
Filing date: 2011-12-13
Applicant: Christos Gkantsidis, Dimitrios Vytiniotis, Orion Hodson, Dushyanth Narayanan, Antony Rowstron
Inventor: Christos Gkantsidis, Dimitrios Vytiniotis, Orion Hodson, Dushyanth Narayanan, Antony Rowstron
CPC classification number: G06F17/30386, G06F9/54, G06F2209/542
Abstract: Methods of generating filters automatically from data processing jobs are described. In an embodiment, these filters are automatically generated from a compiled version of the data processing job using static analysis which is applied to a high-level representation of the job. The executable filter is arranged to suppress rows and/or columns within the data to which the job is applied and which do not affect the output of the job. The filters are generated by a filter generator and then stored and applied dynamically at a filtering proxy that may be co-located with the storage node that holds the data. In another embodiment, the filtered data may be cached close to a compute node which runs the job and data may be provided to the compute node from the local cache rather than from the filtering proxy.
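Illustrative sketch (not part of the patent text): the abstract describes an executable filter that suppresses rows and columns that cannot affect a job's output. The Python fragment below is a minimal illustration of that idea only. The job, the column names, and the helper `make_filter` are all hypothetical, and the row predicate and column set are written by hand here, whereas the patent derives them automatically by static analysis of the compiled job.

```python
from typing import Any, Callable, Iterable, Iterator

Row = dict[str, Any]


def job(rows: Iterable[Row]) -> dict[str, int]:
    """Hypothetical data processing job: total bytes per user, error records only."""
    totals: dict[str, int] = {}
    for row in rows:
        if row["status"] == "error":
            totals[row["user"]] = totals.get(row["user"], 0) + row["bytes"]
    return totals


def make_filter(
    row_predicate: Callable[[Row], bool],
    columns: tuple[str, ...],
) -> Callable[[Iterable[Row]], Iterator[Row]]:
    """Build a filter that drops non-matching rows and unused columns.

    Assumption: the predicate and column set are supplied by hand; a real
    filter generator would infer them from the job itself.
    """
    def apply(rows: Iterable[Row]) -> Iterator[Row]:
        for row in rows:
            if row_predicate(row):
                # Keep only the columns the job actually reads.
                yield {name: row[name] for name in columns}
    return apply


# The job reads only "status", "user" and "bytes", and ignores non-error rows.
job_filter = make_filter(
    lambda row: row["status"] == "error",
    ("status", "user", "bytes"),
)

# Made-up sample data standing in for what a storage node would hold.
RAW = [
    {"user": "alice", "status": "error", "bytes": 120, "url": "/a", "agent": "x"},
    {"user": "bob", "status": "ok", "bytes": 300, "url": "/b", "agent": "y"},
    {"user": "alice", "status": "error", "bytes": 80, "url": "/c", "agent": "x"},
    {"user": "carol", "status": "ok", "bytes": 50, "url": "/d", "agent": "z"},
]

# In the described system this step would run at the filtering proxy.
filtered = list(job_filter(RAW))

# The filter is only valid if it leaves the job's output unchanged.
assert job(filtered) == job(RAW)
```

In this sketch the compute node would receive two narrow rows instead of four wide ones, while the job result stays the same; that equality is the correctness condition the abstract implies for any generated filter.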
-
Publication No.: US09817860B2
Publication date: 2017-11-14
Application No.: US13324202
Filing date: 2011-12-13
Applicant: Christos Gkantsidis, Dimitrios Vytiniotis, Orion Hodson, Dushyanth Narayanan, Antony Rowstron
Inventor: Christos Gkantsidis, Dimitrios Vytiniotis, Orion Hodson, Dushyanth Narayanan, Antony Rowstron
CPC classification number: G06F17/30386, G06F9/54, G06F2209/542
Abstract: Methods of generating filters automatically from data processing jobs are described. In an embodiment, these filters are automatically generated from a compiled version of the data processing job using static analysis which is applied to a high-level representation of the job. The executable filter is arranged to suppress rows and/or columns within the data to which the job is applied and which do not affect the output of the job. The filters are generated by a filter generator and then stored and applied dynamically at a filtering proxy that may be co-located with the storage node that holds the data. In another embodiment, the filtered data may be cached close to a compute node which runs the job and data may be provided to the compute node from the local cache rather than from the filtering proxy.
-