INFERRING A DATASET SCHEMA FROM INPUT FILES
    1.
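    Invention publication

    Publication (announcement) No.: US20240184754A1

    Publication (announcement) date: 2024-06-06

    Application No.: US18438301

    Filing date: 2024-02-09

    Inventors: Nir Ackner; Eric Lin

    IPC classification: G06F16/21 G06F3/06 G06F40/205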

    Abstract: A method comprises: selecting a sample excerpt from a data input file; in response to determining that a first row in the sample excerpt does not contain a delimited value and that a second row does contain a delimited value, determining that the first row consists of header data; identifying one or more jagged rows based on row delimiters that were erroneously placed; causing display of the text that led to creation of a jagged row; receiving an addition or removal of a specific row delimiter in the text; updating the sample excerpt based on the addition or the removal; analyzing the sample excerpt to determine a row delimiter for the data input file; identifying a plurality of rows that is not included in the header data; identifying a plurality of candidate column delimiters; and generating a candidate schema for the data input file.
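
The header and jagged-row steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the function names (split_header, find_jagged_rows, remove_row_delimiter), the comma delimiter, and the sample text are all assumptions made for illustration, and "jagged" is approximated here as "field count differs from the most common field count".

```python
from collections import Counter


def split_header(rows, delimiter=","):
    """Treat leading rows without the delimiter as header data (assumed heuristic)."""
    for i, row in enumerate(rows):
        if delimiter in row:
            return rows[:i], rows[i:]
    return rows, []


def find_jagged_rows(rows, delimiter=","):
    """Return indices of rows whose field count differs from the most common count."""
    counts = [len(row.split(delimiter)) for row in rows]
    if not counts:
        return []
    expected = Counter(counts).most_common(1)[0][0]
    return [i for i, n in enumerate(counts) if n != expected]


def remove_row_delimiter(rows, index):
    """Join row `index` with the next row, i.e. remove the row delimiter between them."""
    return rows[:index] + [rows[index] + rows[index + 1]] + rows[index + 2:]


# Hypothetical sample excerpt: a stray newline splits the record "2,Bob,Paris".
sample = "Quarterly export\nid,name,city\n1,Ann,Oslo\n2,Bob\n,Paris\n3,Cy,Rome"
header, data = split_header(sample.split("\n"))
jagged = find_jagged_rows(data)
repaired = remove_row_delimiter(data, jagged[0])
```

In this sketch the user's correction is simulated by the call to remove_row_delimiter; in the abstract, the correction is received after the offending text is displayed.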

    Inferring a dataset schema from input files

    Publication (announcement) No.: US10204119B1

    Publication (announcement) date: 2019-02-12

    Application No.: US15654952

    Filing date: 2017-07-20

    Inventors: Nir Ackner; Eric Lin

    IPC classification: G06F17/30 G06F17/27 G06F3/06

    Abstract: Techniques for generating a schema for a data input file are described herein. In an embodiment, a server computer system receives a data input file. The server computer system selects from the data input file a sample excerpt that comprises a subset of the file. The server computer system analyzes the sample excerpt to determine a row delimiter for the data input file, a column delimiter for the data input file, and a plurality of data format types. Using the column delimiter, row delimiter, and plurality of data format types, the server computer system generates a candidate schema for the data input file.
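
As a rough illustration of the pipeline this abstract describes (row delimiter, column delimiter, data format types, candidate schema), here is a minimal Python sketch. The candidate delimiter lists, the three type parsers, the ISO date format, and the function names are assumptions chosen for the example; the abstract does not specify them.

```python
from collections import Counter
from datetime import datetime

ROW_DELIMITERS = ["\r\n", "\n", "\r"]      # assumed candidates
COLUMN_DELIMITERS = [",", "\t", ";", "|"]  # assumed candidates


def infer_row_delimiter(sample):
    # "\r\n" is checked first so it is not mistaken for a bare "\n".
    for delim in ROW_DELIMITERS:
        if delim in sample:
            return delim
    return "\n"


def infer_column_delimiter(rows):
    """Pick the candidate that gives the most rows the same field count (> 1)."""
    best, best_hits = None, 0
    for delim in COLUMN_DELIMITERS:
        width, hits = Counter(len(r.split(delim)) for r in rows).most_common(1)[0]
        if width > 1 and hits > best_hits:
            best, best_hits = delim, hits
    return best


def infer_type(values):
    """Return the first type whose parser accepts every value, else 'string'."""
    parsers = (
        ("integer", int),
        ("float", float),
        ("date", lambda v: datetime.strptime(v, "%Y-%m-%d")),
    )
    for name, parse in parsers:
        try:
            for value in values:
                parse(value)
            return name
        except ValueError:
            continue
    return "string"


def candidate_schema(sample):
    """Build a candidate schema from a sample whose first row holds column names."""
    row_delim = infer_row_delimiter(sample)
    rows = [r for r in sample.split(row_delim) if r]
    col_delim = infer_column_delimiter(rows)
    table = [r.split(col_delim) if col_delim else [r] for r in rows]
    names, body = table[0], table[1:]
    types = [infer_type(column) for column in zip(*body)]
    return {
        "row_delimiter": row_delim,
        "column_delimiter": col_delim,
        "columns": list(zip(names, types)),
    }


schema = candidate_schema("id;price;day\r\n1;9.5;2024-01-02\r\n2;12;2024-02-03\r\n")
```

The result is called a candidate because the sketch only looks at the sample excerpt; rows outside the sample may still contradict the inferred delimiters or types.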

    INFERRING A DATASET SCHEMA FROM INPUT FILES
    3.
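    Invention application

    Publication (announcement) No.: US20200159704A1

    Publication (announcement) date: 2020-05-21

    Application No.: US16748351

    Filing date: 2020-01-21

    Inventors: Nir Ackner; Eric Lin

    IPC classification: G06F16/21 G06F3/06 G06F40/205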

    Abstract: Techniques for generating a schema for a data input file are described herein. In an embodiment, a server computer system receives a data input file. The server computer system selects from the data input file a sample excerpt that comprises a subset of the file. The server computer system analyzes the sample excerpt to determine a row delimiter for the data input file, a column delimiter for the data input file, and a plurality of data format types. Using the column delimiter, row delimiter, and plurality of data format types, the server computer system generates a candidate schema for the data input file.

    Inferring a dataset schema from input files

    Publication (announcement) No.: US11907181B2

    Publication (announcement) date: 2024-02-20

    Application No.: US16748351

    Filing date: 2020-01-21

    Inventors: Nir Ackner; Eric Lin

    IPC classification: G06F16/21 G06F3/06 G06F40/205

    Abstract: Techniques for generating a schema for a data input file are described herein. In an embodiment, a server computer system receives a data input file. The server computer system selects from the data input file a sample excerpt that comprises a subset of the file. The server computer system analyzes the sample excerpt to determine a row delimiter for the data input file, a column delimiter for the data input file, and a plurality of data format types. Using the column delimiter, row delimiter, and plurality of data format types, the server computer system generates a candidate schema for the data input file.

    INFERRING A DATASET SCHEMA FROM INPUT FILES
    5.
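    Invention application

    Publication (announcement) No.: US20190108244A1

    Publication (announcement) date: 2019-04-11

    Application No.: US16210984

    Filing date: 2018-12-05

    Inventors: Nir Ackner; Eric Lin

    IPC classification: G06F17/30 G06F3/06 G06F17/27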

    Abstract: Techniques for generating a schema for a data input file are described herein. In an embodiment, a server computer system receives a data input file. The server computer system selects from the data input file a sample excerpt that comprises a subset of the file. The server computer system analyzes the sample excerpt to determine a row delimiter for the data input file, a column delimiter for the data input file, and a plurality of data format types. Using the column delimiter, row delimiter, and plurality of data format types, the server computer system generates a candidate schema for the data input file.

    Search around visual queries
    6.
    Granted invention patent
    Status: in force

    Publication (announcement) No.: US09031981B1

    Publication (announcement) date: 2015-05-12

    Application No.: US13767779

    Filing date: 2013-02-14

    IPC classification: G06F17/30

    Abstract: A method and apparatus for a data analysis system for analyzing data object collections are provided. The data analysis system includes one or more graphical user interfaces comprising various interface elements that enable users to create visual queries. A visual query is constructed as a graph representing a pattern of interest in a collection of data objects. A visual query may include one or more graph elements and property information associated with the specified graph elements. After a user has constructed a visual query, the system may transform the visual query into a query template. A query engine may then execute the query template to search a data object collection for data object results corresponding to the specified pattern. The search for instances of a specified pattern in a collection of data objects is referred to herein as a “search around.”
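
To make the "search around" idea concrete, the following Python sketch represents a visual query as a small graph pattern, flattens it into a query template, and executes the template against a collection of data objects. Everything here is a hypothetical illustration: the data layout (objects, links), the template fields, and the function names to_template and execute are assumptions, and the brute-force matching stands in for whatever the actual query engine does.

```python
from itertools import product

# Hypothetical data object collection: objects with properties, plus labeled links.
objects = {
    "p1": {"type": "person", "name": "Ann"},
    "p2": {"type": "person", "name": "Bob"},
    "a1": {"type": "account", "bank": "Acme"},
    "a2": {"type": "account", "bank": "Zed"},
}
links = {("p1", "owns", "a1"), ("p2", "owns", "a2")}

# Visual query as a graph: graph elements (nodes, edges) with property information.
visual_query = {
    "nodes": {"x": {"type": "person"}, "y": {"type": "account", "bank": "Acme"}},
    "edges": [("x", "owns", "y")],
}


def to_template(query):
    """Flatten the graph into ordered variables, property filters, and edge constraints."""
    variables = sorted(query["nodes"])
    return {
        "variables": variables,
        "filters": [query["nodes"][v] for v in variables],
        "edges": list(query["edges"]),
    }


def execute(template, objects, links):
    """Return every variable binding that satisfies all filters and edge constraints."""
    # Candidate objects per variable: those whose properties match the filter.
    candidates = [
        [oid for oid, props in objects.items()
         if all(props.get(key) == value for key, value in flt.items())]
        for flt in template["filters"]
    ]
    results = []
    for combo in product(*candidates):
        binding = dict(zip(template["variables"], combo))
        if all((binding[src], label, binding[dst]) in links
               for src, label, dst in template["edges"]):
            results.append(binding)
    return results


matches = execute(to_template(visual_query), objects, links)
```

Trying every combination of candidates is exponential in the number of pattern nodes, so this only suits tiny patterns; it is meant to show the pattern-to-results flow, not an efficient search.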

    Search around visual queries
    7.
    Granted invention patent

    Publication (announcement) No.: US10585883B2

    Publication (announcement) date: 2020-03-10

    Application No.: US15730634

    Filing date: 2017-10-11

    Abstract: A method and apparatus for a data analysis system for analyzing data object collections are provided. The data analysis system includes one or more graphical user interfaces comprising various interface elements that enable users to create visual queries. A visual query is constructed as a graph representing a pattern of interest in a collection of data objects. A visual query may include one or more graph elements and property information associated with the specified graph elements. After a user has constructed a visual query, the system may transform the visual query into a query template. A query engine may then execute the query template to search a data object collection for data object results corresponding to the specified pattern. The search for instances of a specified pattern in a collection of data objects is referred to herein as a “search around.”

    Inferring a dataset schema from input files

    Publication (announcement) No.: US10540333B2

    Publication (announcement) date: 2020-01-21

    Application No.: US16210984

    Filing date: 2018-12-05

    Inventors: Nir Ackner; Eric Lin

    Abstract: Techniques for generating a schema for a data input file are described herein. In an embodiment, a server computer system receives a data input file. The server computer system selects from the data input file a sample excerpt that comprises a subset of the file. The server computer system analyzes the sample excerpt to determine a row delimiter for the data input file, a column delimiter for the data input file, and a plurality of data format types. Using the column delimiter, row delimiter, and plurality of data format types, the server computer system generates a candidate schema for the data input file.

    SEARCH AROUND VISUAL QUERIES
    9.
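    Invention application

    Publication (announcement) No.: US20180032571A1

    Publication (announcement) date: 2018-02-01

    Application No.: US15730634

    Filing date: 2017-10-11

    IPC classification: G06F17/30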

    Abstract: A method and apparatus for a data analysis system for analyzing data object collections are provided. The data analysis system includes one or more graphical user interfaces comprising various interface elements that enable users to create visual queries. A visual query is constructed as a graph representing a pattern of interest in a collection of data objects. A visual query may include one or more graph elements and property information associated with the specified graph elements. After a user has constructed a visual query, the system may transform the visual query into a query template. A query engine may then execute the query template to search a data object collection for data object results corresponding to the specified pattern. The search for instances of a specified pattern in a collection of data objects is referred to herein as a “search around.”