-
Publication No.: US12164966B1
Publication Date: 2024-12-10
Application No.: US18351388
Filing Date: 2023-07-12
Applicant: Snowflake Inc.
Inventor: Ganeshan Ramachandran Iyer , Raghav Ramachandran , Yang Wang
Abstract: A system and method of dynamic task allocation and warehouse scaling. The method includes receiving a request to process a task. The method includes monitoring a plurality of execution nodes of a datastore to determine a plurality of central processing unit (CPU) utilizations. Each CPU utilization of the plurality of CPU utilizations is associated with a respective execution node of the plurality of execution nodes. The method includes identifying, by a processing device based on the plurality of CPU utilizations, a particular execution node associated with a maximum CPU utilization to process the task. The method includes allocating the task to the particular execution node.
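The allocation step in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `NodeStat`, `pick_node`, and `allocate` are hypothetical, and representing CPU utilization as a fraction between 0 and 1 is an assumption.

```python
from dataclasses import dataclass


@dataclass
class NodeStat:
    """One monitoring sample for an execution node (hypothetical structure)."""
    node_id: str
    cpu_utilization: float  # assumed to be a fraction in [0.0, 1.0]


def pick_node(stats):
    """Return the id of the node with the maximum CPU utilization."""
    if not stats:
        raise ValueError("no execution nodes to choose from")
    return max(stats, key=lambda s: s.cpu_utilization).node_id


def allocate(task, stats, assignments):
    """Record the task against the selected node and return that node's id."""
    node_id = pick_node(stats)
    assignments.setdefault(node_id, []).append(task)
    return node_id
```

Selecting the busiest node, as the abstract states, packs work onto fewer nodes. The abstract does not say how a saturated node is handled, so this sketch does not model it.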
-
Publication No.: US20240338521A1
Publication Date: 2024-10-10
Application No.: US18629693
Filing Date: 2024-04-08
Applicant: Snowflake Inc.
Inventor: Michal Gdak , Ganeshan Ramachandran Iyer , Tomasz Malisz , Mikolaj Niedbala , Pawel Pollak , Saurin Shah , Jan Tomasz Topinski
IPC: G06F40/226
CPC classification number: G06F40/226
Abstract: Systems and methods for: processing a current electronic document, using a set of machine-learning (ML) models, to extract a set of values for a set of data points based on a schema, where the schema describes the set of data points to be extracted from electronic documents; determining whether to select the current electronic document for human validation based on the schema; and adding the current electronic document to a human validation queue in response to determining to select the current electronic document for human validation based on the schema.
-
Publication No.: US12007961B2
Publication Date: 2024-06-11
Application No.: US18345987
Filing Date: 2023-06-30
Applicant: Snowflake Inc.
Inventor: Istvan Cseri , Benoit Dageville , Ganeshan Ramachandran Iyer , Yucan Liu , Jiaqi Yan
CPC classification number: G06F16/211
Abstract: Techniques for schema mismatch detection and evolution are described. When data is being uploaded into a source table, schema of the data to be uploaded can be compared with the schema for the source table. If a schema mismatch is detected, the schema of the source table can be modified, and the upload can be continued without data loss.
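A rough sketch of the detect-then-evolve flow in this abstract, under simplifying assumptions: schemas are plain dictionaries mapping column names to type names, a mismatch means only that the incoming data has columns the table lacks, and the function names are hypothetical. Type conflicts and dropped columns are not handled here.

```python
def detect_mismatch(table_schema, incoming_schema):
    """Return the columns present in the incoming data but missing from the table."""
    return {col: typ for col, typ in incoming_schema.items() if col not in table_schema}


def evolve_and_load(table_schema, table_rows, incoming_schema, incoming_rows):
    """Widen the table schema if needed, then load the rows without dropping values."""
    missing = detect_mismatch(table_schema, incoming_schema)
    table_schema.update(missing)  # evolve the table schema in place
    for existing in table_rows:
        for col in missing:
            existing.setdefault(col, None)  # earlier rows get NULL for new columns
    for row in incoming_rows:
        table_rows.append({col: row.get(col) for col in table_schema})
    return missing
```

Returning the set of added columns lets a caller log or audit each evolution.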
-
Publication No.: US20240143548A1
Publication Date: 2024-05-02
Application No.: US18050122
Filing Date: 2022-10-27
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau , Thierry Cruanes , Benoit Dageville , Ganeshan Ramachandran Iyer , Subramanian Muralidhar
CPC classification number: G06F16/148 , G06F9/5022 , G06F16/116
Abstract: Techniques for continuous ingestion of files using custom file formats are described. A custom file format may include formats not natively supported by a data system. Unstructured files (e.g., images) may also be considered custom file formats. A custom file format may be set using a user defined table function and scanner options.
-
Publication No.: US11586515B1
Publication Date: 2023-02-21
Application No.: US17663941
Filing Date: 2022-05-18
Applicant: Snowflake Inc.
Inventor: Abdullah Al Mahmood , Ruta Dhaneshwar , Xin Huang , Ganeshan Ramachandran Iyer , Jiaxing Liang , Nithin Mahesh , Raghav Ramachandran , Purav B. Saraiya , Yanyi Zhang
Abstract: Described herein are techniques for improving disaster recovery, in particular disaster recovery pertaining to data transfer requests. The data transfer request can be received by each of multiple deployments; however, only a primary deployment can process the request. The data transferred by the primary deployment may be replicated in the secondary deployments. In response to a failover event, one of the secondary deployments can be designated as the new primary deployment and continue the data transfer based on the data transfer request and the replication information received from the old primary deployment prior to the failover.
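The failover behavior in this abstract can be illustrated with a toy model. Everything here is an assumption made for illustration: progress is tracked as a simple chunk offset, replication is modeled as an immediate update of every deployment's offset, and the new primary is chosen alphabetically. The abstract specifies none of these details.

```python
def run_transfer(progress, primary, chunks, start=0, stop_before=None):
    """Transfer chunks from `start`; only `primary` does work.

    `progress` maps deployment name -> replicated offset (hypothetical model).
    `stop_before` simulates the primary failing before that chunk index.
    """
    done = []
    for i in range(start, len(chunks)):
        if stop_before is not None and i == stop_before:
            break  # simulated failure of the primary
        done.append((primary, chunks[i]))
        for name in progress:
            progress[name] = i + 1  # replicate progress to every deployment
    return done


def fail_over(progress, old_primary):
    """Designate a secondary as the new primary (alphabetical pick, an assumption)."""
    return sorted(name for name in progress if name != old_primary)[0]
```

After a failover, the new primary resumes with `start=progress[new_primary]`, so chunks already transferred by the old primary are not sent again.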
-
Publication No.: US12236355B2
Publication Date: 2025-02-25
Application No.: US18416379
Filing Date: 2024-01-18
Applicant: Snowflake Inc.
Inventor: Michal Gdak , Ganeshan Ramachandran Iyer , Tomasz Malisz , Mikolaj Niedbala , Pawel Pollak , Saurin Shah , Jan Tomasz Topinski , Daria Wieteska
Abstract: Systems and methods for generating a machine-learning (ML) model for extracting information from one or more electronic documents, where the ML model can be used as a data object, which can be part of a database command or as part of a document information extraction process that is continuously running (e.g., document information extraction pipeline).
-
Publication No.: US20240411651A1
Publication Date: 2024-12-12
Application No.: US18810853
Filing Date: 2024-08-21
Applicant: Snowflake Inc.
Inventor: Abdullah Al Mahmood , Ruta Dhaneshwar , Xin Huang , Ganeshan Ramachandran Iyer , Jiaxing Liang , Nithin Mahesh , Raghav Ramachandran , Purav B. Saraiya , Yanyi Zhang
Abstract: Described herein are techniques for improving disaster recovery, in particular disaster recovery pertaining to data transfer requests. The data transfer request can be received by each of multiple deployments; however, only a primary deployment can process the request. The data transferred by the primary deployment may be replicated in the secondary deployments. In response to a failover event, one of the secondary deployments can be designated as the new primary deployment and continue the data transfer based on the data transfer request and the replication information received from the old primary deployment prior to the failover.
-
Publication No.: US11983165B1
Publication Date: 2024-05-14
Application No.: US18128212
Filing Date: 2023-03-29
Applicant: Snowflake Inc.
Inventor: Abdullah Al Mahmood , Chong Han , Ganeshan Ramachandran Iyer , Jiaxing Liang , Nithin Mahesh , Yanrui Zhang
IPC: G06F16/23 , G06F16/174 , G06F16/27
CPC classification number: G06F16/2365 , G06F16/1748 , G06F16/27
Abstract: Embodiments of the present disclosure provide techniques for deduplicating files during internal stage replication using a directory table of the replicated internal stage that is modified as a cache for storing and retrieving original file-level metadata for the replicated files. An initial list of candidate files for loading from the internal stage to a table of the target deployment is prepared based on the files listed in the internal stage, and refined using a directory table lookup. If there is any inconsistency between the files registered in the directory table and the files listed in the internal stage, the target deployment will inspect the user-defined file-level metadata to obtain original file-level metadata for each file that is present in the internal stage but not in the directory table. This information may be used during deduplication to ensure that no duplicate files are loaded.
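The lookup-with-fallback logic in this abstract might look roughly like the sketch below. It assumes that "original file-level metadata" can be reduced to a single hashable value such as a content digest, and that user-defined metadata stores it under an `"original"` key. Both are illustrative assumptions, as are the function names.

```python
def resolve_original_metadata(stage_files, directory_table):
    """Map each staged file to its original metadata.

    stage_files:     file name -> user-defined metadata dict (hypothetical shape)
    directory_table: file name -> original metadata, used as a cache
    """
    resolved = {}
    for name, user_meta in stage_files.items():
        if name in directory_table:
            resolved[name] = directory_table[name]  # cache hit
        else:
            # File is in the stage but not registered in the directory table:
            # fall back to inspecting its user-defined metadata.
            resolved[name] = user_meta["original"]
    return resolved


def files_to_load(stage_files, directory_table, already_loaded):
    """Return candidate files whose original metadata has not been loaded yet."""
    resolved = resolve_original_metadata(stage_files, directory_table)
    return sorted(name for name, meta in resolved.items() if meta not in already_loaded)
```

Keying deduplication on the original metadata rather than the file name means a replicated copy is still recognized as a file that was already loaded.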
-
Publication No.: US11748318B1
Publication Date: 2023-09-05
Application No.: US18104253
Filing Date: 2023-01-31
Applicant: Snowflake Inc.
Inventor: Istvan Cseri , Benoit Dageville , Ganeshan Ramachandran Iyer , Yucan Liu , Jiaqi Yan
CPC classification number: G06F16/211
Abstract: Techniques for schema mismatch detection and evolution are described. When data is being uploaded into a source table, schema of the data to be uploaded can be compared with the schema for the source table. If a schema mismatch is detected, the schema of the source table can be modified, and the upload can be continued without data loss.
-
Publication No.: US20230084682A1
Publication Date: 2023-03-16
Application No.: US17936770
Filing Date: 2022-09-29
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes , Ganeshan Ramachandran Iyer , Isaac Kunen
IPC: G06F21/54 , G06F16/2455 , G06F21/60 , G06F21/53
Abstract: The logging techniques described herein can enable using logging tools without having to use different methods for sandbox implementations and push out the log data to storage without problems. The log data is treated as sensitive data and is protected according to the defined security policies. Further, the results may be compressed and encrypted.