-
Publication No.: US11620187B2
Publication Date: 2023-04-04
Application No.: US17445401
Filing Date: 2021-08-18
Applicant: Google LLC
Inventor: Robert Cypher, Sean Quinlan, Steven Robert Schirripa
IPC: H04L12/00, G06F11/14, G06F16/182, G06F16/27, G06F16/174
Abstract: A method of distributing data in a distributed storage system includes receiving a file, dividing the received file into chunks, and determining a distribution of the chunks among storage devices of the distributed storage system based on a maintenance hierarchy of the distributed storage system. The maintenance hierarchy includes maintenance levels, and each maintenance level includes one or more maintenance units. Each maintenance unit has an active state and an inactive state. Moreover, each storage device is associated with a maintenance unit. The determining of the distribution of the chunks includes identifying a random selection of the storage devices matching a number of chunks of the file and being capable of maintaining accessibility of the file when one or more maintenance units are in an inactive state. The method also includes distributing the chunks to storage devices of the distributed storage system according to the determined distribution.
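The placement step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: it assumes a single maintenance level, uses simple rejection sampling, and treats a file as "accessible" when no maintenance unit holds more than `max_per_unit` of its chunks. The names `place_chunks`, `device_to_unit`, and `max_per_unit` are hypothetical.

```python
import random
from collections import Counter


def place_chunks(num_chunks, device_to_unit, max_per_unit=1, rng=None, max_tries=1000):
    """Pick one storage device per chunk at random.

    Assumption (not from the patent): a placement is acceptable when no
    single maintenance unit holds more than `max_per_unit` chunks, so that
    taking one unit inactive removes at most that many chunks.
    """
    rng = rng or random.Random()
    devices = sorted(device_to_unit)
    if num_chunks > len(devices):
        raise ValueError("more chunks than storage devices")
    for _ in range(max_tries):
        # Random selection of devices matching the number of chunks.
        candidate = rng.sample(devices, num_chunks)
        per_unit = Counter(device_to_unit[d] for d in candidate)
        # Reject selections that concentrate chunks in one maintenance unit.
        if all(count <= max_per_unit for count in per_unit.values()):
            return candidate
    raise RuntimeError("no acceptable placement found")


# Hypothetical cluster: six devices spread over three racks (maintenance units).
cluster = {"d1": "rackA", "d2": "rackA", "d3": "rackB",
           "d4": "rackB", "d5": "rackC", "d6": "rackC"}
placement = place_chunks(3, cluster, rng=random.Random(0))
```

With three chunks and `max_per_unit=1`, every returned placement puts each chunk in a different rack, so any one rack can be taken inactive while two chunks stay reachable. Whether two chunks are enough to read the file depends on the encoding, which this sketch does not model.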
-
Publication No.: US10931592B1
Publication Date: 2021-02-23
Application No.: US16377607
Filing Date: 2019-04-08
Applicant: Google LLC
Inventor: Lawrence E. Greenfield, Sean Quinlan, Priyanka Gupta
IPC: H04L12/26, H04L29/08, H04L12/923, H04L12/911, G06F15/16, G06F15/167, G05B13/02, G02B15/02
Abstract: The present disclosure relates to dynamically scheduling resource requests in a distributed system based on usage quotas. One example method includes identifying usage information for a distributed system including atoms, each atom representing a distinct item used by users of the distributed system; determining that a usage quota associated with the distributed system has been exceeded based on the usage information, the usage quota representing an upper limit for a particular type of usage of the distributed system; receiving a first request for a particular atom requiring invocation of the particular type of usage represented by the usage quota; determining that a second request for a different type of usage of the particular atom is waiting to be processed; and processing the second request for the particular atom before processing the first request.
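The reordering described in this abstract can be sketched in a few lines of Python. This is an illustration under assumptions, not the claimed method: the queue is a plain list, usage and quotas are dictionaries keyed by usage type, and the names `Request`, `over_quota`, and `pick_next` are hypothetical. The sketch also assumes the second request's own usage type must be within quota, which the abstract does not state.

```python
from dataclasses import dataclass


@dataclass
class Request:
    atom: str        # the item the request targets
    usage_type: str  # e.g. "read" or "write" (illustrative labels)


def over_quota(usage_type, usage, quotas):
    """True when recorded usage of this type exceeds its upper limit."""
    return usage.get(usage_type, 0) > quotas.get(usage_type, float("inf"))


def pick_next(pending, usage, quotas):
    """Remove and return the next request to process, or None if empty."""
    if not pending:
        return None
    first = pending[0]
    if over_quota(first.usage_type, usage, quotas):
        # Look for a waiting request on the same atom with a different usage type.
        for i, other in enumerate(pending):
            if (other.atom == first.atom
                    and other.usage_type != first.usage_type
                    and not over_quota(other.usage_type, usage, quotas)):
                del pending[i]
                return other  # processed before the over-quota request
    return pending.pop(0)


pending = [Request("a", "write"), Request("b", "read"), Request("a", "read")]
chosen = pick_next(pending, usage={"write": 120}, quotas={"write": 100})
```

Here the write quota is exceeded, so the waiting read on atom `a` is chosen ahead of the write on the same atom, and the write stays at the head of the queue.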
-
Publication No.: US10257111B1
Publication Date: 2019-04-09
Application No.: US15689640
Filing Date: 2017-08-29
Applicant: Google LLC
Inventor: Lawrence E. Greenfield, Sean Quinlan, Priyanka Gupta
IPC: G06F15/173, G06F17/30, G06F15/16, H04L12/923, H04L12/911
Abstract: The present disclosure relates to dynamically scheduling resource requests in a distributed system based on usage quotas. One example method includes identifying usage information for a distributed system including atoms, each atom representing a distinct item used by users of the distributed system; determining that a usage quota associated with the distributed system has been exceeded based on the usage information, the usage quota representing an upper limit for a particular type of usage of the distributed system; receiving a first request for a particular atom requiring invocation of the particular type of usage represented by the usage quota; determining that a second request for a different type of usage of the particular atom is waiting to be processed; and processing the second request for the particular atom before processing the first request.
-
Publication No.: US10042881B1
Publication Date: 2018-08-07
Application No.: US15358428
Filing Date: 2016-11-22
Applicant: Google LLC
Inventor: Wilson Cheng-Yi Hsieh, Alexander Lloyd, Peter Hochschild, Michael James Boyer Epstein, Sean Quinlan
Abstract: The present technology proposes techniques for ensuring globally consistent transactions. This technology may allow distributed systems to ensure the causal order of read and write transactions across different partitions of a distributed database. By assigning causally generated timestamps to the transactions based on one or more globally coherent time services, the timestamps can be used to preserve and represent the causal order of the transactions in the distributed system. In this regard, certain transactions may wait for a period of time after choosing a timestamp in order to delay the start of any second transaction that might depend on it. The wait may ensure that the effects of the first transaction are not made visible until its timestamp is guaranteed to be in the past. This may ensure that a consistent snapshot of the distributed database can be determined for any past timestamp.
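The wait described in this abstract can be illustrated with a small Python sketch. It assumes a time service that reports an interval guaranteed to contain the true time, with a fixed error bound `epsilon`; that interface, and the names `TimeService`, `Interval`, and `commit`, are hypothetical and not taken from the patent.

```python
import time
from dataclasses import dataclass


@dataclass
class Interval:
    earliest: float  # true time is no earlier than this
    latest: float    # true time is no later than this


class TimeService:
    """Assumed clock with a known error bound (illustrative only)."""

    def __init__(self, epsilon, clock=time.time):
        self.epsilon = epsilon
        self._clock = clock

    def now(self):
        t = self._clock()
        return Interval(t - self.epsilon, t + self.epsilon)


def commit(service, make_visible, sleep=time.sleep, poll=0.001):
    """Choose a timestamp, wait until it is surely past, then expose the write."""
    ts = service.now().latest           # timestamp not earlier than true time
    while service.now().earliest <= ts:
        sleep(poll)                     # the wait that delays dependent transactions
    make_visible(ts)                    # effects become visible only after the wait
    return ts
```

Because the write becomes visible only once `earliest` has moved past `ts`, any transaction that starts after observing it picks a later timestamp, which is what lets timestamp order reflect causal order in this sketch.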
-
Publication No.: US20240338279A1
Publication Date: 2024-10-10
Application No.: US18746351
Filing Date: 2024-06-18
Applicant: Google LLC
Inventor: Robert Cypher, Sean Quinlan, Steven Robert Schirripa
IPC: G06F11/14, G06F16/174, G06F16/182, G06F16/27
CPC classification number: G06F11/1435, G06F16/1748, G06F16/182, G06F16/278
Abstract: A method of distributing data in a distributed storage system includes receiving a file, dividing the received file into chunks, and determining a distribution of the chunks among storage devices of the distributed storage system based on a maintenance hierarchy of the distributed storage system. The maintenance hierarchy includes maintenance levels, and each maintenance level includes one or more maintenance units. Each maintenance unit has an active state and an inactive state. Moreover, each storage device is associated with a maintenance unit. The determining of the distribution of the chunks includes identifying a random selection of the storage devices matching a number of chunks of the file and being capable of maintaining accessibility of the file when one or more maintenance units are in an inactive state. The method also includes distributing the chunks to storage devices of the distributed storage system according to the determined distribution.
-
Publication No.: US20230325378A1
Publication Date: 2023-10-12
Application No.: US17716093
Filing Date: 2022-04-08
Applicant: Google LLC
Inventor: Lavina Jain, Sean Quinlan
CPC classification number: G06F16/2365, G06F16/2322, G06F16/273
Abstract: Generally disclosed herein is an approach to migrate data from a first type of distributed system to a second type of distributed system without locking data, where transactional dual writes are not available across the two systems. The approach starts by setting up a bi-directional replication between the first system and the second system. The first system will initially operate as a primary system, where the primary system receives and serves write requests from clients or other devices. For each write to the first system, the second system is updated with an asynchronous write. When the second system is caught up to the first system, such that both the first and second systems reflect approximately the same data, the second system can be switched over to serve as the primary system. The second system can now directly receive and serve all future read and write requests.
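The migration flow in this abstract can be sketched as a toy Python model. It is only an illustration: both systems are in-memory dictionaries, the asynchronous write is modeled as a backlog drained by an explicit `replicate` call, and "caught up" is taken to mean an empty backlog, which is stricter than the abstract's "approximately the same data". The names `Store` and `Migrator` are hypothetical.

```python
from collections import deque


class Store:
    """Stand-in for one distributed system (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Migrator:
    def __init__(self, first, second):
        self.primary, self.secondary = first, second
        self.backlog = deque()  # writes not yet applied to the secondary

    def write(self, key, value):
        # The primary serves the write; the other system is updated later.
        self.primary.data[key] = value
        self.backlog.append((key, value))

    def replicate(self, max_items=None):
        # Drain pending asynchronous writes into the secondary.
        applied = 0
        while self.backlog and (max_items is None or applied < max_items):
            key, value = self.backlog.popleft()
            self.secondary.data[key] = value
            applied += 1

    def caught_up(self):
        return not self.backlog

    def switch_over(self):
        # Swap roles only once the secondary reflects the primary's writes.
        if not self.caught_up():
            return False
        self.primary, self.secondary = self.secondary, self.primary
        return True


old, new = Store("old"), Store("new")
migrator = Migrator(old, new)
migrator.write("user:1", "alice")
migrator.replicate()
switched = migrator.switch_over()
```

After the switch, writes land on the second system first and flow back to the first one through the same backlog, which mirrors the bi-directional replication the abstract sets up.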
-
Publication No.: US20220171781A1
Publication Date: 2022-06-02
Application No.: US17673049
Filing Date: 2022-02-16
Applicant: Google LLC
Inventor: Robert C. Pike, Sean Quinlan, Sean M. Dorward, Jeffrey Dean, Sanjay Ghemawat
IPC: G06F16/2455, G06F16/28, G06F16/2458, G06F11/14, G06F16/18
Abstract: Systems and methods for analyzing input data records are provided in which a master process initiates a plurality of concurrent first processes each of which comprises, for each data record in at least a subset of a plurality of input data records, creating a parsed representation of the data record and independently applying a procedural language query to the parsed representation to extract one or more values. A respective emit operator is applied to at least one of the extracted one or more values thereby adding corresponding information to a respective intermediate data structure. The respective emit operator implements one of a predefined set of statistical information processing functions. The master process also initiates a plurality of second processes each of which aggregates information from a corresponding subset of intermediate data structures to produce aggregated data that is, in turn, combined to produce output data.
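The two-phase structure in this abstract can be illustrated with a short Python sketch. It runs sequentially rather than as concurrent processes, assumes the input records are JSON strings, and implements a single "sum" emit operator; the names `first_phase`, `second_phase`, and `bytes_by_lang` are hypothetical.

```python
import json
from collections import Counter


def first_phase(records, query):
    """Parse each record, apply the query, and collect emitted values."""
    table = Counter()  # intermediate data structure for this subset

    def emit(key, value):
        # A "sum" emit operator: add the value under the given key.
        table[key] += value

    for raw in records:
        parsed = json.loads(raw)  # parsed representation of the record
        query(parsed, emit)       # the query sees one record at a time
    return table


def second_phase(tables):
    """Aggregate intermediate tables into the final output."""
    total = Counter()
    for table in tables:
        total.update(table)  # Counter.update adds counts key by key
    return total


def bytes_by_lang(record, emit):
    # Example query: extract one value per record and emit it.
    emit(record["lang"], record["bytes"])


shards = [
    ['{"lang": "en", "bytes": 10}', '{"lang": "fr", "bytes": 5}'],
    ['{"lang": "en", "bytes": 7}'],
]
output = second_phase(first_phase(shard, bytes_by_lang) for shard in shards)
```

Because the query handles each record independently and the emit operator is a sum, the shards can be processed in any order, or in parallel, without changing `output`.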
-
Publication No.: US20210382790A1
Publication Date: 2021-12-09
Application No.: US17445401
Filing Date: 2021-08-18
Applicant: Google LLC
Inventor: Robert Cypher, Sean Quinlan, Steven Robert Schirripa
IPC: G06F11/14, G06F16/182, G06F16/27, G06F16/174
Abstract: A method of distributing data in a distributed storage system includes receiving a file, dividing the received file into chunks, and determining a distribution of the chunks among storage devices of the distributed storage system based on a maintenance hierarchy of the distributed storage system. The maintenance hierarchy includes maintenance levels, and each maintenance level includes one or more maintenance units. Each maintenance unit has an active state and an inactive state. Moreover, each storage device is associated with a maintenance unit. The determining of the distribution of the chunks includes identifying a random selection of the storage devices matching a number of chunks of the file and being capable of maintaining accessibility of the file when one or more maintenance units are in an inactive state. The method also includes distributing the chunks to storage devices of the distributed storage system according to the determined distribution.
-
Publication No.: US20180052890A1
Publication Date: 2018-02-22
Application No.: US15799939
Filing Date: 2017-10-31
Applicant: GOOGLE LLC
Inventor: Robert C. Pike, Sean Quinlan, Sean M. Dorward, Jeffrey Dean, Sanjay Ghemawat
CPC classification number: G06F16/24561, G06F11/1482, G06F16/2471, G06F16/285, Y10S707/99933, Y10S707/99937
Abstract: Systems and methods for analyzing input data records are provided in which a master process initiates a plurality of concurrent first processes each of which comprises, for each data record in at least a subset of a plurality of input data records, creating a parsed representation of the data record and independently applying a procedural language query to the parsed representation to extract one or more values. A respective emit operator is applied to at least one of the extracted one or more values thereby adding corresponding information to a respective intermediate data structure. The respective emit operator implements one of a predefined set of statistical information processing functions. The master process also initiates a plurality of second processes each of which aggregates information from a corresponding subset of intermediate data structures to produce aggregated data that is, in turn, combined to produce output data.
-
Publication No.: US12019519B2
Publication Date: 2024-06-25
Application No.: US18191371
Filing Date: 2023-03-28
Applicant: Google LLC
Inventor: Robert Cypher, Sean Quinlan, Steven Robert Schirripa
IPC: H04L12/00, G06F11/14, G06F16/174, G06F16/182, G06F16/27
CPC classification number: G06F11/1435, G06F16/1748, G06F16/182, G06F16/278
Abstract: A method of distributing data in a distributed storage system includes receiving a file, dividing the received file into chunks, and determining a distribution of the chunks among storage devices of the distributed storage system based on a maintenance hierarchy of the distributed storage system. The maintenance hierarchy includes maintenance levels, and each maintenance level includes one or more maintenance units. Each maintenance unit has an active state and an inactive state. Moreover, each storage device is associated with a maintenance unit. The determining of the distribution of the chunks includes identifying a random selection of the storage devices matching a number of chunks of the file and being capable of maintaining accessibility of the file when one or more maintenance units are in an inactive state. The method also includes distributing the chunks to storage devices of the distributed storage system according to the determined distribution.