Patent search ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"Weibao Gong"

1.

Invention application
METHOD FOR OPTIMIZING PERFORMANCE OF MODEL TRAINING DEVICE, ELECTRONIC DEVICE AND STORAGE MEDIUM [In force]

Publication No.: US20250103959A1

Publication date: 2025-03-27

Application No.: US18885339

Application date: 2024-09-13

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Liang Shen, Dianhai Yu, Weibao Gong, Jinle Zeng, Haifeng Wang

IPC: G06N20/00

Abstract: Provided is a performance optimization method for a model training device, an electronic device, and a storage medium, relating to the fields of deep learning, large model training, and distributed parallel strategies. The method includes: determining a communication timing of a current model training device with respect to a target model block at a target sorting position, so that the current device can perform collective communication synchronously with the other model training devices of a plurality of model training devices with respect to the model blocks at the target sorting position; and performing the collective communication on a backward gradient of the target model block at the communication timing.

2.

Invention application
MODEL OPERATOR PROCESSING METHOD AND DEVICE, ELECTRONIC EQUIPMENT AND STORAGE MEDIUM [In force]

Publication No.: US20250139327A1

Publication date: 2025-05-01

Application No.: US18895722

Application date: 2024-09-25

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Liang Shen, Jinle Zeng, Hongxiang Hao, Weibao Gong, Dianhai Yu, Haifeng Wang

IPC: G06F30/20

Abstract: A method for processing a model operator includes: determining an operator set for model networking, wherein the operator set comprises a plurality of operators; determining a storage amount occupied by an output tensor of each operator in the operator set and a computation time period consumed in a forward computation of each operator in the operator set; and determining a first operator participating in recomputation in a model from the operator set, based on the storage amounts and the computation time periods of the plurality of operators.
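Illustrative sketch (not taken from the patent): the abstract says only that the operators to recompute are chosen from the storage amounts and forward computation times; it does not state the selection rule. The snippet below assumes one plausible rule, a greedy ranking by bytes freed per millisecond of recomputation. The names OpProfile, select_recompute_ops, and bytes_to_free are hypothetical, as are the sample numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpProfile:
    """Per-operator measurements named in the abstract (values are made up)."""
    name: str
    output_bytes: int   # storage occupied by the operator's output tensor
    forward_ms: float   # time consumed by the operator's forward computation


def select_recompute_ops(ops, bytes_to_free):
    """Assumed heuristic: pick operators that free the most memory per
    millisecond of recomputation until the memory target is reached."""
    ranked = sorted(
        ops,
        key=lambda op: op.output_bytes / max(op.forward_ms, 1e-9),
        reverse=True,
    )
    chosen, freed = [], 0
    for op in ranked:
        if freed >= bytes_to_free:
            break
        chosen.append(op)
        freed += op.output_bytes
    return chosen


ops = [
    OpProfile("matmul", 64_000_000, 8.0),
    OpProfile("gelu", 64_000_000, 0.5),
    OpProfile("layer_norm", 32_000_000, 0.4),
    OpProfile("softmax", 16_000_000, 0.8),
]
selected = select_recompute_ops(ops, bytes_to_free=90_000_000)
print([op.name for op in selected])
```

Under this assumed rule, cheap operators with large outputs (activations, normalizations) are recomputed first, while expensive ones such as matrix multiplications keep their outputs in memory.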

3.

Invention application
CLUSTER-BASED TRAINING METHODS, DEVICES, ELECTRONIC EQUIPMENT AND STORAGE MEDIA [In force]

Publication No.: US20250029010A1

Publication date: 2025-01-23

Application No.: US18895264

Application date: 2024-09-24

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Dianhai Yu, Gexiao Tian, Weibao Gong, Haifeng Wang, Yongsheng Xu, Jiabin Yang

IPC: G06N20/00

Abstract: A cluster-based training method includes: in response to a hardware fault in a training node, selecting a target standby node from a plurality of standby nodes, and obtaining a target training snapshot of a model training task in the training node, wherein the target training snapshot includes training state data of the model training task; and initializing the target standby node based on a container image of a model training program in the training node and the training state data, so as to replace the training node with the target standby node and continue executing the model training task.
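Illustrative sketch (not taken from the patent): the steps in the abstract can be modeled as a small in-memory failover routine. Everything here is hypothetical, including the Node and Snapshot types, the fail_over function, and the choice of the first healthy standby as the target; the abstract does not say how the target standby node is selected or how snapshots are stored.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Snapshot:
    """Training state data captured for the model training task."""
    step: int
    state: dict


@dataclass
class Node:
    name: str
    healthy: bool = True
    image: Optional[str] = None            # container image of the training program
    state: dict = field(default_factory=dict)
    step: int = 0


def fail_over(failed, standbys, snapshot):
    """Replace a failed training node with a standby initialized from the
    failed node's container image and the latest training snapshot."""
    # Assumption: take the first healthy standby node in the pool.
    target = next((n for n in standbys if n.healthy), None)
    if target is None:
        raise RuntimeError("no healthy standby node available")
    standbys.remove(target)
    target.image = failed.image             # same training program image
    target.state = dict(snapshot.state)     # restore training state data
    target.step = snapshot.step             # resume from the snapshot step
    return target


failed = Node("train-0", healthy=False, image="trainer:v1")
standbys = [Node("standby-0", healthy=False), Node("standby-1")]
snap = Snapshot(step=1200, state={"lr": 1e-4})
replacement = fail_over(failed, standbys, snap)
print(replacement.name, replacement.step)
```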
