-
1.
Publication number: US20250148362A1
Publication date: 2025-05-08
Application number: US18678634
Application date: 2024-05-30
Inventor: Myung-Hoon CHA , Ki-Dong KANG , Hong-Yeon KIM , Baik-Song AN
IPC: G06N20/00
Abstract: Disclosed herein are an apparatus and method for managing a giant model. The apparatus includes memory in which at least one program is recorded and a processor for executing the program. The program may lightweight a first model into a second model in consideration of hardware resources, generate partitioning information for the first model based on a result of analyzing the second model, and perform training or inference on the first model based on the generated partitioning information.
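The workflow above can be sketched as follows. This is a minimal illustrative sketch, not the patented method: the function names, the cost-scaling stand-in for lightweighting, and the greedy equal-cost partitioning heuristic are all assumptions introduced here.

```python
# Sketch: lightweight a large model into a cheaper proxy, analyze the proxy,
# and derive partitioning information that is then applied to the original model.

def lightweight(layer_costs, scale):
    """Emulate lightweighting by scaling per-layer costs down to a proxy model."""
    return [c * scale for c in layer_costs]

def partition(layer_costs, num_devices):
    """Greedily split layers into contiguous groups of roughly equal cost."""
    target = sum(layer_costs) / num_devices
    groups, current, acc = [], [], 0.0
    for i, c in enumerate(layer_costs):
        current.append(i)
        acc += c
        if acc >= target and len(groups) < num_devices - 1:
            groups.append(current)
            current, acc = [], 0.0
    groups.append(current)
    return groups

# Analyze the lightweight proxy; reuse the resulting split for the first model.
proxy = lightweight([4.0, 4.0, 2.0, 2.0, 4.0, 4.0], scale=0.25)
print(partition(proxy, 2))  # -> [[0, 1, 2], [3, 4, 5]]
```

The key point is that the partition decision is computed on the cheap second model but expressed as layer indices, so it transfers directly to the full first model.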
-
2.
Publication number: US20240176759A1
Publication date: 2024-05-30
Application number: US18521396
Application date: 2023-11-28
Inventor: Baik-Song AN , Ki-Dong KANG , Hong-Yeon KIM , Myung-Hoon CHA
Abstract: Disclosed herein are a method for machine-learning parallelization using the host CPUs of a multi-socket structure and an apparatus therefor. The method, performed by the apparatus, includes a compile phase, in which a learning model is split at the layer level into pipeline stages and allocated to Non-Uniform Memory Access (NUMA) nodes corresponding to the respective CPU sockets, and a runtime phase, in which the parameters required for learning are initialized and multiple threads, generated in consideration of the policy of each parallelism algorithm, are executed by being allocated to the cores of their NUMA nodes.
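The two phases above can be sketched as follows. This is an illustrative sketch only: the even layer split, the contiguous core numbering per socket, and all function names are assumptions, not the patented allocation policy.

```python
# Sketch of the compile phase (layer-level split into pipeline stages, one stage
# per NUMA node / CPU socket) and the runtime phase (one thread per core of the
# stage's NUMA node).

def split_into_stages(num_layers, num_numa_nodes):
    """Compile phase: assign contiguous layer ranges to stages, one per NUMA node."""
    per_stage, rem = divmod(num_layers, num_numa_nodes)
    stages, start = [], 0
    for node in range(num_numa_nodes):
        size = per_stage + (1 if node < rem else 0)
        stages.append({"numa_node": node, "layers": list(range(start, start + size))})
        start += size
    return stages

def assign_threads(stage, cores_per_node):
    """Runtime phase: core IDs for worker threads, local to the stage's NUMA node."""
    base = stage["numa_node"] * cores_per_node
    return [base + c for c in range(cores_per_node)]

stages = split_into_stages(num_layers=10, num_numa_nodes=4)
print(stages)                                # contiguous layer ranges per socket
print(assign_threads(stages[1], cores_per_node=8))  # -> [8, 9, ..., 15]
```

Keeping each stage's threads on the cores of one NUMA node avoids cross-socket memory traffic for that stage's parameters, which is the motivation for the socket-level split.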
-
3.
Publication number: US20240176756A1
Publication date: 2024-05-30
Application number: US18345083
Application date: 2023-06-30
Inventor: Ki-Dong KANG , Hong-Yeon KIM , Baik-Song AN , Myung-Hoon CHA
IPC: G06F13/362 , G06F9/38 , G06F9/48
CPC classification number: G06F13/3625 , G06F9/3885 , G06F9/4881
Abstract: Disclosed herein is a method for distributed training of an AI model in a channel-sharing network environment. The method includes determining whether data parallel processing is applied, calculating a computation time and a communication time when input data is evenly distributed across multiple computation devices, and unevenly distributing the input data across the multiple computation devices based on the computation time and the communication time.
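The redistribution step above can be sketched as follows. The device throughput figures and the proportional-to-speed heuristic are illustrative assumptions; the patent bases the decision on both computation and communication time, while this sketch shows only the computation-balancing half.

```python
# Sketch: with an even split, the slowest device dominates iteration time;
# distributing samples in proportion to device throughput lets all devices
# finish their computation at roughly the same time.

def even_split_time(total_samples, speeds):
    """Computation time for an even split: bounded by the slowest device."""
    share = total_samples / len(speeds)
    return max(share / s for s in speeds)

def uneven_split(total_samples, speeds):
    """Distribute samples proportionally to per-device throughput."""
    total_speed = sum(speeds)
    return [round(total_samples * s / total_speed) for s in speeds]

speeds = [4.0, 2.0, 2.0]            # samples per ms per device (assumed)
print(even_split_time(96, speeds))  # -> 16.0 (the slower devices lag)
print(uneven_split(96, speeds))     # -> [48, 24, 24]
```

In a channel-sharing network the communication time also depends on how transfers overlap on the shared link, which is why the abstract weighs both times before fixing the uneven distribution.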
-