ASYNCHRONOUS PREDICTION IN MACHINE LEARNING MODEL SERVING

    Publication No.: US20240095109A1

    Publication Date: 2024-03-21

    Application No.: US17945960

    Filing Date: 2022-09-15

    Applicant: Twitter, Inc.

    Inventors: Lawrence Lam Di Zhao

    IPC Classification: G06F9/54 G06N20/00

    CPC Classification: G06F9/547 G06N20/00

    Abstract: Example computer-implemented methods, media, and systems for serving machine learning (ML) models using an asynchronous input/output (I/O) mechanism are disclosed. One example method includes receiving a first request to run an ML model to provide a first prediction. A first green thread is generated in response to the first request and executed on an operating system (OS) thread to send a first asynchronous remote procedure call (RPC) to a multiple-producer, single-consumer (MPSC) channel. A second request to run the ML model to provide a second prediction is received. A second green thread is generated in response to the second request and executed on the same OS thread to send a second asynchronous RPC to the MPSC channel. The first and second asynchronous RPCs are scheduled on a first blocking thread and a second blocking thread, respectively, and the ML model uses these blocking threads to generate the first prediction and the second prediction.
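
    The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the implementation disclosed in the application; it only mirrors the described shape under several assumptions: asyncio tasks stand in for green threads running on one OS thread, an asyncio.Queue stands in for the MPSC channel, and a ThreadPoolExecutor supplies the blocking threads. The names run_model, handle_request, consumer, serve, and predict_all are hypothetical, and run_model is a placeholder for a real blocking model call.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def run_model(features):
    """Hypothetical stand-in for a blocking ML model call (here: the mean)."""
    return sum(features) / len(features)


def _forward(done, reply):
    """Copy the outcome of the blocking call into the caller's reply future."""
    if reply.cancelled():
        return
    exc = done.exception()
    if exc is not None:
        reply.set_exception(exc)
    else:
        reply.set_result(done.result())


async def handle_request(channel, features):
    """One lightweight task (the 'green thread') per incoming request.

    It enqueues the call on the shared channel and suspends until the
    prediction is ready, without blocking the event-loop OS thread.
    """
    reply = asyncio.get_running_loop().create_future()
    await channel.put((features, reply))  # producer side of the channel
    return await reply


async def consumer(channel, pool):
    """Single consumer: hands each queued call to a blocking worker thread."""
    loop = asyncio.get_running_loop()
    while True:
        item = await channel.get()
        if item is None:  # sentinel: no more requests
            break
        features, reply = item
        done = loop.run_in_executor(pool, run_model, features)
        done.add_done_callback(lambda d, r=reply: _forward(d, r))


async def serve(requests):
    """Run all requests concurrently and return predictions in request order."""
    channel = asyncio.Queue()
    with ThreadPoolExecutor(max_workers=2) as pool:
        consumer_task = asyncio.create_task(consumer(channel, pool))
        results = await asyncio.gather(
            *(handle_request(channel, r) for r in requests)
        )
        await channel.put(None)  # tell the consumer to stop
        await consumer_task
    return results


def predict_all(requests):
    """Synchronous entry point for the sketch."""
    return asyncio.run(serve(requests))
```

    Calling predict_all with two feature lists runs both requests as separate tasks on the event-loop thread, while the placeholder model executes on pool threads. The design point this illustrates is that the request-handling thread never waits on model execution; it only enqueues work and resumes each task when its result arrives.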