-
Publication No.: US20230035351A1
Publication date: 2023-02-02
Application No.: US17950521
Application date: 2022-09-22
Inventors: Hwidong NA, Hyohyeong KANG, Hogyeong KIM, Hoshik LEE
IPC classification: G06N3/08
Abstract: A model training method and apparatus are disclosed. The model training method acquires a recognition result of a teacher model and a recognition result of a student model for an input sequence, and trains the student model such that the recognition result of the teacher model and the recognition result of the student model cannot be distinguished from each other.
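
The abstract does not say how "not distinguished from each other" is enforced. One common way to express such an objective is a GAN-style setup, sketched below under that assumption: a hypothetical discriminator outputs the probability that a recognition result came from the teacher, and the student is penalized when that probability is low for its own results. The function names and the use of binary cross-entropy are illustrative choices, not details taken from the patent.

```python
import math

EPS = 1e-12  # guards against log(0); an implementation detail of this sketch


def discriminator_loss(d_teacher: float, d_student: float) -> float:
    """Binary cross-entropy for a hypothetical discriminator.

    d_teacher: discriminator's probability that a teacher result is from the teacher.
    d_student: discriminator's probability that a student result is from the teacher.
    The discriminator wants d_teacher near 1 and d_student near 0.
    """
    return -(math.log(d_teacher + EPS) + math.log(1.0 - d_student + EPS))


def student_adversarial_loss(d_student: float) -> float:
    """Loss for the student: low when the discriminator mistakes the
    student's recognition result for a teacher result."""
    return -math.log(d_student + EPS)
```

In this sketch the two losses would be minimized alternately: the discriminator learns to separate the two kinds of results, and the student learns to produce results the discriminator cannot separate.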
-
Publication No.: US20220319500A1
Publication date: 2022-10-06
Application No.: US17425211
Application date: 2021-07-08
Inventors: Taewoo LEE, Taegyoon KANG, Hogyeong KIM, Minjoong LEE, Seokyeong JUNG, Jiseung JEONG
Abstract: Disclosed is an electronic device including a processor and a memory that is operatively connected to the processor and stores a language model. The electronic device may enter data into the language model, generate an embedding vector in an input embedding layer, add position information to the embedding vector in a positional encoding layer, branch the embedding vector based on domain information, normalize the branched embedding vectors, enter the normalized embedding vectors into a multi-head attention layer, enter output data of the multi-head attention layer into a first layer, normalize pieces of output data of the first layer, enter the normalized pieces of output data of the first layer into a feed-forward layer, enter output data of the feed-forward layer into a second layer and normalize pieces of output data of the second layer, and enter the normalized pieces of output data of the second layer into a linearization layer and a softmax layer to obtain result data. In addition, various other embodiments understood from the specification are also possible.
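
The "branch by domain, then normalize" step can be illustrated with a small sketch. It assumes that branching means selecting a separate set of normalization parameters per domain and that the normalization is ordinary layer normalization; the abstract confirms neither. The names `layer_norm`, `domain_branched_norm`, and the `params` dictionary are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def layer_norm(x: Vector, gain: Vector, bias: Vector, eps: float = 1e-5) -> Vector:
    """Normalize x to zero mean and unit variance, then scale and shift."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    scale = 1.0 / math.sqrt(var + eps)
    return [g * (v - mean) * scale + b for v, g, b in zip(x, gain, bias)]


def domain_branched_norm(
    x: Vector, domain: str, params: Dict[str, Tuple[Vector, Vector]]
) -> Vector:
    """Route the embedding to the normalization parameters of its domain
    (assumed interpretation of "branch the embedding vector")."""
    gain, bias = params[domain]
    return layer_norm(x, gain, bias)


# Hypothetical per-domain (gain, bias) pairs for a 3-dimensional embedding.
params = {
    "a": ([1.0, 1.0, 1.0], [0.0, 0.0, 0.0]),
    "b": ([2.0, 2.0, 2.0], [1.0, 1.0, 1.0]),
}
embedding = [1.0, 2.0, 3.0]
out_a = domain_branched_norm(embedding, "a", params)
out_b = domain_branched_norm(embedding, "b", params)
```

The same embedding yields different normalized vectors depending on the domain, while the layers that follow (attention, feed-forward, softmax) could stay shared across domains.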
-