Publication No.: US20210343410A1
Publication date: 2021-11-04
Application No.: US16865335
Filing date: 2020-05-02
Applicant: Petuum Inc.
Inventors: Shanghang Zhang, Najmeh Sadoughi, Pengtao Xie, Eric Xing
IPC classification: G16H50/20, G06F40/30, G16H50/70, G16H10/60, G16H70/60, G16H40/20, G06N3/04, G06N3/08, G06F16/22
Abstract: The present invention is a system and a method for classifying clinical records into International Classification of Diseases (ICD) codes. The system includes a processor and a memory communicatively coupled to the processor. The memory includes a generator (G), a feature extractor, a discriminator (D), a label encoder, and a keywords reconstructor. The generator (G) generates synthesized features corresponding to ICD code descriptions. The feature extractor extracts real latent features from clinical documents and generates real features by training generative adversarial networks (GANs). After the GANs are trained, the generator (G) generates synthesized features and, together with the real latent features generated by the feature extractor, calibrates a binary code classifier for a low-shot ICD code l. The feature extractor generates code-specific latent features conditioned on the textual description of each ICD code by using a Wasserstein GAN with gradient penalty (WGAN-GP). The discriminator (D) distinguishes between the synthesized features and the real features, determining whether a given feature is real or synthesized. The label encoder encodes the sequence of keywords in an ICD code description into a sequence of hidden states.
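
The abstract names WGAN-GP but does not give its objective. As a rough illustration only, and not the patent's implementation, the standard WGAN-GP discriminator (critic) loss can be sketched as below. Every name here (`critic_loss`, `numerical_grad`, `lam`) is hypothetical. The critic is assumed to be any Python function mapping a feature vector to a scalar score, and central finite differences stand in for the automatic differentiation a real neural discriminator would use.

```python
import math
import random

def numerical_grad(f, x, h=1e-5):
    """Central finite-difference gradient of scalar function f at point x.
    Assumption: stands in for autograd in this dependency-free sketch."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

def critic_loss(critic, real_batch, fake_batch, lam=10.0, rng=None):
    """Standard WGAN-GP critic loss:
    mean D(fake) - mean D(real) + lam * mean((||grad D(x_hat)|| - 1)^2),
    where x_hat is a random interpolation between a real and a fake feature."""
    rng = rng or random.Random(0)
    mean_fake = sum(critic(f) for f in fake_batch) / len(fake_batch)
    mean_real = sum(critic(r) for r in real_batch) / len(real_batch)
    penalty = 0.0
    pairs = list(zip(real_batch, fake_batch))
    for real, fake in pairs:
        eps = rng.random()  # interpolation coefficient in [0, 1)
        x_hat = [eps * r + (1 - eps) * f for r, f in zip(real, fake)]
        grad = numerical_grad(critic, x_hat)
        grad_norm = math.sqrt(sum(g * g for g in grad))
        penalty += (grad_norm - 1.0) ** 2
    penalty /= len(pairs)
    return mean_fake - mean_real + lam * penalty

def make_linear_critic(w, b=0.0):
    """Hypothetical toy critic D(x) = w . x + b, used only for the demo."""
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b

# Toy "real" (feature-extractor) and "synthesized" (generator) features.
real_features = [[1.0, 0.0], [0.0, 1.0]]
fake_features = [[0.0, 0.0], [0.0, 0.0]]
loss_unit = critic_loss(make_linear_critic([0.6, 0.8]), real_features, fake_features)
loss_steep = critic_loss(make_linear_critic([3.0, 4.0]), real_features, fake_features)
```

With the toy linear critic, the gradient norm equals the norm of `w`, so a unit-norm `w` incurs essentially no penalty while a steeper critic is penalized heavily. In the described system the features would presumably be code-specific latent vectors conditioned on ICD code descriptions, which this sketch does not model.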