- Patent title: Continual text recognition using prompt-guided knowledge distillation
- Application number: US18389641
- Filing date: 2023-12-19
- Publication number: US12033408B1
- Publication date: 2024-07-09
- Inventors: Ankit Malviya, Shubhanshu Kumar Singh, Vishu Mittal, Anish Goswami, Chaithanya Manda, Saurabh Khanna, Sarika Pal
- Applicant: ExlService Holdings, Inc.
- Applicant address: US NY New York
- Assignee: ExlService Holdings, Inc.
- Current assignee: ExlService Holdings, Inc.
- Current assignee address: US NY New York
- Agency: Perkins Coie LLP
- Main classification: G06V30/14
- IPC classification: G06V30/14
Abstract:
A text recognition system receives a prompt and, based on the prompt, causes a trained region encoder to determine a first region of interest of an image file. The system modifies a first image associated with the first region of interest (e.g., parsed out from the first region) to generate a data augmentation entity that includes a modified image. Using a trained instance encoder, the system generates a first set of visual instances corresponding to the first region of interest image and a second set of visual instances corresponding to the data augmentation entity. From the two sets of visual instances, the system generates corresponding first and second sequences. By executing a self-supervised contrastive loss function on the first and second sequences, the system automatically updates a continual knowledge distillation model of the trained region encoder. The system provides the first sequence to an instance decoder to generate output text in response to the prompt.
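The abstract does not say which contrastive loss is used. As a rough illustration only, the sketch below assumes an InfoNCE-style formulation, in which element i of the first sequence (from the original region image) should be more similar to element i of the second sequence (from the augmented image) than to any other element. The function names, the cosine-similarity measure, and the `temperature` parameter are all hypothetical and are not taken from the patent.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_loss(first_seq, second_seq, temperature=0.1):
    """InfoNCE-style loss between two sequences of embedding vectors.

    Assumption (not from the patent): first_seq[i] and second_seq[i] form a
    positive pair, and every other element of second_seq is a negative.
    """
    if len(first_seq) != len(second_seq):
        raise ValueError("sequences must have the same length")
    total = 0.0
    for i, anchor in enumerate(first_seq):
        logits = [cosine_similarity(anchor, other) / temperature
                  for other in second_seq]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits))
        # Cross-entropy term: negative log-probability of the positive pair.
        total += log_denominator - logits[i]
    return total / len(first_seq)


# Toy embeddings: the augmented view stays close to the original view.
first_sequence = [[1.0, 0.0], [0.0, 1.0]]
augmented_aligned = [[0.9, 0.1], [0.1, 0.9]]
augmented_swapped = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = contrastive_loss(first_sequence, augmented_aligned)
loss_swapped = contrastive_loss(first_sequence, augmented_swapped)
```

In this toy setup the loss is lower when the augmented sequence lines up with the original than when its elements are swapped, which is the signal a model update would follow. How the patented system turns that signal into an update of the continual knowledge distillation model is not described in the abstract and is not modeled here.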