Publication No.: US20220374459A1
Publication Date: 2022-11-24
Application No.: US17533613
Filing Date: 2021-11-23
Applicant: salesforce.com, inc.
Inventors: Ye Liu, Kazuma Hashimoto, Yingbo Zhou, Semih Yavuz, Caiming Xiong
IPC: G06F16/335, G06F16/332, G06F16/31
Abstract: Embodiments described herein provide dense hierarchical retrieval for open-domain question answering over a corpus of documents, using a document-level and a passage-level dense retrieval model. Specifically, each document is viewed as a structural collection that has sections, subsections and paragraphs. Each document may be split into short passages, and a document-level retrieval model and a passage-level retrieval model may be applied to return a smaller set of filtered texts. Top documents may be identified by encoding the question and the documents and determining document relevance scores with respect to the encoded question. Thereafter, a set of top passages is identified by encoding the passages and determining passage relevance scores with respect to the encoded question. The document and passage relevance scores may be used in combination to determine a final retrieval ranking for the documents having the set of top passages.
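
The two-stage flow in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the `embed` function below is a toy bag-of-words stand-in for the dense encoders, the names `embed`, `relevance` and `hierarchical_retrieve` are hypothetical, and summing the document and passage scores is only one assumed way of using them "in combination".

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a dense encoder (assumption): L2-normalised token counts."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {tok: c / norm for tok, c in counts.items()}


def relevance(query_vec, text_vec):
    """Inner product of two sparse vectors, used here as the relevance score."""
    return sum(w * text_vec.get(tok, 0.0) for tok, w in query_vec.items())


def hierarchical_retrieve(question, corpus, top_docs=2, top_passages=2):
    """corpus maps a document title to its list of short passages.

    Returns (combined_score, title, passage) tuples, best first.
    """
    q = embed(question)

    # Stage 1: document-level retrieval over title plus full text.
    doc_scores = {
        title: relevance(q, embed(title + " " + " ".join(passages)))
        for title, passages in corpus.items()
    }
    kept = sorted(doc_scores, key=doc_scores.get, reverse=True)[:top_docs]

    # Stage 2: passage-level retrieval, restricted to the kept documents.
    scored = [
        (relevance(q, embed(passage)), title, passage)
        for title in kept
        for passage in corpus[title]
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    best = scored[:top_passages]

    # Final ranking: combine both scores (a plain sum is an assumption).
    final = [(doc_scores[t] + s, t, p) for s, t, p in best]
    final.sort(key=lambda item: item[0], reverse=True)
    return final


corpus = {
    "Paris": ["Paris is the capital of France", "The Eiffel Tower is in Paris"],
    "Berlin": ["Berlin is the capital of Germany", "The Berlin Wall fell in 1989"],
    "Bananas": ["Bananas are yellow fruit", "Bananas grow in tropical climates"],
}
results = hierarchical_retrieve("capital of France", corpus)
for combined, title, passage in results:
    print(f"{combined:.3f}  {title}: {passage}")
```

Filtering at the document level first means the passage-level model only scores passages from a small set of documents, which is the "smaller set of filtered texts" the abstract refers to.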