-
11.
Publication No.: US20220129638A1
Publication Date: 2022-04-28
Application No.: US17078569
Filing Date: 2020-10-23
Applicant: Google LLC
Inventor: Liu Yang, Marc Najork, Michael Bendersky, Mingyang Zhang, Cheng Li
IPC: G06F40/30, G06F40/205, G06N3/04, G06N3/08
Abstract: Systems and methods of the present disclosure are directed to a method for predicting semantic similarity between documents. The method can include obtaining a first document and a second document. The method can include parsing the first document into a plurality of first textual blocks and the second document into a plurality of second textual blocks. The method can include processing each of the plurality of first textual blocks and the second textual blocks with a machine-learned semantic document encoding model to obtain a first document encoding and a second document encoding. The method can include determining a similarity metric descriptive of a semantic similarity between the first document and the second document based on the first document encoding and the second document encoding.
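
The steps named in the abstract (parse each document into textual blocks, encode the blocks into a document encoding, compare the two encodings) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: blocks are taken to be sentences, the machine-learned encoder is replaced by a hashed bag-of-words stand-in (`encode_block`), block vectors are mean-pooled, and cosine similarity serves as the similarity metric. All function names and the `DIM` constant are hypothetical.

```python
import hashlib
import math
import re

DIM = 64  # hypothetical encoding size, chosen only for this sketch


def parse_blocks(document: str) -> list[str]:
    """Split a document into textual blocks (assumed here to be sentences)."""
    blocks = [b.strip() for b in re.split(r"(?<=[.!?])\s+", document)]
    return [b for b in blocks if b]


def encode_block(block: str) -> list[float]:
    """Stand-in for a learned block encoder: a hashed bag-of-words vector."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", block.lower()):
        # Map each token to a fixed bucket; md5 keeps the mapping deterministic.
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    return vec


def encode_document(document: str) -> list[float]:
    """Encode every block, then mean-pool the block vectors (an assumption)."""
    vectors = [encode_block(b) for b in parse_blocks(document)]
    if not vectors:
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def document_similarity(first: str, second: str) -> float:
    """Similarity metric between two documents, computed from their encodings."""
    return cosine_similarity(encode_document(first), encode_document(second))
```

Calling `document_similarity(doc_a, doc_b)` returns a value between 0 and 1 for this stand-in encoder, because the hashed count vectors are non-negative. A learned encoder, as the abstract describes, would replace `encode_block` and possibly the pooling step; the surrounding parse, encode, compare structure would stay the same.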