-
Publication No.: US20220027766A1
Publication Date: 2022-01-27
Application No.: US17493365
Application Date: 2021-10-04
Inventor: Zhou FANG, Yabing SHI, Ye JIANG, Chunguang CHAI
Abstract: A method for an industry text increment, as well as an electronic device and a computer readable storage medium for the same, are provided. The method may include: acquiring an original industry text in a target industry field, an order of magnitude of a number of the original industry text being smaller than a preset first order of magnitude; and performing a sample incremental processing on the original industry text by using a distant supervision method, to obtain increased industry texts, an order of magnitude of a number of the increased industry texts being greater than a preset second order of magnitude, wherein the preset second order of magnitude is not smaller than the preset first order of magnitude.
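The abstract does not say how the distant supervision step works. As a minimal sketch of the general technique, assuming that term pairs known from the small original text set are used to harvest matching sentences from a larger unlabeled corpus, the idea might look like this in Python. The function name `distant_supervision_increment` and its parameters are hypothetical and do not come from the patent.

```python
def distant_supervision_increment(seed_pairs, corpus):
    """Collect corpus sentences that mention both terms of any seed pair.

    seed_pairs: (head, tail) term pairs taken from the original industry texts
                (hypothetical input format).
    corpus: iterable of unlabeled sentences.
    Returns a list of (sentence, head, tail) tuples as the increased texts.
    """
    increased = []
    for sentence in corpus:
        lowered = sentence.lower()
        for head, tail in seed_pairs:
            # Distant supervision heuristic: a sentence containing both terms
            # is assumed to express the same fact as the seed pair.
            if head.lower() in lowered and tail.lower() in lowered:
                increased.append((sentence, head, tail))
                break  # keep each sentence at most once
    return increased
```

The matching here is plain substring search, so it will produce noisy samples; a real system would likely filter or re-rank the harvested sentences.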
-
Publication No.: US20200250380A1
Publication Date: 2020-08-06
Application No.: US16779361
Application Date: 2020-01-31
Inventor: Zhaoyu WANG, Yabing SHI, Haijin LIANG, Ye JIANG, Yang ZHANG, Yong ZHU
Abstract: Embodiments of the present disclosure relate to a method, an apparatus and a device for constructing a data model, and a medium. The method for constructing the data model includes obtaining a first attribute set associated with an entity type. The method further includes aligning a plurality of attributes with the same semantics in the first attribute set to a same attribute, to generate a second attribute set associated with the entity type, attributes in the second attribute set having different semantics. The method further includes constructing the data model associated with the entity type based on the entity type and the second attribute set.
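The alignment step can be illustrated with a small Python sketch. It assumes that semantic equivalence is already known and supplied as groups of synonymous attribute names. The abstract does not state how equivalence is determined, so `synonym_groups`, `align_attributes`, and `build_data_model` are hypothetical names used only for illustration.

```python
def align_attributes(first_attribute_set, synonym_groups):
    """Map attributes with the same semantics to one representative attribute."""
    canonical = {}
    for group in synonym_groups:
        representative = group[0]  # assumption: first name in a group is canonical
        for name in group:
            canonical[name] = representative
    second_attribute_set = []
    for attribute in first_attribute_set:
        representative = canonical.get(attribute, attribute)
        if representative not in second_attribute_set:
            second_attribute_set.append(representative)
    return second_attribute_set


def build_data_model(entity_type, first_attribute_set, synonym_groups):
    """Build a data model from the entity type and the aligned attribute set."""
    return {
        "entity_type": entity_type,
        "attributes": align_attributes(first_attribute_set, synonym_groups),
    }
```

For example, `build_data_model("Person", ["birthday", "date of birth", "height"], [["birth date", "birthday", "date of birth"]])` collapses the two birth-related attributes into the single attribute `birth date`.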
-
Publication No.: US20190197166A1
Publication Date: 2019-06-27
Application No.: US16164619
Application Date: 2018-10-18
Inventor: Yabing SHI, Chenglong XUE, Shuangjie LI, Haijin LIANG
IPC: G06F17/30
Abstract: The present disclosure provides a method, a terminal device and a storage medium for mining an entity description tag. The method includes: acquiring a group of one or more core words corresponding to each field and a first syntax dependent template corresponding to each core word; performing a matching on each data in a first data source by using the first syntax dependent template to determine a first description tag set in each field; performing a recognition on each data in a second data source to determine an entity set; determining a second description tag set based on a matching degree between each description tag in the description tag set of each field and each data in the second data source; and determining an entity description tag set based on a correlation between each entity in the entity set and each description tag in the second description tag set.
-
Publication No.: US20210334669A1
Publication Date: 2021-10-28
Application No.: US17116979
Application Date: 2020-12-09
Inventor: Qian LI, Yabing SHI, Ye JIANG, Chunguang CHAI, Yong ZHU
Abstract: A method, apparatus, device, and storage medium for constructing a knowledge graph are provided, relating to the field of data processing and specifically to artificial intelligence technology. The method may include: determining a scene and a scene element of the scene; determining a target tag from attribute tags based on an association relationship between an entity and the scene element, and an association relationship between the entity and each of the attribute tags; and establishing an edge between a scene node and a target tag node, to obtain a knowledge graph including scene information.
-
Publication No.: US20210256038A1
Publication Date: 2021-08-19
Application No.: US17249001
Application Date: 2021-02-17
Inventor: Yabing SHI, Shuangjie LI, Ye JIANG, Yang ZHANG, Yong ZHU
IPC: G06F16/28, G06F40/295, G06F40/30
Abstract: The disclosure provides a method and an apparatus for recognizing an entity word. The method includes: obtaining an entity word category and a document to be recognized; generating an entity word question based on the entity word category; segmenting the document to be recognized to generate a plurality of candidate sentences; inputting the entity word question and the plurality of candidate sentences into a question-answer model trained in advance to obtain an entity word recognizing result; and obtaining an entity word set corresponding to the entity word question based on the entity word recognizing result.
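The pipeline in this abstract (question generation, sentence segmentation, question answering, result collection) can be sketched in Python as follows. The trained question-answer model is outside the scope of a short example, so `qa_model` is a stand-in: any callable that takes a question and a sentence and returns an answer string or `None`. The question template and the function name `recognize_entity_words` are likewise assumptions, not details from the patent.

```python
import re


def recognize_entity_words(category, document, qa_model):
    """Frame entity-word recognition as question answering over sentences."""
    # Step 1: generate an entity word question from the category
    # (hypothetical template).
    question = f"Which {category} is mentioned in the text?"
    # Step 2: segment the document into candidate sentences at ., ! or ?.
    sentences = [
        s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()
    ]
    # Step 3: ask the model about each candidate sentence and collect answers.
    entity_words = set()
    for sentence in sentences:
        answer = qa_model(question, sentence)
        if answer:
            entity_words.add(answer)
    return question, entity_words
```

Passing the model in as a parameter keeps the sketch runnable with a toy lookup function while leaving room for a real extractive QA model.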
-
Publication No.: US20210216882A1
Publication Date: 2021-07-15
Application No.: US17025952
Application Date: 2020-09-18
Inventor: Fang HUANG, Shuangjie LI, Yabing SHI, Ye JIANG, Yang ZHANG, Yong ZHU
Abstract: A method and apparatus for generating a temporal knowledge graph, a device and a medium are provided. An embodiment of the method comprises: acquiring a corpus including time information; performing multivariate data extraction on the corpus, multivariate data including an entity pair, an entity relationship and a target time interval of the entity relationship, the target time interval being used to indicate a valid period of the entity relationship; and generating a temporal knowledge graph based on the entity pair, the entity relationship and the target time interval of the entity relationship.
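A temporal knowledge graph of this kind can be pictured as a set of edges that each carry a validity interval. The Python sketch below assumes a single toy sentence pattern ("X was the R of Y from YYYY to YYYY") in place of the patent's extraction method, which the abstract does not describe. `PATTERN`, `extract_multivariate`, `build_temporal_graph`, and `relations_valid_at` are hypothetical names.

```python
import re
from collections import defaultdict

# Hypothetical sentence pattern standing in for a real extraction model.
PATTERN = re.compile(
    r"^(?P<head>.+?) was the (?P<relation>.+?) of (?P<tail>.+?) "
    r"from (?P<start>\d{4}) to (?P<end>\d{4})\.?$"
)


def extract_multivariate(sentence):
    """Return (head, relation, tail, (start, end)) or None if no match."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    interval = (int(match["start"]), int(match["end"]))
    return match["head"], match["relation"], match["tail"], interval


def build_temporal_graph(corpus):
    """Map each head entity to its (relation, tail, interval) edges."""
    graph = defaultdict(list)
    for sentence in corpus:
        record = extract_multivariate(sentence)
        if record is not None:
            head, relation, tail, interval = record
            graph[head].append((relation, tail, interval))
    return graph


def relations_valid_at(graph, head, year):
    """List the (relation, tail) pairs whose interval covers the given year."""
    return [
        (relation, tail)
        for relation, tail, (start, end) in graph.get(head, [])
        if start <= year <= end
    ]
```

Storing the interval on the edge makes point-in-time queries a simple filter, which is the main benefit of keeping the valid period alongside the entity relationship.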
-
Publication No.: US20210217504A1
Publication Date: 2021-07-15
Application No.: US17023998
Application Date: 2020-09-17
Inventor: Zhou FANG, Shuangjie LI, Yabing SHI, Ye JIANG
Abstract: The present disclosure relates to the field of medical data processing based on natural language processing. Embodiments of the present disclosure disclose a method and apparatus for verifying a medical fact. The method may include: acquiring a description text of the medical fact; selecting a relevant paragraph related to the description text of the medical fact from a medical document; and inputting the description text of the medical fact and the corresponding relevant paragraph into a trained discrimination model for authenticity judgment, to obtain a verification result of the medical fact, the discrimination model being pre-trained based on a medical text paragraph pair extracted from the medical document, and being iteratively adjusted using a medical fact sample set including authenticity labeling information after the pre-training.
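The paragraph selection step can be approximated with a simple lexical score. The Python sketch below assumes Jaccard overlap between whitespace tokens as the relevance measure. The abstract does not specify the selection criterion, and `select_relevant_paragraph` is a hypothetical name. The discrimination model itself is not shown.

```python
def select_relevant_paragraph(fact, paragraphs):
    """Return the paragraph with the highest token overlap with the fact text.

    Assumes `paragraphs` is non-empty; max() raises ValueError otherwise.
    """
    fact_tokens = set(fact.lower().split())

    def jaccard(paragraph):
        tokens = set(paragraph.lower().split())
        union = fact_tokens | tokens
        return len(fact_tokens & tokens) / len(union) if union else 0.0

    return max(paragraphs, key=jaccard)
```

The selected paragraph and the fact description would then be passed together to the trained discrimination model for the authenticity judgment.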
-
Publication No.: US20210216819A1
Publication Date: 2021-07-15
Application No.: US17149267
Application Date: 2021-01-14
Inventor: Wei HE, Shuangjie LI, Yabing SHI, Ye JIANG, Yang ZHANG, Yong ZHU
IPC: G06K9/62, G06F40/205, G06N20/00
Abstract: A method and an apparatus for extracting SPO triples, an electronic device, and a storage medium are related to the field of artificial intelligence technologies. The solution may include: inputting annotated training data into each of multiple extraction models; predicting SPO triples satisfying defined relations in the annotated training data through each of the multiple extraction models; combining the predicted SPO triples corresponding to each of the multiple extraction models; extracting SPO triples satisfying screening conditions from the combined SPO triples; mining SPO triples with missing annotations from the annotated training data based on the SPO triples satisfying screening conditions, in response to the SPO triples satisfying screening conditions not satisfying output conditions; supplementing the SPO triples with missing annotations into the annotated training data; and repeating the inputting, predicting, combining, extracting, mining and supplementing until the SPO triples satisfying screening conditions satisfy the output conditions.
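The loop described in this abstract can be sketched in Python under a few stated assumptions: each extraction model is a callable that returns a set of (subject, predicate, object) triples, the screening condition is a minimum vote count across models, and the output condition is that a round finds no triples missing from the annotations. None of these specifics come from the patent, and `iterate_spo_extraction`, `min_votes`, and `max_rounds` are hypothetical.

```python
from collections import Counter


def iterate_spo_extraction(texts, annotations, models, min_votes=2, max_rounds=5):
    """Repeat predict/combine/screen/supplement until no annotation is missing.

    models: callables taking (texts, annotations) and returning a set of
            (subject, predicate, object) triples (hypothetical interface).
    Returns (screened_triples, supplemented_annotations).
    """
    annotations = set(annotations)
    screened = set()
    for _ in range(max_rounds):
        # Predict with every model and combine the predictions by voting.
        votes = Counter()
        for model in models:
            votes.update(set(model(texts, annotations)))
        # Screening condition (assumed): predicted by at least min_votes models.
        screened = {triple for triple, count in votes.items() if count >= min_votes}
        # Mine triples that pass screening but are missing from the annotations.
        missing = screened - annotations
        if not missing:
            break  # output condition (assumed): nothing left to supplement
        annotations |= missing
    return screened, annotations
```

In a real system the models would be retrained on the supplemented annotations in each round; here they simply receive the current annotations as an argument.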
-
Publication No.: US20210216725A1
Publication Date: 2021-07-15
Application No.: US17147881
Application Date: 2021-01-13
Inventor: Shuangjie LI, Miao YU, Yabing SHI, Xuefeng HAO, Xunchao SONG, Ye JIANG, Yang ZHANG, Yong ZHU
IPC: G06F40/40, G06F40/289, G06N20/00
Abstract: A method and an apparatus for processing information are provided. The method can include: acquiring a word sequence obtained by performing word segmentation on two paragraphs in a text; inputting the word sequence into a to-be-trained natural language processing model to generate a word vector corresponding to a word in the word sequence; inputting the word vector into a preset processing layer of the to-be-trained natural language processing model; predicting whether the two paragraphs are adjacent, and a replaced word in the two paragraphs; and acquiring reference information of the two paragraphs, and training the to-be-trained natural language processing model to obtain a trained natural language processing model, based on the prediction result and the reference information.
-
Publication No.: US20200057788A1
Publication Date: 2020-02-20
Application No.: US16539796
Application Date: 2019-08-13
Inventor: Fang HUANG, Shuangjie LI, Bingyang YU, Yabing SHI, Haijin LIANG, Yang ZHANG, Yong ZHU
IPC: G06F16/958, G06F16/953
Abstract: Embodiments of the present disclosure provide a method, an apparatus and a device for generating entity relationship data, and a storage medium. The method includes: obtaining webpage source data corresponding to a target webpage; identifying at least one key value block from the webpage source data, wherein the key value block comprises at least one key value pair; identifying body values corresponding to the at least one key value block from the webpage source data; and generating entity relationship data corresponding to the target webpage according to the key value blocks and the body values corresponding to the key value blocks. With the technical solution of the present disclosure, the webpage universality may be improved, labor cost may be reduced, and output quantity of the entity relationship data may be increased.
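As a rough illustration of key value blocks and body values, the Python sketch below uses the standard library `html.parser` module. It assumes that a key value block is an HTML table whose rows pair a `<th>` key with a `<td>` value, and that the body value is the text of the most recent `<h1>` heading. The patent abstract does not define these structures this narrowly, so the class `KeyValueBlockParser` and the function `generate_entity_relationships` are hypothetical.

```python
from html.parser import HTMLParser


class KeyValueBlockParser(HTMLParser):
    """Collect (body_value, [(key, value), ...]) blocks from th/td tables."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._heading = ""    # latest <h1> text, used as the body value
        self._current = None  # key value pairs of the table being parsed
        self._tag = None      # tag whose text is currently being captured
        self._key = ""
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._current = []
        if tag in ("h1", "th", "td"):
            self._tag = tag
            self._text = []

    def handle_data(self, data):
        if self._tag:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            text = "".join(self._text).strip()
            if tag == "h1":
                self._heading = text
            elif tag == "th":
                self._key = text
            elif self._current is not None and self._key:
                self._current.append((self._key, text))
                self._key = ""
            self._tag = None
        elif tag == "table" and self._current is not None:
            if self._current:
                self.blocks.append((self._heading, self._current))
            self._current = None


def generate_entity_relationships(html):
    """Return (body_value, key, value) triples for every key value pair."""
    parser = KeyValueBlockParser()
    parser.feed(html)
    return [(body, key, value) for body, pairs in parser.blocks for key, value in pairs]
```

Each resulting triple reads as entity, relationship, and target, for example `("Acme Corp", "CEO", "Alice")` for a page headed "Acme Corp" with a table row pairing "CEO" and "Alice".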