-
Publication No.: US20240338391A1
Publication date: 2024-10-10
Application No.: US18295757
Filing date: 2023-04-04
Applicant: Recruit Co., Ltd.
Inventors: Yutong Shao, Nikita Bhutani, Sajjadur Rahman, Estevam Hruschka
CPC classification: G06F16/313, G06F16/38
Abstract: Disclosed embodiments relate to entity set expansion to associate with a text corpus. Techniques can include receiving unstructured data and a set of concepts associated with the data, and determining, using a language model, a set of candidate entities in the data associated with the set of concepts, wherein the association is measured based on the relevancy of each candidate entity to the context of the data. Techniques can then determine, using a plurality of methods, associations between each candidate entity in the set of candidate entities and each concept in the set of concepts, wherein each candidate entity is assigned a rank for each method of the plurality of methods. Techniques can use the assigned ranks to determine a combined rank of each candidate entity, wherein the combined rank is based on the rank assigned to that candidate entity by each method of the plurality of methods. Techniques can finally expand the entity set by determining a subset of the candidate entities based on the combined rank of each candidate entity, wherein the subset of entities forms the expanded entity set associated with the data.
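
The rank-combination step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the per-method ranks are combined, so the mean reciprocal rank used here is an assumption, and the names `combine_ranks` and `expand_entity_set` are hypothetical.

```python
from collections import defaultdict


def combine_ranks(rankings):
    """Combine several rankings of candidate entities into one ordering.

    rankings: list of lists, one per ranking method, each ordered best first.
    Assumption: ranks are combined by mean reciprocal rank; the abstract
    does not name a specific combination rule.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for position, entity in enumerate(ranking, start=1):
            scores[entity] += 1.0 / position
    method_count = len(rankings)
    # Higher mean reciprocal rank first; ties broken alphabetically.
    return sorted(scores, key=lambda e: (-scores[e] / method_count, e))


def expand_entity_set(rankings, top_k):
    """Keep the top_k candidates by combined rank as the expanded set."""
    return combine_ranks(rankings)[:top_k]


if __name__ == "__main__":
    per_method = [
        ["sushi", "ramen", "tokyo"],
        ["ramen", "sushi", "tokyo"],
        ["sushi", "tokyo", "ramen"],
    ]
    print(expand_entity_set(per_method, top_k=2))
```

An entity missing from one method's ranking simply contributes nothing for that method in this sketch; a real system would need an explicit policy for that case.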
-
Publication No.: US20230342558A1
Publication date: 2023-10-26
Application No.: US17660813
Filing date: 2022-04-26
Applicant: Recruit Co., Ltd.
Inventors: Jin WANG, Yuliang Li, Wataru HIROTA
IPC classification: G06F40/40, G06F40/284, G06F40/205
CPC classification: G06F40/40, G06F40/284, G06F40/205
Abstract: Disclosed embodiments relate to generalized entity matching. Techniques can include receiving a data pair of two entities that may be pre-processed to have parsable data structures, and serializing the data pair into a sequence of tokens based on the data structure of each entity in the data pair. Techniques can further include encoding the serialized data pair to include topic attributes that may be mapped to data in the data pair, where the topic of the mapped data matches the topic represented by the topic attribute and the data in the data pair is concatenated. Techniques can further include pooling attributes in the data pair based on contextualized attribute representations of each encoded entity in the data pair and the schema of each entity of the data pair, where the contextualized attribute representations are based on a first token of each encoded attribute in the sequence of tokens, and predicting matching labels between the data pairs based on the pooled attributes.
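
The serialization step can be sketched as follows. The abstract does not specify the token format, so the `[COL]`, `[VAL]`, `[CLS]` and `[SEP]` markers and the function names below are assumptions chosen for illustration, and entities are assumed to be flat attribute-to-value mappings.

```python
def serialize_entity(entity):
    """Flatten one entity (attribute -> value mapping) into a token string.

    Assumption: each attribute is introduced by a [COL] marker and each
    value by a [VAL] marker; the abstract does not name these markers.
    """
    return " ".join(f"[COL] {attr} [VAL] {value}" for attr, value in entity.items())


def serialize_pair(left, right):
    """Serialize a data pair into one sequence for a sequence-pair encoder."""
    return f"[CLS] {serialize_entity(left)} [SEP] {serialize_entity(right)} [SEP]"


if __name__ == "__main__":
    left = {"name": "Recruit Co.", "city": "Tokyo"}
    right = {"title": "Recruit", "location": "Tokyo, Japan"}
    print(serialize_pair(left, right))
```

Because the two entities need not share a schema, each side is serialized from its own attribute names rather than from a common column list.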
-
Publication No.: US20190265223A1
Publication date: 2019-08-29
Application No.: US16083148
Filing date: 2017-03-08
Applicant: Recruit Co., Ltd.
Inventors: Ryo Irisawa
IPC classification: G01N33/483, G01N15/14, G01N21/85
Abstract: A simple sperm test kit 10 for performing a simple sperm test has a substrate 20 capable of being placed over a camera 30 of an information terminal 11, a recess 21 for containing semen A provided in the surface of the substrate 20, a cover 22 that covers the recess 21 while allowing external light to enter the recess 21, and a lens 23 provided within the substrate 20 on the lower side of the recess 21 for magnifying the semen A in the recess 21 and projecting an image on the back side of the substrate 20.
-
Publication No.: US10176654B2
Publication date: 2019-01-08
Application No.: US15769419
Filing date: 2016-10-19
Applicant: Recruit Co., Ltd.
Abstract: A suspicious person detection technology that is less likely to leave blind spots in the detection of a suspicious person is provided. A suspicious person detection system detects a suspicious person present in a predetermined area and includes a probe request detection terminal (100) configured to detect a probe request transmitted from a mobile terminal (400) and to generate probe information including first identification information specific to the mobile terminal that transmitted the probe request, and an analyzing apparatus (200) configured to acquire the probe information from the probe request detection terminal and, in the case where the first identification information included in the probe information matches none of one or more pieces of second identification information set in advance, transmit suspicious person information indicating that a suspicious person is detected to a predetermined information processing apparatus (300).
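
The matching rule performed by the analyzing apparatus can be sketched as a simple membership check. The abstract does not say what the identification information is, so treating it as a string such as a device address, the `device_id` field and the function name are all assumptions.

```python
def find_suspicious(probe_infos, registered_ids):
    """Return the probe records whose identifier matches no registered one.

    probe_infos: list of dicts with a hypothetical "device_id" field that
    stands in for the first identification information.
    registered_ids: identifiers set in advance (the second identification
    information). Comparison is case-insensitive, which is an assumption.
    """
    registered = {identifier.lower() for identifier in registered_ids}
    return [p for p in probe_infos if p["device_id"].lower() not in registered]


if __name__ == "__main__":
    known = ["AA:BB:CC:00:00:01", "AA:BB:CC:00:00:02"]
    observed = [
        {"device_id": "aa:bb:cc:00:00:01", "terminal": 100},
        {"device_id": "DE:AD:BE:EF:00:09", "terminal": 100},
    ]
    for record in find_suspicious(observed, known):
        # In the described system this would be sent to apparatus (300).
        print("suspicious:", record["device_id"])
```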
-
Publication No.: US20240289629A1
Publication date: 2024-08-29
Application No.: US18305657
Filing date: 2023-04-24
Applicant: Recruit Co., Ltd.
Inventors: Runhui Wang, Yuliang Li, Jin Wang
IPC classification: G06N3/09, G06F16/28, G06N3/045, G06N3/0464
CPC classification: G06N3/09, G06F16/285, G06N3/045, G06N3/0464
Abstract: Disclosed embodiments relate to data management of entity pairs. Techniques can include receiving at least two sets of data, each including a set of entities, and a data management task request. Techniques can determine a location of each entity in the received data sets in a representative space by determining a representative structure of the set of entities. Techniques can then determine, for an entity, a set of representative entity pairs from each set of the at least two sets of data based on how close they are in the representative space. Techniques can then analyze the set of representative entity pairs to identify the most similar entity pairs to include in a set of candidate pairs by determining the closeness of the locations of the entities in each entity pair in the representative space. Techniques can then determine matched entity pairs of the candidate pairs using a first machine learning model that is trained using the candidate pairs by applying labels, and utilize the matched pairs to perform the requested data management task.
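
The candidate-pair step, selecting pairs by closeness in the representative space, can be sketched as a nearest-neighbour search. The abstract does not name a distance measure or how entity locations are obtained, so cosine similarity over precomputed vectors is an assumption, and `candidate_pairs` is a hypothetical name.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def candidate_pairs(left, right, k=1):
    """For each entity in `left`, pair it with its k closest entities in `right`.

    left, right: dicts mapping an entity id to its vector in the shared space.
    Assumption: closeness is cosine similarity; the abstract does not say.
    """
    pairs = []
    for left_id, left_vec in left.items():
        scored = sorted(
            ((cosine(left_vec, right_vec), right_id) for right_id, right_vec in right.items()),
            key=lambda item: (-item[0], item[1]),
        )
        pairs.extend((left_id, right_id) for _, right_id in scored[:k])
    return pairs


if __name__ == "__main__":
    table_a = {"a1": [1.0, 0.0], "a2": [0.0, 1.0]}
    table_b = {"b1": [0.9, 0.1], "b2": [0.1, 0.9]}
    print(candidate_pairs(table_a, table_b, k=1))
```

The resulting pairs would then be passed to a separate matching model; that model is outside the scope of this sketch.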
-
Publication No.: US20230281390A1
Publication date: 2023-09-07
Application No.: US18295735
Filing date: 2023-04-04
Applicant: Recruit Co., Ltd.
Inventors: Yoshihiko SUHARA, Behzad GOLSHAN, Yuliang LI, Chen CHEN, XIAOLAN WANG, JINFENG LI, WANG-CHIEW TAN, ÇAGATAY DEMIRALP, AARON TRAYLOR
IPC classification: G06F40/284, G06F16/35, G06F18/211, G06N7/01
CPC classification: G06F40/284, G06F16/35, G06F18/211, G06N7/01
Abstract: Disclosed embodiments relate to natural language processing. Techniques can include receiving input text, extracting, from the input text, at least one modifier and aspect pair, receiving data from a knowledgebase, generating, based on the at least one modifier and aspect pair and commonsense data, one or more premise embeddings, converting the input text into tokens, generating at least one vector for one or more of the tokens based on an analysis of the tokens, combining the at least one vector with the one or more premise embeddings to create at least one combined vector, and analyzing the at least one combined vector, wherein the analysis generates an output indicative of a feature of the input text.
-
Publication No.: US20220067279A1
Publication date: 2022-03-03
Application No.: US17008569
Filing date: 2020-08-31
Applicant: Recruit Co., Ltd.
IPC classification: G06F40/263
Abstract: Disclosed embodiments relate to natural language processing. Techniques can include obtaining an encoding model, obtaining a first sentence in a first language and a label associated with the first sentence, obtaining a second sentence in a second language, encoding the first sentence and second sentence using the encoding model, determining the intent of the first encoded sentence, determining the language of the first encoded sentence and the language of the second encoded sentence, and updating the encoding model based on the determined intent of the first encoded sentence, the label, the determined language of the first encoded sentence, and the determined language of the second encoded sentence.
-
Publication No.: US20190164109A1
Publication date: 2019-05-30
Application No.: US16073447
Filing date: 2017-01-25
Applicant: Recruit Co., Ltd.
Inventors: Yoshihiko Suhara, Hideki Awashima, Hidekazu Oiwa
Abstract: Appropriate similarity degree learning is performed even if the number of documents is insufficient relative to the number of words. An analysis method based on a topic model is used to learn a degree of similarity between recruitment information and resume information. By analyzing recruitment information registered with a recruitment card database DB 310 and resume information registered with a resume database DB 320 using a topic model, a characteristics extracting portion 330 collects words (keywords) extracted from documents constituting the recruitment information and the resume information for each topic; and a similarity degree learning portion 360 performs similarity degree learning for each topic.
-
Publication No.: US20240020485A1
Publication date: 2024-01-18
Application No.: US18366890
Filing date: 2023-08-08
Applicant: Recruit Co., Ltd.
Inventors: Behzad GOLSHAN, Chen Chen, Wang-Chiew Tan, Danni Ma
IPC classification: G06F40/35, G06F40/268, G06F40/284, G06F18/2323
CPC classification: G06F40/35, G06F40/268, G06F40/284, G06F18/2323
Abstract: Disclosed embodiments relate to aligning pairs of sentences. Techniques can include receiving a plurality of sentences; generating a graph for each of at least two sentences of the plurality of sentences, wherein generating a graph for each sentence of the at least two sentences comprises: identifying one or more tokens for the sentence; and connecting via edges the one or more tokens; generating a combined graph for the at least two sentences, wherein generating a combined graph comprises: aligning the identified tokens of the at least two sentences of the plurality of sentences; identifying matching and non-matching tokens between the at least two sentences based on the alignment; and merging matching tokens into a combined graph node.
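
The combined-graph construction can be sketched for two sentences as follows. The abstract does not specify the tokenizer, the alignment method or the edge structure, so whitespace tokenization, alignment with Python's `difflib.SequenceMatcher`, and edges between consecutive tokens are all assumptions; `build_combined_graph` is a hypothetical name.

```python
from difflib import SequenceMatcher


def build_combined_graph(sentence_a, sentence_b):
    """Merge two sentence graphs, sharing one node per aligned matching token.

    Returns (labels, edges): labels maps a node id to its token text, and
    edges is a set of (from_node, to_node) pairs linking consecutive tokens.
    Assumptions: whitespace tokens, chain-shaped sentence graphs, and
    longest-matching-block alignment via difflib.
    """
    tokens_a, tokens_b = sentence_a.split(), sentence_b.split()
    # Start with a private node per token; matched tokens are merged below.
    node_a = {i: ("a", i) for i in range(len(tokens_a))}
    node_b = {j: ("b", j) for j in range(len(tokens_b))}

    matcher = SequenceMatcher(a=tokens_a, b=tokens_b, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            merged = ("ab", block.a + offset)  # one shared node for both sides
            node_a[block.a + offset] = merged
            node_b[block.b + offset] = merged

    labels = {}
    for i, token in enumerate(tokens_a):
        labels[node_a[i]] = token
    for j, token in enumerate(tokens_b):
        labels[node_b[j]] = token

    edges = set()
    for mapping, length in ((node_a, len(tokens_a)), (node_b, len(tokens_b))):
        for i in range(length - 1):
            edges.add((mapping[i], mapping[i + 1]))
    return labels, edges


if __name__ == "__main__":
    labels, edges = build_combined_graph("the room was clean", "the room was dirty")
    print(sorted(labels.values()))
    print(len(edges))
```

In the example, the three shared tokens collapse into three merged nodes, and the graph branches only at the non-matching final tokens.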
-
Publication No.: US20220229984A1
Publication date: 2022-07-21
Application No.: US17151088
Filing date: 2021-01-15
Applicant: Recruit Co., Ltd.
Inventors: Zhengjie Miao, Yuliang Li, Xiaolan Wang, Wang-Chiew Tan
IPC classification: G06F40/284, G06N20/00, G06N5/04, G06F40/289
Abstract: Disclosed embodiments relate to extracting classification information from input text. Techniques can include obtaining input text, identifying a plurality of tokens in the input text, pre-training a machine learning model, determining tagging information of the plurality of tokens using a first classification layer of the machine learning model, pairing sequences of tokens using the tagging information associated with the plurality of tokens, wherein the paired sequences of tokens are determined by a second classification layer, determining one or more attribute classifiers to apply to the one or more paired sequences, wherein the attribute classifiers are determined by a third classification layer of the machine learning model, evaluating sentiments of the paired sequences, wherein the sentiments of the paired sequences are determined by a fourth classification layer of the machine learning model, aggregating sentiments of the paired sequences associated with an attribute classifier, and storing the aggregated sentiments.