-
Publication No.: US20220147574A9
Publication date: 2022-05-12
Application No.: US16454083
Filing date: 2019-06-27
Applicant: International Business Machines Corporation
Inventor: Roy Bar-Haim , Noam Slonim , Orith Toledo-Ronen
IPC: G06F16/9032 , G06F17/27 , G06K9/00 , G06F16/955
Abstract: A computerized text analysis method that comprises: searching a resource of information with a search query comprising at least one of: (a) the specific debatable topic, and (b) a personal derivation of the specific debatable topic, to obtain a list of indices whose index subject contains the personal derivation and/or the specific debatable topic; determining, by applying a rule-based classifier, whether the index subject of each of the indices is (i) in favor of the debatable topic or (ii) against the debatable topic; detecting, in each of the indices, hyperlinks to encyclopedic entries whose entry subjects are person names; and determining that: if the index subject of each of the one or more indices is in favor of the specific debatable topic, then the persons are in favor of the specific debatable topic, and vice versa.
-
Publication No.: US20210157980A1
Publication date: 2021-05-27
Application No.: US16697224
Filing date: 2019-11-27
Applicant: International Business Machines Corporation
Inventor: Yonatan Bilu , Liat Ein Dor , Noam Slonim
IPC: G06F40/205 , G06F40/106 , G06F16/35 , G06F16/93 , G06K9/62 , G06N20/00 , G06N3/08
Abstract: A system for identifying in a corpus of documents at least one argument relevant to an identified topic, comprising at least one hardware processor adapted to: producing a plurality of topic-related sentences relevant to the identified topic, each extracted from a document of the corpus of documents; producing a plurality of synthetic documents, each created by appending to a sentence of the plurality of topic-related sentences an identified amount of other sentences extracted from the respective document the topic-related sentence was extracted therefrom; identifying at least one argument relevant to the identified topic by inputting each of the plurality of synthetic documents to at least one machine learning model trained to identify an argument in response to a document; and outputting the at least one argument.
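The synthetic-document step in this abstract can be illustrated with a minimal sketch. It assumes a document is already split into sentences, that topic relevance is a simple keyword match, and that the appended context is the sentences immediately following the topic-related one; the abstract specifies none of these choices, and the function name `build_synthetic_documents` is hypothetical. The downstream machine learning model is not shown.

```python
from typing import List


def build_synthetic_documents(
    document: List[str], topic_terms: List[str], context_size: int = 2
) -> List[str]:
    """Build one synthetic document per topic-related sentence.

    Assumptions (not from the patent): relevance is a case-insensitive
    keyword match, and the context is the next `context_size` sentences.
    """
    synthetic = []
    for i, sentence in enumerate(document):
        lowered = sentence.lower()
        if not any(term.lower() in lowered for term in topic_terms):
            continue  # not a topic-related sentence
        # Append a fixed amount of other sentences from the same document.
        context = document[i + 1 : i + 1 + context_size]
        synthetic.append(" ".join([sentence] + context))
    return synthetic


doc = [
    "Nuclear power divides opinion.",
    "Critics point to waste storage.",
    "Supporters cite low emissions.",
    "The weather was mild that year.",
]
synthetic_docs = build_synthetic_documents(doc, ["nuclear"], context_size=2)
```

Each string in `synthetic_docs` would then be passed to an argument-identification model, which is outside the scope of this sketch.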
-
Publication No.: US20180260476A1
Publication date: 2018-09-13
Application No.: US15453918
Filing date: 2017-03-09
Applicant: International Business Machines Corporation
Inventor: Roy Bar-Haim , Noam Slonim , Orith Toledo-Ronen
IPC: G06F17/30
CPC classification number: G06F16/353 , G06F16/31 , G06F16/3334 , G06F16/334 , G06F16/93 , G06F16/9566 , G06F17/27
Abstract: A computerized text analysis method that comprises: searching a resource of information with a search query comprising at least one of: (a) the specific debatable topic, and (b) a personal derivation of the specific debatable topic, to obtain a list of indices whose index subject contains the personal derivation and/or the specific debatable topic; determining, by applying a rule-based classifier, whether the index subject of each of the indices is (i) in favor of the debatable topic or (ii) against the debatable topic; detecting, in each of the indices, hyperlinks to encyclopedic entries whose entry subjects are person names; and determining that: if the index subject of each of the one or more indices is in favor of the specific debatable topic, then the persons are in favor of the specific debatable topic, and vice versa.
-
Publication No.: US20250021812A1
Publication date: 2025-01-16
Application No.: US18350269
Filing date: 2023-07-11
Applicant: International Business Machines Corporation
Inventor: Leshem Choshen , Elad Venezian , Shachar Batya Don-Yehiya , Yoav Avraham Katz , Noam Slonim
IPC: G06N3/08 , G06N3/0455
Abstract: Systems and techniques that facilitate base model selection are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory that can execute the computer executable components stored in memory. The computer executable components can comprise a comparison component that finetunes a pretrained machine learning model and one or more candidate models selected based on the ranking of the plurality of finetuned machine learning models on one or more target datasets, compares performance of the one or more candidate models to a defined performance metric, and selects a base model from the pretrained machine learning model and the one or more candidate models based on the performance of the one or more candidate models over the one or more target datasets.
-
Publication No.: US12093645B2
Publication date: 2024-09-17
Application No.: US17474364
Filing date: 2021-09-14
Applicant: International Business Machines Corporation
Inventor: Eyal Shnarch , Ariel Gera , Alon Halfon , Lena Dankin , Leshem Choshen , Ranit Aharonov , Noam Slonim
IPC: G06F40/279 , G10L25/30
CPC classification number: G06F40/279 , G10L25/30
Abstract: An example system includes a processor to pre-train a transformer-based language model on a general domain. The processor can inter-train the pre-trained transformer-based language model using partitioning and classification to generate an inter-trained transformer-based pre-trained language model. The processor can then fine-tune the inter-trained transformer-based pre-trained language model on a target task to generate a fine-tuned transformer-based language model.
-
Publication No.: US10831793B2
Publication date: 2020-11-10
Application No.: US16167552
Filing date: 2018-10-23
Applicant: International Business Machines Corporation
Inventor: Ranit Aharonov , Liat Ein Dor , Alon Halfon , Yosi Mass , Ilya Shnayderman , Noam Slonim , Elad Venezian
Abstract: A method of estimating a thematic similarity of sentences, comprising receiving a corpus of a plurality of documents describing a plurality of topics where each document comprises a plurality of sentences arranged in a plurality of sections, constructing sentence triplets for at least some of the sentences, each sentence triplet comprising a respective sentence, a respective positive sentence selected randomly from the section comprising the respective sentence and a respective negative sentence selected randomly from another section, training a first neural network with the sentence triplets to identify sentence-sentence vectors mapping each sentence with a shorter distance to its respective positive sentence compared to the distance to its respective negative sentence and outputting the first neural network for estimating thematic similarity between a pair of sentences by computing a distance between the sentence-sentence vectors produced for each sentence of the pair by the first neural network.
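The triplet construction described in this abstract can be sketched as follows. This is a minimal illustration, assuming a document is represented as a mapping from section name to its list of sentences; the function name `build_triplets` and the sample data are hypothetical, and the neural network training step is not shown.

```python
import random
from typing import Dict, List, Tuple


def build_triplets(
    sections: Dict[str, List[str]], seed: int = 0
) -> List[Tuple[str, str, str]]:
    """Return (anchor, positive, negative) sentence triplets.

    The positive is drawn at random from the anchor's own section and
    the negative at random from any other section of the same document.
    """
    rng = random.Random(seed)
    triplets = []
    for name, sentences in sections.items():
        # All sentences that live in a different section.
        other = [s for n, ss in sections.items() if n != name for s in ss]
        for sentence in sentences:
            positives = [s for s in sentences if s != sentence]
            if not positives or not other:
                continue  # no valid positive or negative for this anchor
            triplets.append(
                (sentence, rng.choice(positives), rng.choice(other))
            )
    return triplets


sections = {
    "History": ["The city was founded in 1820.", "It grew after the railway arrived."],
    "Economy": ["Tourism is the main industry.", "Unemployment is low."],
}
triplets = build_triplets(sections)
```

A triplet loss would then train the network so that each anchor's vector lies closer to its positive than to its negative.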
-
Publication No.: US10810375B2
Publication date: 2020-10-20
Application No.: US16029605
Filing date: 2018-07-08
Applicant: International Business Machines Corporation
Inventor: Yosi Mass , Amir Menczel , Dafna Sheinwald , Ilya Shnayderman , Noam Slonim
IPC: G06F40/30 , G06F40/295 , G06F40/247 , G06F40/253
Abstract: A method comprising: operating at least one hardware processor for: receiving, as input, at least one named entity, modifying said named entity based on a plurality of modification rules to generate a set of candidate named entities corresponding to said named entity, and identifying, for at least one candidate named entity in said set of candidate named entities, an article in a knowledge base of articles, wherein a title of said article matches said candidate named entity.
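The candidate-generation and title-matching flow in this abstract can be sketched as follows. The three modification rules (dropping a leading article, dropping a trailing period, title-casing) are illustrative assumptions, as the abstract does not list the actual rules; the knowledge base is modeled as a plain set of article titles, and the names `generate_candidates` and `link_entity` are hypothetical.

```python
from typing import Callable, List, Optional, Set

# Hypothetical modification rules; the patent's actual rules are not given.
RULES: List[Callable[[str], str]] = [
    lambda s: s[4:] if s.lower().startswith("the ") else s,  # drop leading article
    lambda s: s.rstrip("."),                                  # drop trailing period
    lambda s: s.title(),                                      # normalize casing
]


def generate_candidates(entity: str) -> List[str]:
    """Apply each rule to every candidate so far, so rules can combine."""
    candidates = [entity]
    for rule in RULES:
        for existing in list(candidates):
            modified = rule(existing).strip()
            if modified and modified not in candidates:
                candidates.append(modified)
    return candidates


def link_entity(entity: str, titles: Set[str]) -> Optional[str]:
    """Return the first candidate that exactly matches an article title."""
    for candidate in generate_candidates(entity):
        if candidate in titles:
            return candidate
    return None


knowledge_base_titles = {"United Nations", "IBM"}
match = link_entity("the united nations.", knowledge_base_titles)
```

Here `match` is `"United Nations"`, reached only after all three rules have been combined; an entity with no matching title yields `None`.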
-
Publication No.: US10776587B2
Publication date: 2020-09-15
Application No.: US15206326
Filing date: 2016-07-11
Applicant: International Business Machines Corporation
Inventor: Yonatan Bilu , Ran Levy , Noam Slonim
Abstract: A computer-implemented method, computerized apparatus and computer program product for claim generation, the method comprising: selecting at least one subject according to a given topic; selecting at least one verb from a first data source; selecting at least one object from a second data source; generating one or more candidate claim sentences, each of which is composed of a subject selected from the at least one subject, a verb selected from the at least one verb and an object selected from the at least one object; and determining validity of the candidate claim sentences using a machine learning process.
-
Publication No.: US20160350278A1
Publication date: 2016-12-01
Application No.: US14721007
Filing date: 2015-05-26
Applicant: International Business Machines Corporation
Inventor: Ehud Aharoni , Roy Bar-Haim , Indrajit Bhattacharya , Francesco Dinuzzo , Dan Gutfreund , Amrita Saha , Noam Slonim , Chen Yanover
IPC: G06F17/27
CPC classification number: G06F17/2705 , G06F17/27 , G06F17/2785 , G06F17/30011 , G06F17/30283
Abstract: A method comprising using at least one hardware processor for: receiving (a) a proposition and (b) a plurality of claims; identifying a local claim polarity of each claim of the plurality of claims with respect to the proposition; calculating a pairwise claim polarity agreement score for each pair of claims of the pairs of claims reflecting the likelihood of said each pair of claims to have the same claim polarity, wherein the pairwise claim polarity agreement score is associated with each claim of the pair of claims; and determining a global claim polarity for each claim of the plurality of claims based on the local claim polarity of the claim and pairwise claim polarity agreement scores associated with said each claim.
-
Publication No.: US20150370887A1
Publication date: 2015-12-24
Application No.: US14698854
Filing date: 2015-04-29
Applicant: International Business Machines Corporation
Inventor: Mitesh Khapra , Vikas Raykar , Amrita Saha , Noam Slonim , Ashish Verma
Abstract: A method comprising using at least one hardware processor for: receiving a topic under consideration (TUC) and a set of claims referring to the TUC; identifying semantic similarity relations between claims of the set of claims; clustering the claims into a plurality of claim clusters based on the identified semantic similarity relations, wherein said claim clusters represent semantically different claims of the set of claims; and generating a list of non-redundant claims comprising said semantically different claims.