Patent search ap:("INTERNATIONAL BUSINESS MACHINES CORPORATION") AND inv:"Naama Tepper" Page 1

1.

Invention patent application
EXPECTED GROUP CHAT SEGMENT DURATION (status: under examination, published)

Publication (announcement) number: US20200084055A1

Publication (announcement) date: 2020-03-12

Application number: US16684949

Application date: 2019-11-15

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Jonathan F. Brunn, Rachael M.H. Dickens, Jonathan Dunne, Ethan A. Geyer, Liam S. Harpur, Bo Jiang, Andrew Penrose, Naama Tepper

IPC: H04L12/18, G06Q10/04, H04L12/58, G06F17/18

Abstract: A method, computer system, and computer program product for calculating a group chat segment duration is provided. The embodiment may include capturing a plurality of group chat messages from a chat message repository. The embodiment may also include determining a probability distribution based on analyzing the captured group chat messages over a time vector. The embodiment may further include calculating a time parameter based on the determined probability distribution. The embodiment may also include calculating a content parameter based on one or more relevant chat topics. The embodiment may further include calculating an attendee parameter based on a plurality of attendees and one or more attendee associations. The embodiment may also include determining a chat duration prediction based on the calculated time parameter, the calculated content parameter, and the calculated attendee parameter.

2.

Invention patent application
GROUPING MESSAGES BASED ON TEMPORAL AND MULTI-FEATURE SIMILARITY (status: under examination, published)

Publication (announcement) number: US20190121907A1

Publication (announcement) date: 2019-04-25

Application number: US15791200

Application date: 2017-10-23

Applicant: International Business Machines Corporation

Inventor: Jonathan F. Brunn, Daniel Dulaney, Ami Dewar, Ethan A. Geyer, Bo Jiang, Rachael Dickens, Scott E. Chapman, Thomas Blanchflower, Naama Tepper

IPC: G06F17/30, H04W4/12

Abstract: Message grouping using temporal and multi-factor similarity includes grouping multiple messages of a corpus in a group messaging system into a number of message bursts. Each message burst includes a number of messages that have a temporal relationship. Multiple of the number of message bursts are grouped into a message cluster. The grouping is based on a similarity of the number of message bursts as defined by multiple features of the message bursts.
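The first step in this abstract, grouping messages into bursts by temporal relationship, can be illustrated with a minimal sketch. The abstract does not say how the temporal relationship is defined, so the gap-threshold rule, the function name `group_into_bursts`, and the five-minute default below are all assumptions made for illustration, not the patented method. The second step (clustering bursts by multi-feature similarity) is not covered.

```python
from datetime import datetime, timedelta


def group_into_bursts(timestamps, max_gap=timedelta(minutes=5)):
    """Split message timestamps into bursts.

    Assumption: two consecutive messages belong to the same burst when
    the gap between them is at most `max_gap`. The abstract only says the
    messages in a burst "have a temporal relationship".
    """
    bursts = []
    for ts in sorted(timestamps):
        if bursts and ts - bursts[-1][-1] <= max_gap:
            # Close enough to the previous message: extend the current burst.
            bursts[-1].append(ts)
        else:
            # Gap too large (or first message): start a new burst.
            bursts.append([ts])
    return bursts


# Hypothetical message times: two messages around 09:00, three around 09:30.
day = datetime(2017, 10, 23)
times = [day.replace(hour=9, minute=m) for m in (0, 2, 30, 31, 33)]
bursts = group_into_bursts(times)
```

With these inputs the 28-minute gap between 09:02 and 09:30 exceeds the threshold, so the messages fall into two bursts.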

3.

Granted invention patent
Familiarity-based text classification framework selection (status: in force)

Publication (announcement) number: US11222058B2

Publication (announcement) date: 2022-01-11

Application number: US15840559

Application date: 2017-12-13

Applicant: International Business Machines Corporation

Inventor: Ethan A. Geyer, Jonathan F. Brunn, Jonathan Dunne, Naama Tepper

IPC: G06F16/00, G06F16/35, H04L12/58, G06N3/08, G06F16/31

Abstract: Familiarity-based text classification framework selection is described. A list of participants in an electronic message thread is selected. For each pairing of participants, a familiarity score is determined based on a number of criteria. A familiarity model is formed based on multiple familiarity scores and a text classification framework for the electronic message thread is selected based on the familiarity model.
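The flow described in this abstract (score each pairing of participants, combine the scores into a model, select a framework) can be sketched roughly as follows. Everything concrete here is hypothetical: the abstract does not specify the scoring criteria, how the familiarity model is formed, or which frameworks exist, so the message-count score, the mean-based model, and the labels "informal" and "formal" are placeholders only.

```python
from itertools import combinations


def pairwise_familiarity(participants, score_fn):
    """Compute a familiarity score for every unordered pair of participants."""
    return {pair: score_fn(*pair) for pair in combinations(sorted(participants), 2)}


def select_framework(scores, threshold=0.5):
    """Pick a framework label from the pairwise scores.

    Assumption: the "familiarity model" is reduced to the mean pairwise
    score, compared against a hypothetical threshold.
    """
    mean = sum(scores.values()) / len(scores) if scores else 0.0
    return "informal" if mean >= threshold else "formal"


# Hypothetical criterion: number of prior messages exchanged, capped at 10.
prior_messages = {("ann", "bob"): 8, ("ann", "cy"): 1, ("bob", "cy"): 0}


def score(a, b):
    return min(prior_messages.get((a, b), 0) / 10, 1.0)


scores = pairwise_familiarity(["cy", "ann", "bob"], score)
framework = select_framework(scores)
```

Sorting the participants before pairing keeps each pair in a stable order, so lookups in `prior_messages` do not depend on the order in which the thread lists its participants.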

4.

Invention patent application
LANGUAGE-MODEL-BASED DATA AUGMENTATION METHOD FOR TEXTUAL CLASSIFICATION TASKS WITH LITTLE DATA (status: in force)

Publication (announcement) number: US20210350076A1

Publication (announcement) date: 2021-11-11

Application number: US16870917

Application date: 2020-05-09

Applicant: International Business Machines Corporation

Inventor: Amir Kantor, Ateret Anaby Tavor, Boaz Carmeli, Esther Goldbraich, George Kour, Segev Shlomov, Naama Tepper, Naama Zwerdling

IPC: G06F40/279, G06N20/00, G06N5/04

Abstract: Embodiments of the present systems and methods may provide techniques for augmenting textual data that may be used for textual classification tasks. Embodiments of such techniques may provide the capability to synthesize labeled data to improve text classification tasks. Embodiments may be specifically useful when only a small amount of data is available, and provide improved performance in such cases. For example, in an embodiment, a method implemented in a computer system may comprise a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, and the method may comprise fine-tuning a language model using a training dataset, synthesizing a plurality of samples using the fine-tuned language model, filtering the plurality of synthesized samples, and generating an augmented training dataset comprising the training dataset and the filtered plurality of synthesized sentences.

5.

Granted invention patent
Expected group chat segment duration (status: in force)

Publication (announcement) number: US11057230B2

Publication (announcement) date: 2021-07-06

Application number: US16684949

Application date: 2019-11-15

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Jonathan F. Brunn, Rachael M. H. Dickens, Jonathan Dunne, Ethan A. Geyer, Liam S. Harpur, Bo Jiang, Andrew Penrose, Naama Tepper

IPC: H04L12/18, G06Q10/04, H04L12/58, G06F17/18, G06F3/048, H04L29/06, G06F40/30

Abstract: A method, computer system, and computer program product for calculating a group chat segment duration is provided. The embodiment may include capturing a plurality of group chat messages from a chat message repository. The embodiment may also include determining a probability distribution based on analyzing the captured group chat messages over a time vector. The embodiment may further include calculating a time parameter based on the determined probability distribution. The embodiment may also include calculating a content parameter based on one or more relevant chat topics. The embodiment may further include calculating an attendee parameter based on a plurality of attendees and one or more attendee associations. The embodiment may also include determining a chat duration prediction based on the calculated time parameter, the calculated content parameter, and the calculated attendee parameter.

6.

Invention patent application
FAMILIARITY-BASED TEXT CLASSIFICATION FRAMEWORK SELECTION (status: under examination, published)

Publication (announcement) number: US20190179955A1

Publication (announcement) date: 2019-06-13

Application number: US15840559

Application date: 2017-12-13

Applicant: International Business Machines Corporation

Inventor: Ethan A. Geyer, Jonathan F. Brunn, Jonathan Dunne, Naama Tepper

IPC: G06F17/30, G06N3/08, H04L12/58

Abstract: Familiarity-based text classification framework selection is described. A list of participants in an electronic message thread is selected. For each pairing of participants, a familiarity score is determined based on a number of criteria. A familiarity model is formed based on multiple familiarity scores and a text classification framework for the electronic message thread is selected based on the familiarity model.

7.

Invention patent application
EXPECTED GROUP CHAT SEGMENT DURATION (status: under examination, published)

Publication (announcement) number: US20190103982A1

Publication (announcement) date: 2019-04-04

Application number: US15720265

Application date: 2017-09-29

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Jonathan F. Brunn, Rachael M.H. Dickens, Jonathan Dunne, Ethan A. Geyer, Liam S. Harpur, Bo Jiang, Andrew Penrose, Naama Tepper

IPC: H04L12/18, G06Q10/04, G06F17/18, H04L12/58

Abstract: A method, computer system, and computer program product for calculating a group chat segment duration is provided. The embodiment may include capturing a plurality of group chat messages from a chat message repository. The embodiment may also include determining a probability distribution based on analyzing the captured group chat messages over a time vector. The embodiment may further include calculating a time parameter based on the determined probability distribution. The embodiment may also include calculating a content parameter based on one or more relevant chat topics. The embodiment may further include calculating an attendee parameter based on a plurality of attendees and one or more attendee associations. The embodiment may also include determining a chat duration prediction based on the calculated time parameter, the calculated content parameter, and the calculated attendee parameter.

8.

Granted invention patent
Dataset balancing via quality-controlled sample generation (status: in force)

Publication (announcement) number: US11797516B2

Publication (announcement) date: 2023-10-24

Application number: US17317922

Application date: 2021-05-12

Applicant: International Business Machines Corporation

Inventor: Naama Tepper, Esther Goldbraich, Boaz Carmeli, Naama Zwerdling, George Kour, Ateret Anaby Tavor

IPC: G06F16/23, G06N20/00

CPC classification number: G06F16/2365, G06N20/00

Abstract: Balancing an imbalanced dataset, by: Receiving a balancing policy and the imbalanced dataset. Performing initial adjustment of the imbalanced dataset to comply with the balancing policy, by: oversampling one or more underrepresented classes, and, if one or more of the classes are overrepresented, undersampling them. Operating a generative machine learning model to generate samples for the one or more underrepresented classes, based on the initially-adjusted dataset. Operating a machine learning classification model to label the generated samples with class labels corresponding to the one or more underrepresented classes. Selecting some of the generated samples which, according to the labeling, have a relatively high probability of preserving their class labels. Composing a balanced dataset which complies with the balancing policy and comprises: the samples belonging to the one or more underrepresented classes, the selected generated samples, and an undersampling of the samples belonging to the one or more overrepresented classes.
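The "initial adjustment" step in this abstract (oversampling underrepresented classes and undersampling overrepresented ones) can be sketched with plain random resampling. This is an illustration under stated assumptions, not the patented procedure: the balancing policy is reduced to a single hypothetical `target_per_class` count, and the function name `initial_adjustment` is invented here. The later steps (generative model, classifier labeling, sample selection) are not covered.

```python
import random
from collections import Counter, defaultdict


def initial_adjustment(samples, target_per_class, seed=0):
    """Resample (text, label) pairs so every class has `target_per_class` items.

    Assumption: the balancing policy is one target count for all classes.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for text, label in samples:
        by_class[label].append(text)

    adjusted = []
    for label, texts in by_class.items():
        if len(texts) > target_per_class:
            # Overrepresented class: undersample without replacement.
            chosen = rng.sample(texts, target_per_class)
        else:
            # Underrepresented class: keep all originals, then oversample
            # with replacement to make up the difference.
            chosen = texts + rng.choices(texts, k=target_per_class - len(texts))
        adjusted.extend((text, label) for text in chosen)
    return adjusted


# Hypothetical imbalanced dataset: six samples of class "a", two of class "b".
data = [(f"a{i}", "a") for i in range(6)] + [(f"b{i}", "b") for i in range(2)]
balanced = initial_adjustment(data, target_per_class=4)
counts = Counter(label for _, label in balanced)
```

After the adjustment both classes hold four samples; the oversampled class contains repeated items, which is the gap the generative step in the abstract is meant to address.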

9.

Granted invention patent
Language-model-based data augmentation method for textual classification tasks with little data (status: in force)

Publication (announcement) number: US11526667B2

Publication (announcement) date: 2022-12-13

Application number: US16870917

Application date: 2020-05-09

Applicant: International Business Machines Corporation

Inventor: Amir Kantor, Ateret Anaby Tavor, Boaz Carmeli, Esther Goldbraich, George Kour, Segev Shlomov, Naama Tepper, Naama Zwerdling

IPC: G06F40/279, G06N5/04, G06N20/00

Abstract: Embodiments of the present systems and methods may provide techniques for augmenting textual data that may be used for textual classification tasks. Embodiments of such techniques may provide the capability to synthesize labeled data to improve text classification tasks. Embodiments may be specifically useful when only a small amount of data is available, and provide improved performance in such cases. For example, in an embodiment, a method implemented in a computer system may comprise a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, and the method may comprise fine-tuning a language model using a training dataset, synthesizing a plurality of samples using the fine-tuned language model, filtering the plurality of synthesized samples, and generating an augmented training dataset comprising the training dataset and the filtered plurality of synthesized sentences.

10.

Invention patent application
DATASET BALANCING VIA QUALITY-CONTROLLED SAMPLE GENERATION (status: in force)

Publication (announcement) number: US20220374410A1

Publication (announcement) date: 2022-11-24

Application number: US17317922

Application date: 2021-05-12

Applicant: International Business Machines Corporation

Inventor: Naama Tepper, Esther Goldbraich, Boaz Carmeli, Naama Zwerdling, George Kour, Ateret Anaby Tavor

IPC: G06F16/23, G06N20/00

Abstract: Balancing an imbalanced dataset, by: Receiving a balancing policy and the imbalanced dataset. Performing initial adjustment of the imbalanced dataset to comply with the balancing policy, by: oversampling one or more underrepresented classes, and, if one or more of the classes are overrepresented, undersampling them. Operating a generative machine learning model to generate samples for the one or more underrepresented classes, based on the initially-adjusted dataset. Operating a machine learning classification model to label the generated samples with class labels corresponding to the one or more underrepresented classes. Selecting some of the generated samples which, according to the labeling, have a relatively high probability of preserving their class labels. Composing a balanced dataset which complies with the balancing policy and comprises: the samples belonging to the one or more underrepresented classes, the selected generated samples, and an undersampling of the samples belonging to the one or more overrepresented classes.
