Patent search ap:("AT&T Intellectual Property I Page L.P.") AND inv:"Vivek Kumar Rangarajan Sridhar"

1.

Granted invention patent
Unsupervised topic modeling for short texts (status: active)

Publication number: US10241995B2

Publication date: 2019-03-26

Application number: US15888385

Application date: 2018-02-05

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Vivek Kumar Rangarajan Sridhar

IPC: G06F17/27 , G10L25/30 , H04W4/14

Abstract: Topics are determined for short text messages using an unsupervised topic model. In a training corpus created from a number of short text messages, a vocabulary of words is identified, and for each word a distributed vector representation is obtained by processing windows of the corpus having a fixed length. The corpus is modeled as a Gaussian mixture model in which Gaussian components represent topics. To determine a topic of a sample short text message, a posterior distribution over the corpus topics is obtained using the Gaussian mixture model.
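The final step of this abstract, obtaining a posterior over topics from a Gaussian mixture model, can be illustrated with a minimal sketch. It is not the patented implementation: the diagonal covariances, the rule of averaging per-word posteriors into a message-level posterior, and every name and number below (`EMBEDDINGS`, `WEIGHTS`, `MEANS`, `VARIANCES`, `message_posterior`) are assumptions made for illustration. In practice the word vectors and mixture parameters would be learned from the training corpus rather than written by hand.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def word_posterior(x, weights, means, variances):
    """Posterior p(topic | word vector) under the mixture."""
    logs = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the max before exponentiating for stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def message_posterior(message, embeddings, weights, means, variances):
    """Average the word-level posteriors of in-vocabulary words.

    Averaging is an assumption of this sketch, not a rule from the abstract.
    """
    rows = [
        word_posterior(embeddings[w], weights, means, variances)
        for w in message.lower().split()
        if w in embeddings
    ]
    if not rows:
        return list(weights)  # no known words: fall back to the prior
    return [sum(col) / len(rows) for col in zip(*rows)]


# Hand-made 2-D word vectors and a 2-topic mixture (illustrative values only).
EMBEDDINGS = {
    "goal": [1.0, 0.1],
    "match": [0.9, 0.0],
    "vote": [0.0, 1.0],
    "senate": [0.1, 0.9],
}
WEIGHTS = [0.5, 0.5]
MEANS = [[1.0, 0.0], [0.0, 1.0]]
VARIANCES = [[0.1, 0.1], [0.1, 0.1]]

posterior = message_posterior(
    "great goal in the match", EMBEDDINGS, WEIGHTS, MEANS, VARIANCES
)
best_topic = max(range(len(posterior)), key=posterior.__getitem__)
```

Words missing from the vocabulary ("great", "in", "the") are skipped, which is one simple way to handle the sparse vocabulary of short messages.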

2.

Granted invention patent
Unsupervised topic modeling for short texts (status: active)

Publication number: US09928231B2

Publication date: 2018-03-27

Application number: US15401446

Application date: 2017-01-09

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Vivek Kumar Rangarajan Sridhar

IPC: G06F17/27 , G10L25/30 , H04W4/14

CPC classification number: G06F17/2715 , G06F17/2785 , G10L25/30 , H04W4/14

Abstract: Topics are determined for short text messages using an unsupervised topic model. In a training corpus created from a number of short text messages, a vocabulary of words is identified, and for each word a distributed vector representation is obtained by processing windows of the corpus having a fixed length. The corpus is modeled as a Gaussian mixture model in which Gaussian components represent topics. To determine a topic of a sample short text message, a posterior distribution over the corpus topics is obtained using the Gaussian mixture model.

3.

Granted invention patent
System and method for unsupervised text normalization using distributed representation of words (status: active)

Publication number: US11501066B2

Publication date: 2022-11-15

Application number: US16889609

Application date: 2020-06-01

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Vivek Kumar Rangarajan Sridhar

IPC: G06F40/232 , G06F40/58 , G06Q50/00

Abstract: A system, method and computer-readable storage devices for providing unsupervised normalization of noisy text using distributed representation of words. The system receives, from a social media forum, a word having a non-canonical spelling in a first language. The system determines a context of the word in the social media forum, identifies the word in a vector space model, and selects “n-best” vector paths in the vector space model, where the n-best vector paths are neighbors to the vector space path based on the context and the non-canonical spelling. The system can then select, based on a similarity cost, a best path from the n-best vector paths and identify a word associated with the best path as the canonical version.
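The two-stage selection described in this abstract (pick the n-best neighbors in the vector space, then choose one by a similarity cost) can be sketched roughly as follows. This is a simplified illustration, not the patented method: it compares single word vectors instead of vector paths, and the cost function (cosine distance plus a spelling distance from `difflib.SequenceMatcher`), the toy `VECTORS`, the `LEXICON` of canonical words, and the function names are all assumptions.

```python
import math
from difflib import SequenceMatcher


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def n_best(word, vectors, lexicon, n=3):
    """Return the n canonical words whose vectors are closest to `word`."""
    query = vectors[word]
    scored = [
        (cosine(query, vectors[cand]), cand)
        for cand in lexicon
        if cand in vectors and cand != word
    ]
    return sorted(scored, reverse=True)[:n]


def normalize(word, vectors, lexicon, n=3):
    """Map a noisy word to a canonical form, or return it unchanged."""
    if word in lexicon or word not in vectors:
        return word
    candidates = n_best(word, vectors, lexicon, n)
    if not candidates:
        return word

    def cost(item):
        similarity, cand = item
        spelling = SequenceMatcher(None, word, cand).ratio()
        # Assumed cost: contextual distance plus spelling distance.
        return (1.0 - similarity) + (1.0 - spelling)

    return min(candidates, key=cost)[1]


# Hand-made vectors standing in for embeddings learned from forum text.
VECTORS = {
    "tmrw": [0.9, 0.1, 0.0],
    "tomorrow": [1.0, 0.0, 0.0],
    "today": [0.8, 0.3, 0.0],
    "pizza": [0.0, 0.0, 1.0],
}
LEXICON = {"tomorrow", "today", "pizza"}

canonical = normalize("tmrw", VECTORS, LEXICON)
```

Combining a contextual score with a spelling score keeps a contextually similar but unrelated word (here "today") from winning over the intended canonical form.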

4.

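Invention patent application
Unsupervised Topic Modeling For Short Texts (status: under examination, published)

Publication number: US20190179891A1

Publication date: 2019-06-13

Application number: US16268583

Application date: 2019-02-06

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Vivek Kumar Rangarajan Sridhar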

IPC: G06F17/27 , H04W4/14 , G10L25/30

Abstract: Topics are determined for short text messages using an unsupervised topic model. In a training corpus created from a number of short text messages, a vocabulary of words is identified, and for each word a distributed vector representation is obtained by processing windows of the corpus having a fixed length. The corpus is modeled as a Gaussian mixture model in which Gaussian components represent topics. To determine a topic of a sample short text message, a posterior distribution over the corpus topics is obtained using the Gaussian mixture model.

5.

Granted invention patent
System and method for locating bilingual web sites (status: active)

Publication number: US10114818B2

Publication date: 2018-10-30

Application number: US15294883

Application date: 2016-10-17

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Luciano De Andrade Barbosa , Srinivas Bangalore , Vivek Kumar Rangarajan Sridhar

IPC: G06F17/28 , G06F17/27 , G06F17/30

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for bootstrapping a language translation system. A system configured to practice the method performs a bidirectional web crawl to identify a bilingual website. The system analyzes data on the bilingual website to make a classification decision about whether the root of the bilingual website is an entry point for the bilingual website. The bilingual site can contain pairs of parallel pages. Each pair can include a first website in a first language and a second website in a second language, and a first portion of the first web page corresponds to a second portion of the second web page. Then the system analyzes the first and second web pages to identify corresponding information pairs in the first and second languages, and extracts the corresponding information pairs from the first and second web pages for use in a language translation model.

6.

Invention patent application
Unsupervised Topic Modeling For Short Texts (status: active)

Publication number: US20170116178A1

Publication date: 2017-04-27

Application number: US15401446

Application date: 2017-01-09

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Vivek Kumar Rangarajan Sridhar

IPC: G06F17/27 , H04W4/14 , G10L25/30

CPC classification number: G06F17/2715 , G06F17/2785 , G10L25/30 , H04W4/14

Abstract: Topics are determined for short text messages using an unsupervised topic model. In a training corpus created from a number of short text messages, a vocabulary of words is identified, and for each word a distributed vector representation is obtained by processing windows of the corpus having a fixed length. The corpus is modeled as a Gaussian mixture model in which Gaussian components represent topics. To determine a topic of a sample short text message, a posterior distribution over the corpus topics is obtained using the Gaussian mixture model.

7.

Granted invention patent
Unsupervised topic modeling for short texts (status: active)

Publication number: US11030401B2

Publication date: 2021-06-08

Application number: US16268583

Application date: 2019-02-06

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Vivek Kumar Rangarajan Sridhar

IPC: G06F40/216 , G06F40/30 , G10L25/30 , H04W4/14

Abstract: Topics are determined for short text messages using an unsupervised topic model. In a training corpus created from a number of short text messages, a vocabulary of words is identified, and for each word a distributed vector representation is obtained by processing windows of the corpus having a fixed length. The corpus is modeled as a Gaussian mixture model in which Gaussian components represent topics. To determine a topic of a sample short text message, a posterior distribution over the corpus topics is obtained using the Gaussian mixture model.

8.

Invention patent application
UNSUPERVISED TOPIC MODELING FOR SHORT TEXTS (status: active)

Publication number: US20160110343A1

Publication date: 2016-04-21

Application number: US14519427

Application date: 2014-10-21

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Vivek Kumar Rangarajan Sridhar

IPC: G06F17/27

CPC classification number: G06F17/2715 , G06F17/2785 , G10L25/30 , H04W4/14

Abstract: Topics are determined for short text messages using an unsupervised topic model. In a training corpus created from a number of short text messages, a vocabulary of words is identified, and for each word a distributed vector representation is obtained by processing windows of the corpus having a fixed length. The corpus is modeled as a Gaussian mixture model in which Gaussian components represent topics. To determine a topic of a sample short text message, a posterior distribution over the corpus topics is obtained using the Gaussian mixture model.

9.

Invention patent application
SYSTEM AND METHOD FOR ENRICHING SPOKEN LANGUAGE TRANSLATION WITH DIALOG ACTS (status: active)

Publication number: US20130151232A1

Publication date: 2013-06-13

Application number: US13761549

Application date: 2013-02-07

Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.

Inventor: Srinivas Bangalore , Vivek Kumar Rangarajan Sridhar

IPC: G06F17/28

CPC classification number: G06F17/28 , G06F17/279 , G06F17/289

Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for enriching spoken language translation with dialog acts. The method includes receiving a source speech signal, tagging dialog acts associated with the received source speech signal using a classification model, dialog acts being domain independent descriptions of an intended action a speaker carries out by uttering the source speech signal, producing an enriched hypothesis of the source speech signal incorporating the dialog act tags, and outputting a natural language response of the enriched hypothesis in a target language. Tags can be grouped into sets such as statement, acknowledgement, abandoned, agreement, question, appreciation, and other. The step of producing an enriched translation of the source speech signal uses a dialog act specific translation model containing a phrase translation table.

10.

Granted invention patent
System and method for unsupervised text normalization using distributed representation of words (status: active)

Publication number: US10083167B2

Publication date: 2018-09-25

Application number: US14506156

Application date: 2014-10-03

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Vivek Kumar Rangarajan Sridhar

IPC: G06F17/27 , G06F17/28 , G06Q50/00

CPC classification number: G06F17/273 , G06F17/289 , G06Q50/01

Abstract: A system, method and computer-readable storage devices for providing unsupervised normalization of noisy text using distributed representation of words. The system receives, from a social media forum, a word having a non-canonical spelling in a first language. The system determines a context of the word in the social media forum, identifies the word in a vector space model, and selects “n-best” vector paths in the vector space model, where the n-best vector paths are neighbors to the vector space path based on the context and the non-canonical spelling. The system can then select, based on a similarity cost, a best path from the n-best vector paths and identify a word associated with the best path as the canonical version.
