Patent search ap:"L Venkata Subramaniam" Page 2

11.

Granted patent

Systems and methods for discovering synonymous elements using context over multiple similar addresses

Legal status: Lapsed

Publication number: US08682898B2

Publication date: 2014-03-25

Application number: US12771543

Application date: 2010-04-30

Applicant: Sachindra Joshi , Tanveer Faruquie , Hima Prasad Karanam , Marvin Mendelssohn , Mukesh Kumar Mohania , Angel Marie Smith , L Venkata Subramaniam , Girish Venkatachaliah

Inventor: Sachindra Joshi , Tanveer Faruquie , Hima Prasad Karanam , Marvin Mendelssohn , Mukesh Kumar Mohania , Angel Marie Smith , L Venkata Subramaniam , Girish Venkatachaliah

IPC: G06F7/00 , G06F17/00

CPC classification number: G06F17/2735 , G06F17/2795

Abstract: A clustering-based approach to data standardization is provided. Certain embodiments take as input a plurality of addresses, identify one or more features of the addresses, cluster the addresses based on the one or more features, utilize the cluster(s) to provide a data-based context useful in identifying one or more synonyms for elements contained in the address(es), and standardize the address(es) to an acceptable format, with one or more synonyms and/or other elements being added to or taken away from the input address(es) as part of the standardization process.

12.

Granted patent

Method for automatically identifying sentence boundaries in noisy conversational data

Legal status: In force

Publication number: US08364485B2

Publication date: 2013-01-29

Application number: US11845462

Application date: 2007-08-27

Applicant: Tetsuya Nasukawa , Diwakar Punjani , Shourya Roy , L. Venkata Subramaniam , Hironori Takeuchi

Inventor: Tetsuya Nasukawa , Diwakar Punjani , Shourya Roy , L. Venkata Subramaniam , Hironori Takeuchi

IPC: G10L15/04

CPC classification number: G10L15/26

Abstract: Sentence boundaries in noisy conversational transcription data are automatically identified. Noise and transcription symbols are removed, and a training set is formed with sentence boundaries marked based on long silences or on manual markings in the transcribed data. Frequencies of head and tail n-grams that occur at the beginning and ending of sentences are determined from the training set. N-grams that occur a significant number of times in the middle of sentences in relation to their occurrences at the beginning or ending of sentences are filtered out. A boundary is marked before every head n-gram and after every tail n-gram occurring in the conversational data and remaining after filtering. Turns are identified. A boundary is marked after each turn, unless the turn ends with an impermissible tail word or is an incomplete turn. The marked boundaries in the conversational data identify sentence boundaries.
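The head/tail filtering idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: it assumes single-word n-grams (n = 1), skips noise removal and turn handling, and the function names and the `min_count` and `max_mid_ratio` thresholds are hypothetical choices made only for illustration.

```python
from collections import Counter


def learn_boundary_words(sentences, min_count=2, max_mid_ratio=0.5):
    """Learn head and tail words from sentences with known boundaries.

    `sentences` is a list of token lists. A word is kept as a head (or tail)
    only if it starts (or ends) at least `min_count` sentences and its
    mid-sentence count, relative to that, does not exceed `max_mid_ratio`.
    Both thresholds are illustrative assumptions.
    """
    head, tail, mid = Counter(), Counter(), Counter()
    for tokens in sentences:
        if not tokens:
            continue
        head[tokens[0]] += 1
        tail[tokens[-1]] += 1
        mid.update(tokens[1:-1])

    def keep(counts):
        # Filter out words that appear too often in the middle of sentences.
        return {w for w, c in counts.items()
                if c >= min_count and mid[w] / c <= max_mid_ratio}

    return keep(head), keep(tail)


def split_sentences(tokens, heads, tails):
    """Mark a boundary before every head word and after every tail word."""
    out, current = [], []
    for tok in tokens:
        if tok in heads and current:
            out.append(current)
            current = []
        current.append(tok)
        if tok in tails:
            out.append(current)
            current = []
    if current:
        out.append(current)
    return out
```

Calling `learn_boundary_words` on a small hand-segmented training set and passing the resulting sets to `split_sentences` segments an unpunctuated token stream. Lowering `max_mid_ratio` makes the filter stricter about words that also occur mid-sentence.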

13.

Patent application

Cleansing a Database System to Improve Data Quality

Legal status: Under examination (published)

Publication number: US20120150825A1

Publication date: 2012-06-14

Application number: US12966281

Application date: 2010-12-13

Applicant: Snigdha Chaturvedi , Tanveer A. Faruquie , Hima P. Karanam , Mukesh K. Mohania , L. Venkata Subramaniam

Inventor: Snigdha Chaturvedi , Tanveer A. Faruquie , Hima P. Karanam , Mukesh K. Mohania , L. Venkata Subramaniam

IPC: G06F17/30

CPC classification number: G06F16/217 , G06F16/215 , G06F16/2462

Abstract: According to one embodiment of the present invention, a system controls cleansing of data within a database system, and comprises a computer system including at least one processor. The system receives a data set from the database system, and one or more features of the data set are selected for determining values for one or more characteristics of the selected features. The determined values are applied to a data quality estimation model to determine data quality estimates for the data set. Problematic data within the data set are identified based on the data quality estimates, where the cleansing is adjusted to accommodate the identified problematic data. Embodiments of the present invention further include a method and computer program product for controlling cleansing of data within a database system in substantially the same manner described above.

14.

Patent application

METHOD FOR SEGMENTING COMMUNICATION TRANSCRIPTS USING UNSUPERVISED AND SEMI-SUPERVISED TECHNIQUES

Legal status: In force

Publication number: US20090112571A1

Publication date: 2009-04-30

Application number: US12060469

Application date: 2008-04-01

Applicant: Krishna Kummamuru , Deepak S. Padmanabhan , Shourya Roy , L. Venkata Subramaniam

Inventor: Krishna Kummamuru , Deepak S. Padmanabhan , Shourya Roy , L. Venkata Subramaniam

IPC: G06F17/20

CPC classification number: G06F17/3071 , G10L15/04

Abstract: A method is provided for forming discrete segment clusters of one or more sequential sentences from a corpus of communication transcripts of transactional communications that comprises dividing the communication transcripts of the corpus into a first set of sentences spoken by a caller and a second set of sentences spoken by a responder; generating a set of sentence clusters by grouping the first and second sets of sentences according to a measure of lexical similarity using an unsupervised partitional clustering method; generating a collection of sequences of sentence types by assigning a distinct sentence type to each sentence cluster and representing each sentence of each communication transcript of the corpus with the sentence type assigned to the sentence cluster into which the sentence is grouped; and generating a specified number of discrete segment clusters by successively merging sentence clusters according to a proximity-based measure between the sentence types assigned to the sentence clusters within sequences of the collection.

15.

Granted patent

Automatically mining patterns for rule based data standardization systems

Legal status: In force

Publication number: US10163063B2

Publication date: 2018-12-25

Application number: US13414374

Application date: 2012-03-07

Applicant: Snigdha Chaturvedi , Tanveer A Faruquie , Hima P. Karanam , Marvin Mendelssohn , Mukesh K. Mohania , L. Venkata Subramaniam

Inventor: Snigdha Chaturvedi , Tanveer A Faruquie , Hima P. Karanam , Marvin Mendelssohn , Mukesh K. Mohania , L. Venkata Subramaniam

IPC: G06F17/30 , G06Q10/06 , G06Q10/10 , G06Q30/02

Abstract: Computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.

16.

Granted patent

Customer service analysis

Legal status: In force

Publication number: US09118759B2

Publication date: 2015-08-25

Application number: US12855944

Application date: 2010-08-13

Applicant: Raghuram Krishnapuram , L. Venkata Subramaniam

Inventor: Raghuram Krishnapuram , L. Venkata Subramaniam

IPC: H04M3/51

CPC classification number: H04M3/51 , H04M3/5175

Abstract: A method, a system and a computer program product for analyzing customer service quality is disclosed. A plurality of customer call service quality parameters is identified using historical data. The plurality of customer call service quality parameters is quantified and correlated. The customer service quality is analyzed using the plurality of customer call service quality parameters. A repository is generated using the historical data of a plurality of customer calls and a set of pre-defined customer call flow templates. A subset of service quality queries is identified using contextual information of the customer call from the repository of service quality queries. The subset of service quality queries is then interspersed in the customer call. The customer service quality is analyzed using responses to the subset of service quality queries.

17.

Granted patent

Rule set management

Legal status: Lapsed

Publication number: US08700542B2

Publication date: 2014-04-15

Application number: US12969497

Application date: 2010-12-15

Applicant: Mohan N. Dani , Tanveer A. Faruquie , Hima P. Karanam , L. Venkata Subramaniam , Girish Venkatachaliah

Inventor: Mohan N. Dani , Tanveer A. Faruquie , Hima P. Karanam , L. Venkata Subramaniam , Girish Venkatachaliah

IPC: G06F17/30

CPC classification number: G06N5/025

Abstract: Systems, methods, and computer products for optimally managing large rule sets are disclosed. Rule dependencies of rules within a set of rules may be determined as a function of rules execution frequency data generated from applying the rules over a data set. The rules within the set of rules may be clustered into rules clusters based on the determined rule dependencies, in which the rules clusters comprise disjoint subsets of the rules within the set of rules. Cluster frequency data for the rules clusters may be used to arrive at an optimal ordering. Each rule within the set of rules may be assigned a unique identification that may capture an execution order of the rules within the set of rules.

18.

Patent application

Automatically Mining Patterns for Rule Based Data Standardization Systems

Legal status: Under examination (published)

Publication number: US20130238611A1

Publication date: 2013-09-12

Application number: US13415144

Application date: 2012-03-08

Applicant: Snigdha Chaturvedi , Tanveer A. Faruquie , Hima P. Karanam , Marvin Mendelssohn , Mukesh K. Mohania , L. Venkata Subramaniam

Inventor: Snigdha Chaturvedi , Tanveer A. Faruquie , Hima P. Karanam , Marvin Mendelssohn , Mukesh K. Mohania , L. Venkata Subramaniam

IPC: G06F17/30

CPC classification number: G06F17/30705 , G06F17/2775 , G06F17/30675 , G06F2216/03 , G06Q10/06 , G06Q10/10 , G06Q30/02

Abstract: Methods, computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.

19.

Patent application

Automatically Mining Patterns For Rule Based Data Standardization Systems

Legal status: Under examination (published)

Publication number: US20130238610A1

Publication date: 2013-09-12

Application number: US13414374

Application date: 2012-03-07

Applicant: Snigdha Chaturvedi , Tanveer A. Faruquie , Hima P. Karanam , Marvin Mendelssohn , Mukesh K. Mohania , L. Venkata Subramaniam

Inventor: Snigdha Chaturvedi , Tanveer A. Faruquie , Hima P. Karanam , Marvin Mendelssohn , Mukesh K. Mohania , L. Venkata Subramaniam

IPC: G06F17/30

CPC classification number: G06F17/30705 , G06F17/2775 , G06F17/30675 , G06F2216/03 , G06Q10/06 , G06Q10/10 , G06Q30/02 , Y04S10/54

Abstract: Computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.

20.

Patent application

SYSTEMS AND METHODS FOR EFFICIENT DEVELOPMENT OF A RULE-BASED SYSTEM USING CROWD-SOURCING

Legal status: Lapsed

Publication number: US20120221508A1

Publication date: 2012-08-30

Application number: US13036454

Application date: 2011-02-28

Applicant: Snigdha Chaturvedi , Tanveer Afzal Faruquie , L. Venkata Subramaniam

Inventor: Snigdha Chaturvedi , Tanveer Afzal Faruquie , L. Venkata Subramaniam

IPC: G06F7/00 , G06F17/00

CPC classification number: G06F17/00 , G06F17/30 , G06F17/30303

Abstract: Described herein are methods, systems, apparatuses and products for efficient development of a rule-based system. An aspect provides a method including accessing data records; converting said data records to an intermediate form; utilizing intermediate forms to compute similarity scores for said data records; and selecting as an example to be provided for rule making at least one record of said data records having a maximum dissimilarity score indicative of dissimilarity to already considered examples.
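The selection step described in this abstract (picking the record most dissimilar to the examples already considered) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patent's implementation: the "intermediate form" is assumed to be a lowercased token set, similarity is assumed to be Jaccard similarity, and all function names are hypothetical.

```python
def to_intermediate(record):
    """Assumed intermediate form: the set of lowercased whitespace tokens."""
    return frozenset(record.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets (1.0 for two empty sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pick_next_example(records, considered):
    """Return the unconsidered record that is least similar to its closest
    already-considered example, together with its dissimilarity score."""
    forms = [to_intermediate(r) for r in considered]
    best, best_score = None, -1.0
    for rec in records:
        if rec in considered:
            continue
        form = to_intermediate(rec)
        # Dissimilarity to the nearest considered example; 1.0 if none exist.
        nearest = max((jaccard(form, g) for g in forms), default=0.0)
        dissim = 1.0 - nearest
        if dissim > best_score:
            best, best_score = rec, dissim
    return best, best_score
```

Calling `pick_next_example` repeatedly, and appending each returned record to `considered`, yields a sequence of examples that each differ as much as possible from those a rule writer has already seen.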
