SYSTEMS AND METHODS FOR STANDARDIZATION AND DE-DUPLICATION OF ADDRESSES USING TAXONOMY
    31.
    Invention application
    SYSTEMS AND METHODS FOR STANDARDIZATION AND DE-DUPLICATION OF ADDRESSES USING TAXONOMY (In force)

    Publication number: US20120047179A1

    Publication date: 2012-02-23

    Application number: US12859607

    Application date: 2010-08-19

    CPC classification number: G06F17/30961

    Abstract: Systems and associated methods for address standardization, and applications related thereto, are described. Embodiments exploit a common context in a taxonomy and a given address to detect and correct deviations in the address. Embodiments establish a path from the root of the taxonomy to a leaf that could have generated the given address. Given a new address, embodiments use complete addresses, and/or segments or elements thereof, to compute representations of the elements and find the closest matching leaf in the taxonomy. Embodiments then traverse the path to the root node to detect agreement and disagreement between the path and the address entry. The taxonomical structure is thus used to detect, segregate, and standardize the expected fields.
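
    The path-matching idea in the abstract can be illustrated with a small sketch. The toy geographic taxonomy, the node names, and the overlap-based scoring below are illustrative assumptions, not the patent's actual data structures or scoring; the sketch only shows how a closest leaf is found and how its root path is compared against the address elements.

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class Node:
        name: str
        parent: Optional["Node"] = None
        children: list = field(default_factory=list)

        def add(self, name):
            child = Node(name, parent=self)
            self.children.append(child)
            return child

        def path_to_root(self):
            node, path = self, []
            while node is not None:
                path.append(node.name)
                node = node.parent
            return path  # leaf first, root last

    def leaves(node):
        if not node.children:
            yield node
        for child in node.children:
            yield from leaves(child)

    def standardize(address, root):
        tokens = {t.strip().lower() for t in address.split(",")}
        # Closest leaf = the one whose root-to-leaf path shares the most elements
        # with the given address (a crude stand-in for the patent's representations).
        best = max(leaves(root),
                   key=lambda leaf: len(tokens & {p.lower() for p in leaf.path_to_root()}))
        path = best.path_to_root()
        agree = [p for p in path if p.lower() in tokens]
        disagree = sorted(tokens - {p.lower() for p in path})
        return path[::-1], agree, disagree  # standardized root-to-leaf path

    # Toy taxonomy and a noisy address containing a misspelled element.
    root = Node("India")
    blr = root.add("Karnataka").add("Bengaluru")
    blr.add("Koramangala")
    path, agree, disagree = standardize("koramangala, bengaluru, karnatka", root)
    print(path)      # ['India', 'Karnataka', 'Bengaluru', 'Koramangala']
    print(agree)     # elements confirmed by the taxonomy path
    print(disagree)  # ['karnatka'] -> deviation to be corrected/standardized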


    Automatic Taxonomy Enrichment
    32.
    Invention application
    Automatic Taxonomy Enrichment (In force)

    Publication number: US20110078158A1

    Publication date: 2011-03-31

    Application number: US12569117

    Application date: 2009-09-29

    CPC classification number: G06F17/30734 G06F17/2785 G06F17/30737

    Abstract: Techniques for enriching a taxonomy using one or more additional taxonomies are provided. The techniques include receiving two or more taxonomies, wherein the two or more taxonomies comprise a destination taxonomy and one or more additional taxonomies, determining one or more relevant portions of the two or more taxonomies by identifying one or more common terms between the two or more taxonomies, importing one or more relevant portions from the one or more additional taxonomies into the destination taxonomy, and using the one or more imported taxonomy portions to enrich the destination taxonomy.
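
    A minimal sketch of the enrichment step, assuming each taxonomy is a plain dict that maps a term to its child terms. The merge policy below (copy the subtree rooted at every shared term into the destination) and the toy taxonomies are illustrative assumptions rather than the claimed algorithm.

    def subtree_terms(taxonomy, term):
        """All terms reachable from `term` in `taxonomy`."""
        seen, stack = set(), [term]
        while stack:
            t = stack.pop()
            if t in seen:
                continue
            seen.add(t)
            stack.extend(taxonomy.get(t, []))
        return seen

    def enrich(destination, additional):
        enriched = {t: list(children) for t, children in destination.items()}
        common = set(destination) & set(additional)      # shared terms
        for term in common:
            for t in subtree_terms(additional, term):    # relevant portion
                for child in additional.get(t, []):
                    enriched.setdefault(t, [])
                    if child not in enriched[t]:
                        enriched[t].append(child)        # import into the destination
        return enriched

    destination = {"vehicle": ["car", "truck"], "car": []}
    additional  = {"car": ["sedan", "hatchback"], "sedan": ["compact sedan"]}
    print(enrich(destination, additional))
    # {'vehicle': ['car', 'truck'], 'car': ['sedan', 'hatchback'], 'sedan': ['compact sedan']}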


    METHOD FOR AUTOMATICALLY IDENTIFYING SENTENCE BOUNDARIES IN NOISY CONVERSATIONAL DATA
    33.
    Invention application
    METHOD FOR AUTOMATICALLY IDENTIFYING SENTENCE BOUNDARIES IN NOISY CONVERSATIONAL DATA (In force)

    Publication number: US20090063150A1

    Publication date: 2009-03-05

    Application number: US11845462

    Application date: 2007-08-27

    CPC classification number: G10L15/26

    Abstract: Sentence boundaries in noisy conversational transcription data are automatically identified. Noise and transcription symbols are removed, and a training set is formed with sentence boundaries marked based on long silences or on manual markings in the transcribed data. Frequencies of head and tail n-grams that occur at the beginning and ending of sentences are determined from the training set. N-grams that occur a significant number of times in the middle of sentences in relation to their occurrences at the beginning or ending of sentences are filtered out. A boundary is marked before every head n-gram and after every tail n-gram occurring in the conversational data and remaining after filtering. Turns are identified. A boundary is marked after each turn, unless the turn ends with an impermissible tail word or is an incomplete turn. The marked boundaries in the conversational data identify sentence boundaries.
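
    The head/tail n-gram filtering can be sketched with unigrams (n = 1) and a simple frequency-ratio threshold. The threshold value, the toy training sentences, and the <s> boundary marker are illustrative assumptions; the patent also covers turn-based boundaries and larger n-grams, which are omitted here.

    from collections import Counter

    def train_boundary_words(sentences, ratio=2.0):
        """Keep words that begin (head) or end (tail) sentences clearly more
        often than they occur mid-sentence."""
        head, tail, mid = Counter(), Counter(), Counter()
        for sent in sentences:
            words = sent.split()
            if not words:
                continue
            head[words[0]] += 1
            tail[words[-1]] += 1
            mid.update(words[1:-1])
        heads = {w for w, c in head.items() if c >= ratio * mid.get(w, 0)}
        tails = {w for w, c in tail.items() if c >= ratio * mid.get(w, 0)}
        return heads, tails

    def mark_boundaries(tokens, heads, tails):
        """Insert a boundary marker before head words and after tail words."""
        out = []
        for i, tok in enumerate(tokens):
            if tok in heads and i > 0 and out and out[-1] != "<s>":
                out.append("<s>")
            out.append(tok)
            if tok in tails:
                out.append("<s>")
        return out

    # Training sentences would come from long-silence or manual segmentation.
    train = ["okay let us start", "what is the issue", "okay please hold",
             "thank you", "what is your account number"]
    heads, tails = train_boundary_words(train)
    stream = "okay let me check thank you what is the balance".split()
    print(" ".join(mark_boundaries(stream, heads, tails)))
    # okay let me check <s> thank you <s> what is the balance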


    Speech driven lip synthesis using viseme based hidden markov models
    34.
    Invention grant
    Speech driven lip synthesis using viseme based hidden markov models (In force)

    Publication number: US06366885B1

    Publication date: 2002-04-02

    Application number: US09384763

    Application date: 1999-08-27

    CPC classification number: G11B27/10 G10L2021/105 G11B27/031

    Abstract: A method of speech driven lip synthesis which applies viseme based training models to units of visual speech. The audio data is grouped into a smaller number of visually distinct visemes rather than the larger number of phonemes. These visemes then form the basis for a Hidden Markov Model (HMM) state sequence or the output nodes of a neural network. During the training phase, audio and visual features are extracted from the input speech and aligned according to the apparent viseme sequence, with the corresponding audio features used to calculate the HMM state output probabilities or the output of the neural network. During the synthesis phase, the acoustic input is aligned with the most likely viseme HMM sequence (in the case of an HMM based model) or with the nodes of the network (in the case of a neural network based system), which is then used for animation.
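
    The grouping-and-alignment idea can be illustrated with a toy Viterbi decode over a handful of viseme states. The phoneme-to-viseme map, the transition and emission numbers, and the random frame scores below are made-up placeholders; a real system would learn them from audio-visual training data as the abstract describes.

    import numpy as np

    # Collapse many phonemes into a few visually distinct viseme classes (illustrative map).
    PHONE_TO_VISEME = {"p": "bilabial", "b": "bilabial", "m": "bilabial",
                       "f": "labiodental", "v": "labiodental",
                       "aa": "open", "ae": "open", "iy": "spread"}
    VISEMES = sorted(set(PHONE_TO_VISEME.values()))

    def viterbi(log_emit, log_trans, log_init):
        """Most likely viseme state sequence for a frames-by-states emission matrix."""
        T, S = log_emit.shape
        score = np.full((T, S), -np.inf)
        back = np.zeros((T, S), dtype=int)
        score[0] = log_init + log_emit[0]
        for t in range(1, T):
            cand = score[t - 1][:, None] + log_trans   # S x S: previous state -> next state
            back[t] = cand.argmax(axis=0)
            score[t] = cand.max(axis=0) + log_emit[t]
        path = [int(score[-1].argmax())]
        for t in range(T - 1, 0, -1):
            path.append(int(back[t][path[-1]]))
        return path[::-1]

    rng = np.random.default_rng(0)
    n_frames, n_states = 6, len(VISEMES)
    log_emit = np.log(rng.dirichlet(np.ones(n_states), size=n_frames))  # stand-in acoustic scores
    log_trans = np.log(np.full((n_states, n_states), 0.1) + 0.6 * np.eye(n_states))
    log_init = np.log(np.full(n_states, 1.0 / n_states))

    states = viterbi(log_emit, log_trans, log_init)
    print([VISEMES[s] for s in states])   # viseme sequence driving the animation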


    Automatic taxonomy enrichment
    35.
    Invention grant
    Automatic taxonomy enrichment (In force)

    Publication number: US09069848B2

    Publication date: 2015-06-30

    Application number: US12569117

    Application date: 2009-09-29

    CPC classification number: G06F17/30734 G06F17/2785 G06F17/30737

    Abstract: Techniques for enriching a taxonomy using one or more additional taxonomies are provided. The techniques include receiving two or more taxonomies, wherein the two or more taxonomies comprise a destination taxonomy and one or more additional taxonomies, determining one or more relevant portions of the two or more taxonomies by identifying one or more common terms between the two or more taxonomies, importing one or more relevant portions from the one or more additional taxonomies into the destination taxonomy, and using the one or more imported taxonomy portions to enrich the destination taxonomy.


    Automatic selection of blocking column for de-duplication
    36.
    Invention grant
    Automatic selection of blocking column for de-duplication (Lapsed)

    Publication number: US08560505B2

    Publication date: 2013-10-15

    Application number: US13313518

    Application date: 2011-12-07

    CPC classification number: G06F17/30303

    Abstract: Blocking column selection can include determining a first parameter for each column set of a plurality of column sets, wherein the first parameter indicates distribution of blocks in the column set, and determining a second parameter for each column set. The second parameter can indicate block size for the column set. For each column set, a measure of blockability that is dependent upon at least the first parameter and the second parameter can be calculated using a processor. The plurality of column sets can be ranked according to the measures of blockability.
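
    A small sketch of how candidate blocking column sets could be ranked. The two statistics used here (normalized entropy of block sizes as the distribution parameter, mean block size as the size parameter) and the way they are combined are illustrative choices; the claims only require some blockability measure that depends on both parameters.

    import math
    from collections import Counter
    from itertools import combinations

    def blockability(rows, columns):
        """Score a candidate column set by blocking the rows on its values."""
        blocks = Counter(tuple(row[c] for c in columns) for row in rows)
        sizes = list(blocks.values())
        n = sum(sizes)
        # Parameter 1: how evenly records spread across blocks (normalized entropy).
        probs = [s / n for s in sizes]
        entropy = -sum(p * math.log(p) for p in probs if p > 0)
        spread = entropy / math.log(len(sizes)) if len(sizes) > 1 else 0.0
        # Parameter 2: average block size (smaller blocks -> cheaper pairwise comparison).
        mean_size = n / len(sizes)
        return spread / mean_size

    rows = [
        {"zip": "560001", "city": "Bengaluru", "name": "A. Rao"},
        {"zip": "560001", "city": "Bengaluru", "name": "A Rao"},
        {"zip": "110001", "city": "Delhi",     "name": "S. Jain"},
        {"zip": "110002", "city": "Delhi",     "name": "S Jain"},
    ]
    candidates = [cols for r in (1, 2) for cols in combinations(("zip", "city"), r)]
    ranked = sorted(candidates, key=lambda cols: blockability(rows, cols), reverse=True)
    print(ranked)   # column sets ordered from most to least blockable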


    Translingual visual speech synthesis
    38.
    Invention grant
    Translingual visual speech synthesis (Lapsed)

    Publication number: US06813607B1

    Publication date: 2004-11-02

    Application number: US09494582

    Application date: 2000-01-31

    CPC classification number: G10L13/00 G10L15/00 G10L21/06 G10L2021/105

    Abstract: A computer implemented method in a language independent system generates audio-driven facial animation given the speech recognition system for just one language. The method is based on the recognition that once alignment is generated, the mapping and the animation hardly have any language dependency in them. Translingual visual speech synthesis can be achieved if the first step of alignment generation can be made speech independent. Given a speech recognition system for a base language, the method synthesizes video with speech of any novel language as the input.
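
    A schematic sketch of the pipeline described above: only the first step (phonetic alignment of the novel-language audio) uses the base-language recognizer, while the viseme mapping and keyframe generation that follow are language independent. The hard-coded alignment, the phone-to-viseme table, and the file name are placeholders standing in for real recognizer output.

    from dataclasses import dataclass

    @dataclass
    class Segment:
        phone: str
        start: float   # seconds
        end: float

    def align_with_base_recognizer(audio_path):
        """Stand-in for forced alignment of novel-language audio against the
        base-language phone set (the only language-dependent step)."""
        return [Segment("n", 0.00, 0.08), Segment("a", 0.08, 0.25),
                Segment("m", 0.25, 0.34), Segment("e", 0.34, 0.50)]

    # Language-independent steps: phone -> viseme mapping and keyframe generation.
    PHONE_TO_VISEME = {"m": "closed", "b": "closed", "p": "closed",
                       "a": "open", "e": "spread", "n": "tongue-up"}

    def to_keyframes(segments):
        return [(seg.start, PHONE_TO_VISEME.get(seg.phone, "rest")) for seg in segments]

    keyframes = to_keyframes(align_with_base_recognizer("novel_language_utterance.wav"))
    print(keyframes)   # [(0.0, 'tongue-up'), (0.08, 'open'), (0.25, 'closed'), (0.34, 'spread')]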

