3. Speech driven lip synthesis using viseme based hidden Markov models
    Invention grant (in force)

    Publication No.: US06366885B1

    Publication Date: 2002-04-02

    Application No.: US09384763

    Filing Date: 1999-08-27

    CPC classification number: G11B27/10 G10L2021/105 G11B27/031

    Abstract: A method of speech driven lip synthesis which applies viseme based training models to units of visual speech. The audio data is grouped into a smaller number of visually distinct visemes rather than the larger number of phonemes. These visemes then form the basis for a Hidden Markov Model (HMM) state sequence or the output nodes of a neural network. During the training phase, audio and visual features are extracted from input speech, which is then aligned according to the apparent viseme sequence, with the corresponding audio features being used to calculate the HMM state output probabilities or the output of the neural network. During the synthesis phase, the acoustic input is aligned with the most likely viseme HMM sequence (in the case of an HMM based model) or with the nodes of the network (in the case of a neural network based system), which is then used for animation.

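    Below is a minimal, illustrative Python sketch of the core idea in this abstract: phonemes are grouped into a smaller set of visemes, and acoustic frames are aligned to the most likely viseme state sequence with a Viterbi pass. The phoneme-to-viseme table, the toy emission scores, and all names are assumptions for illustration, not the patent's implementation.

```python
# Illustrative sketch only (assumed names and data, not the patent's code):
# map phonemes to a smaller viseme set, then Viterbi-align acoustic frames
# to the most likely viseme state sequence for driving lip animation.
import numpy as np

# Hypothetical many-to-one phoneme-to-viseme grouping (illustrative only).
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "aa": "open_vowel", "ae": "open_vowel",
    "iy": "spread_vowel", "ih": "spread_vowel",
}
VISEMES = sorted(set(PHONEME_TO_VISEME.values()))

def viterbi_align(log_emission, log_trans, log_init):
    """Return the most likely viseme-state index for each acoustic frame."""
    T, N = log_emission.shape
    delta = np.full((T, N), -np.inf)
    back = np.zeros((T, N), dtype=int)
    delta[0] = log_init + log_emission[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans   # predecessor score matrix (N, N)
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emission[t]
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path

# Toy example: random stand-ins for per-frame HMM output log-probabilities.
rng = np.random.default_rng(0)
log_emission = rng.normal(size=(50, len(VISEMES)))
log_trans = np.log(np.full((len(VISEMES), len(VISEMES)), 1.0 / len(VISEMES)))
log_init = np.log(np.full(len(VISEMES), 1.0 / len(VISEMES)))

state_path = viterbi_align(log_emission, log_trans, log_init)
print([VISEMES[s] for s in state_path[:10]])         # viseme sequence driving animation
```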

    Method and apparatus for pervasive authentication domains

    Publication No.: US20080141357A1

    Publication Date: 2008-06-12

    Application No.: US11932918

    Filing Date: 2007-10-31

    CPC classification number: H04L63/08 H04L63/0428 H04L63/126

    Abstract: Methods and apparatus for enabling a Pervasive Authentication Domain. A Pervasive Authentication Domain allows many registered Pervasive Devices to obtain authentication credentials from a single Personal Authentication Gateway and to use these credentials on behalf of users to enable additional capabilities for the devices. It provides an arrangement for a user to store credentials in one device (the Personal Authentication Gateway) and then make use of those credentials from many authorized Pervasive Devices without re-entering the credentials. It provides a convenient way for a user to share credentials among many devices, particularly when it is not convenient to enter credentials, as in a smart wristwatch environment. It further provides an arrangement for disabling access to credentials for devices that appear to be far from the Personal Authentication Gateway, as measured by metrics such as communications signal strengths.
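    The sketch below illustrates, under assumed class names and a hypothetical signal-strength threshold, the arrangement described in the abstract: a personal gateway stores credentials, releases them only to registered devices, and withholds them when a device appears too far away. It is not the patent's API.

```python
# Minimal sketch with assumed names: a personal gateway holds credentials,
# hands them to registered pervasive devices, and withholds them when a
# device looks far away, using received signal strength as a proximity proxy.
from dataclasses import dataclass, field

RSSI_THRESHOLD_DBM = -70  # assumed cutoff: weaker signals are treated as "far away"

@dataclass
class PersonalAuthenticationGateway:
    credentials: dict[str, str]                      # service name -> secret/token
    registered_devices: set[str] = field(default_factory=set)

    def register(self, device_id: str) -> None:
        self.registered_devices.add(device_id)

    def request_credential(self, device_id: str, service: str, rssi_dbm: int) -> str | None:
        """Release a credential only to a registered, nearby device."""
        if device_id not in self.registered_devices:
            return None                              # unknown device: refuse
        if rssi_dbm < RSSI_THRESHOLD_DBM:
            return None                              # device appears far from the gateway
        return self.credentials.get(service)

# Usage: a smart wristwatch fetches a token without the user re-entering it.
gateway = PersonalAuthenticationGateway(credentials={"mail": "token-abc123"})
gateway.register("wristwatch-01")
print(gateway.request_credential("wristwatch-01", "mail", rssi_dbm=-55))  # granted
print(gateway.request_credential("wristwatch-01", "mail", rssi_dbm=-90))  # withheld (too far)
```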

6. System and method for microphone activation using visual speech cues
    Invention grant (expired)

    Publication No.: US06754373B1

    Publication Date: 2004-06-22

    Application No.: US09616229

    Filing Date: 2000-07-14

    CPC classification number: G10L25/78 G06K9/00335 G10L15/24

    Abstract: A system for activating a microphone based on visual speech cues, in accordance with the invention, includes a feature tracker coupled to an image acquisition device. The feature tracker tracks features in an image of a user. A region of interest extractor is coupled to the feature tracker. The region of interest extractor extracts a region of interest from the image of the user. A visual speech activity detector is coupled to the region of interest extractor and measures changes in the region of interest to determine if a visual speech cue has been generated by the user. A microphone is turned on by the visual speech activity detector when a visual speech cue has been determined by the visual speech activity detector. Methods for activating a microphone based on visual speech cues are also included.

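    A schematic Python sketch of the pipeline, with assumed thresholds and function names rather than the patent's components: crop a mouth region of interest from each frame, measure frame-to-frame change in that region, and treat large changes as a visual speech cue that switches the microphone on.

```python
# Schematic sketch (assumed names and threshold): extract a mouth region of
# interest, measure frame-to-frame change, and enable the microphone when the
# change suggests visual speech activity.
import numpy as np

MOTION_THRESHOLD = 12.0  # assumed mean absolute pixel difference indicating lip motion

def extract_mouth_roi(frame: np.ndarray, mouth_box: tuple[int, int, int, int]) -> np.ndarray:
    """Crop the mouth region of interest from a grayscale frame."""
    x, y, w, h = mouth_box
    return frame[y:y + h, x:x + w].astype(np.float32)

def visual_speech_activity(prev_roi: np.ndarray, roi: np.ndarray) -> bool:
    """Treat large changes in the mouth region as a visual speech cue."""
    return float(np.mean(np.abs(roi - prev_roi))) > MOTION_THRESHOLD

def run_pipeline(frames, mouth_box):
    """Yield True whenever the microphone should be on for the current frame."""
    prev_roi = None
    for frame in frames:
        roi = extract_mouth_roi(frame, mouth_box)
        mic_on = prev_roi is not None and visual_speech_activity(prev_roi, roi)
        prev_roi = roi
        yield mic_on

# Toy usage with synthetic frames; a real system would take frames from a camera
# and get the mouth box from a face/feature tracker.
rng = np.random.default_rng(1)
frames = [rng.integers(0, 256, size=(120, 160)) for _ in range(5)]
print(list(run_pipeline(frames, mouth_box=(60, 70, 40, 20))))
```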

7. Method and system for multi-client access to a dialog system
    Invention grant (in force)

    Publication No.: US06377913B1

    Publication Date: 2002-04-23

    Application No.: US09374026

    Filing Date: 1999-08-13

    CPC classification number: G06F3/16

    Abstract: In accordance with the invention, a method and system for accessing a dialog system employing a plurality of different clients includes providing a first client device for accessing a conversational system and presenting a command to the conversational system by converting the command to a form understandable to the conversational system. The command is interpreted by employing a mediator, a dialog manager and a multi-modal history to determine the intent of the command based on a context of the command. A second client device is determined based on a predetermined device preference stored in the conversational system. An application is abstracted to perform the command, and the results of the performance of the command are sent to the second client device.

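    The following sketch uses hypothetical class and method names to illustrate the flow in the abstract: a command from one client is interpreted against a multi-modal history to resolve its intent, and the result is routed to a second client chosen from stored device preferences. It is not the patent's architecture.

```python
# Illustrative sketch with hypothetical names: interpret a client command,
# record it in a multi-modal history, and route the result to the device
# chosen from stored preferences.
from dataclasses import dataclass, field

@dataclass
class ConversationalSystem:
    device_preferences: dict[str, str]                   # user -> preferred output device
    history: list[dict] = field(default_factory=list)    # multi-modal interaction history

    def interpret(self, command: str) -> dict:
        """Very small stand-in for the dialog manager's intent resolution."""
        intent = "play_music" if "play" in command.lower() else "unknown"
        self.history.append({"command": command, "intent": intent})
        return {"intent": intent, "command": command}

    def dispatch(self, user: str, command: str) -> str:
        """Interpret a command from one client and send the result to the preferred device."""
        result = self.interpret(command)
        target_device = self.device_preferences.get(user, "originating-client")
        return f"sent {result['intent']} result to {target_device}"

# Usage: a command entered on a phone is executed and the result goes to a TV.
system = ConversationalSystem(device_preferences={"alice": "living-room-tv"})
print(system.dispatch("alice", "Play some jazz"))
```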

9. Method and system for noise-robust speech processing with cochlea filters in an auditory model
    Invention grant (expired)

    Publication No.: US5768474A

    Publication Date: 1998-06-16

    Application No.: US581288

    Filing Date: 1995-12-29

    CPC classification number: G10L15/02 G10L15/20

    Abstract: A method for noise-robust speech processing with cochlea filters within a computer system is disclosed. This invention provides a method for producing feature vectors from a segment of speech that are more robust to variations in the environment due to additive noise. A first output is produced by convolving a speech signal input with spatially dependent impulse responses that resemble cochlea filters. The temporal transient and the spatial transient of the first output are then enhanced by taking a time derivative and a spatial derivative, respectively, of the first output to produce a second output. Next, all the negative values of the second output are replaced with zeros. A feature vector is then obtained from each frame of the second output by a multiple resolution extraction. The parameters for the cochlea filters are finally optimized by minimizing the difference between a feature vector generated from a relatively noise-free speech signal input and a feature vector generated from a noisy speech signal input.

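    A rough numerical sketch of the processing chain in the abstract, with assumed filter shapes, channel counts, and pooling windows rather than the patent's parameters: filter the signal with cochlea-like bandpass filters, take temporal and spatial (cross-channel) derivatives, half-wave rectify, and pool at multiple resolutions into a feature vector. Comparing clean and noisy feature vectors at the end hints at the optimization criterion the abstract mentions.

```python
# Rough sketch (assumed filter shapes and sizes, not the patent's parameters):
# cochlea-like filterbank -> temporal and spatial derivatives -> half-wave
# rectification -> multiple-resolution pooling into a feature vector.
import numpy as np

def cochlea_filterbank(signal: np.ndarray, n_channels: int = 16, length: int = 64) -> np.ndarray:
    """Convolve the signal with simple gammatone-like impulse responses (illustrative)."""
    t = np.arange(length)
    outputs = []
    for c in range(n_channels):
        freq = 0.01 + 0.02 * c                                 # normalized center frequency
        h = (t ** 3) * np.exp(-0.05 * t) * np.cos(2 * np.pi * freq * t)
        outputs.append(np.convolve(signal, h / np.abs(h).sum(), mode="same"))
    return np.stack(outputs)                                   # shape: (channels, samples)

def features(signal: np.ndarray) -> np.ndarray:
    y = cochlea_filterbank(signal)
    dt = np.diff(y, axis=1)                                    # temporal transient
    ds = np.diff(dt, axis=0)                                   # spatial (cross-channel) transient
    rect = np.maximum(ds, 0.0)                                 # replace negative values with zero
    pooled = []
    for w in (32, 128):                                        # two pooling resolutions
        trimmed = rect[:, : (rect.shape[1] // w) * w]
        pooled.append(trimmed.reshape(rect.shape[0], -1, w).mean(axis=2).mean(axis=1))
    return np.concatenate(pooled)                              # one feature vector for the segment

rng = np.random.default_rng(2)
clean = np.sin(2 * np.pi * 0.03 * np.arange(4000))
noisy = clean + 0.3 * rng.normal(size=clean.shape)
# Distance between clean and noisy features: the quantity the filter parameters
# would be tuned to minimize.
print(features(clean).shape, np.linalg.norm(features(clean) - features(noisy)))
```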
