Patent search: ap:("Educational Testing Service") AND inv:"Yao Qian" (page 1)

1.

Granted invention patent
End-to-end neural network based automated speech scoring [In force]

Publication No.: US10937444B1

Publication date: 2021-03-02

Application No.: US16196716

Filing date: 2018-11-20

Applicant: Educational Testing Service

Inventors: David Suendermann-Oeft, Lei Chen, Jidong Tao, Shabnam Ghaffarzadegan, Yao Qian

IPC classification: G10L25/30, G10L15/16, G06F17/18, G06N3/04, G06F40/284, G10L25/60

Abstract: A system for end-to-end automated scoring is disclosed. The system includes a word embedding layer for converting a plurality of ASR outputs into input tensors; a neural network lexical model encoder receiving the input tensors; a neural network acoustic model encoder implementing AM posterior probability, word duration, mean value of pitch and mean value of intensity based on a plurality of cues; and a linear regression module for receiving concatenated encoded features from the neural network lexical model encoder and the neural network acoustic model encoder.
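The final step of the abstract above (concatenating the two encoder outputs and passing them to a linear regression) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: mean pooling stands in for the neural lexical and acoustic encoders, and the function names, weights, and input values are hypothetical rather than taken from the patent.

```python
from typing import List


def mean_pool(vectors: List[List[float]]) -> List[float]:
    """Average a sequence of equal-length vectors into a single vector.

    Stand-in for a neural encoder (assumption, not the patented model).
    """
    if not vectors:
        raise ValueError("need at least one vector")
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def encode_and_score(
    word_embeddings: List[List[float]],
    acoustic_cues: List[List[float]],
    weights: List[float],
    bias: float,
) -> float:
    """Concatenate lexical and acoustic encodings, then apply linear regression."""
    lexical = mean_pool(word_embeddings)   # lexical encoder stand-in
    acoustic = mean_pool(acoustic_cues)    # acoustic encoder stand-in
    features = lexical + acoustic          # list concatenation
    if len(features) != len(weights):
        raise ValueError("weights must match the concatenated feature size")
    return sum(f * w for f, w in zip(features, weights)) + bias


# Hypothetical per-word inputs: 2-d embeddings and four acoustic cues
# (AM posterior, word duration, mean pitch, mean intensity).
score = encode_and_score(
    word_embeddings=[[0.2, 0.4], [0.6, 0.0]],
    acoustic_cues=[[0.9, 0.3, 120.0, 60.0], [0.7, 0.5, 110.0, 62.0]],
    weights=[1.0, 1.0, 0.5, -0.2, 0.01, 0.01],
    bias=0.5,
)
print(score)
```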

2.

Granted invention patent
Exploring ASR-free end-to-end modeling to improve spoken language understanding in a cloud-based dialog system [In force]

Publication No.: US11222627B1

Publication date: 2022-01-11

Application No.: US16197704

Filing date: 2018-11-21

Applicant: Educational Testing Service

Inventors: Yao Qian, Rutuja Ubale, Vikram Ramanarayanan, Patrick Lange, David Suendermann-Oeft, Keelan Evanini, Eugene Tsuprun

IPC classification: G10L15/16, G10L15/18, G10L15/22, G10L15/24, G10L25/24

Abstract: Systems and methods for conducting a simulated conversation with a language learner include determining a first dialog state of the simulated conversation. First audio data corresponding to simulated speech based on the dialog state is transmitted. Second audio data corresponding to a variable-length utterance spoken in response to the simulated speech is received. A fixed-dimension vector is generated based on the variable-length utterance. A semantic label is predicted for the variable-length utterance based on the fixed-dimension vector. A second dialog state of the simulated conversation is determined based on the semantic label, and third audio data corresponding to simulated speech is transmitted based on the second dialog state.
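The flow in the abstract above (variable-length utterance, fixed-dimension vector, semantic label, next dialog state) can be sketched as follows. This is only an illustration under stated assumptions: mean pooling over frames stands in for the learned utterance encoder, a nearest-centroid rule stands in for the label predictor, and all names, labels, and values are hypothetical.

```python
from typing import Dict, List, Tuple


def fixed_vector(frames: List[List[float]]) -> List[float]:
    """Map any number of feature frames to one vector of fixed dimension."""
    if not frames:
        raise ValueError("utterance has no frames")
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def predict_label(vec: List[float], centroids: Dict[str, List[float]]) -> str:
    """Pick the label whose centroid is closest (squared Euclidean distance)."""
    def dist(c: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(vec, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


def next_state(state: str, label: str,
               transitions: Dict[Tuple[str, str], str]) -> str:
    """Look up the next dialog state; stay in place if no rule matches."""
    return transitions.get((state, label), state)


# Hypothetical label centroids and dialog transitions.
centroids = {"accept": [1.0, 0.0], "decline": [0.0, 1.0]}
transitions = {("offer", "accept"): "confirm", ("offer", "decline"): "goodbye"}

utterance = [[1.0, 0.0], [0.8, 0.2], [0.9, 0.1]]  # three frames, two features
label = predict_label(fixed_vector(utterance), centroids)
print(label, next_state("offer", label, transitions))
```

Because the pooled vector has the same dimension regardless of how many frames the utterance contains, the label predictor never needs to see a transcript, which is the sense in which the pipeline is ASR-free.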

3.

Granted invention patent
Computer-implemented systems and methods for evaluating speech dialog system engagement via video [In force]

Publication No.: US10592733B1

Publication date: 2020-03-17

Application No.: US15600206

Filing date: 2017-05-19

Applicant: Educational Testing Service

Inventors: Vikram Ramanarayanan, David Suendermann-Oeft, Patrick Lange, Alexei V. Ivanov, Keelan Evanini, Yao Qian, Eugene Tsuprun, Hillary R. Molloy

IPC classification: G10L15/00, G06K9/00, G10L15/25, G10L15/22, G10L15/02, G10L15/30

Abstract: Systems and methods are provided for a spoken dialog system. Output is provided from a spoken dialog system that determines audio responses to a person based on recognized speech content from the person during a conversation between the person and the spoken dialog system. Video data associated with the person interacting with the spoken dialog system is received. A video engagement metric is derived from the video data, where the video engagement metric indicates a level of the person's engagement with the spoken dialog system.

4.

Granted invention patent
Processor-implemented systems and methods for determining sound quality [In force]

Publication No.: US10283142B1

Publication date: 2019-05-07

Application No.: US15215649

Filing date: 2016-07-21

Applicant: Educational Testing Service

Inventors: Zhou Yu, Vikram Ramanarayanan, David Suendermann-Oeft, Xinhao Wang, Klaus Zechner, Lei Chen, Jidong Tao, Yao Qian

IPC classification: G10L25/30, G10L25/60, G10L25/24, G10L25/93

Abstract: Systems and methods are provided for a processor-implemented method of analyzing quality of sound acquired via a microphone. An input metric is extracted from a sound recording at each of a plurality of time intervals. The input metric is provided at each of the time intervals to a neural network that includes a memory component, where the neural network provides an output metric at each of the time intervals, where the output metric at a particular time interval is based on the input metric at a plurality of time intervals other than the particular time interval using the memory component of the neural network. The output metric is aggregated from each of the time intervals to generate a score indicative of the quality of the sound acquired via the microphone.
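The per-interval processing and aggregation described above can be sketched with a toy recurrence. This is an assumption-laden illustration: a single leaky running state stands in for the neural network's memory component, the mean stands in for the aggregation step, and the `decay` parameter and function names are hypothetical.

```python
from typing import List


def interval_outputs(inputs: List[float], decay: float = 0.5) -> List[float]:
    """Produce one output per time interval from a state that remembers the past.

    The state carries information from earlier intervals, so each output
    depends on more than the current input (stand-in for a memory component).
    """
    state = 0.0
    outputs = []
    for x in inputs:
        state = decay * state + (1.0 - decay) * x
        outputs.append(state)
    return outputs


def quality_score(inputs: List[float], decay: float = 0.5) -> float:
    """Aggregate the per-interval outputs into a single score (here, the mean)."""
    if not inputs:
        raise ValueError("need at least one time interval")
    outputs = interval_outputs(inputs, decay)
    return sum(outputs) / len(outputs)


# Hypothetical input metric sampled at three time intervals.
print(interval_outputs([1.0, 0.0, 1.0]))
print(quality_score([1.0, 0.0, 1.0]))
```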

5.

Granted invention patent
Automatic turn-level language identification for code-switched dialog [In force]

Publication No.: US11238844B1

Publication date: 2022-02-01

Application No.: US16255220

Filing date: 2019-01-23

Applicant: Educational Testing Service

Inventors: Vikram Ramanarayanan, Robert Pugh, Yao Qian, David Suendermann-Oeft

IPC classification: G10L15/05, G10L15/00, G10L25/24, G10L15/22, G10L15/06, G10L15/16

Abstract: Systems and methods for identifying a person's native language and/or non-native language based on code-switched text and/or speech are presented. The systems may be trained using various methods. For example, a language identification system may be trained using one or more code-switched corpora. Text and/or speech features may be extracted from the corpora and used, in combination with a per-word language identity of the text and/or speech, to train at least one machine learner. Code-switched text and/or speech may be received and processed by extracting text and/or speech features. These features may be fed into the at least one machine learner to identify the person's native language.

6.

Granted invention patent
Computer-implemented systems and methods for a crowd source-bootstrapped spoken dialog system [In force]

Publication No.: US10607504B1

Publication date: 2020-03-31

Application No.: US15272903

Filing date: 2016-09-22

Applicant: Educational Testing Service

Inventors: Vikram Ramanarayanan, David Suendermann-Oeft, Patrick Lange, Alexei V. Ivanov, Keelan Evanini, Yao Qian, Zhou Yu

IPC classification: G09B19/04, G10L15/22, G10L15/18, G10L15/06

Abstract: Systems and methods are provided for implementing an educational dialog system. An initial task model is accessed that identifies a plurality of dialog states associated with a task, a language model configured to identify a response meaning associated with a received response, and a language understanding model configured to select a next dialog state based on the identified response meaning. The task is provided to a plurality of persons for training. The task model is updated by revising the language model and the language understanding model based on responses received to prompts of the provided task, and the updated task is provided to a student for development of speaking capabilities.

7.

Granted invention patent
Detection of off-topic spoken responses using machine learning [In force]

Publication No.: US11455999B1

Publication date: 2022-09-27

Application No.: US16844439

Filing date: 2020-04-09

Applicant: Educational Testing Service

Inventors: Xinhao Wang, Su-Youn Yoon, Keelan Evanini, Klaus Zechner, Yao Qian

IPC classification: G10L15/26, G10L15/16, G06N3/08

Abstract: Data is received that encapsulates a spoken response to a prompt text comprising a string of words. Thereafter, the received data is transcribed into a string of words. The string of words is then compared with a prompt so that a similarity grid representation of the comparison can be generated that characterizes a level of similarity between the string of words in the spoken response and the string of words in the prompt text. The grid representation is then scored using at least one machine learning model. The score indicates a likelihood of the spoken response having been off-topic. Data providing the encapsulated score can then be provided. Related apparatus, systems, techniques and articles are also described.
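To make the similarity grid in the abstract above concrete, here is a minimal sketch that builds a response-by-prompt grid of word similarities. The similarity measure (Jaccard overlap of character bigrams) and the crude indicator computed from the grid are assumptions chosen for brevity; the abstract scores the grid with a machine learning model, which is not reproduced here, and the function names are hypothetical.

```python
from typing import List, Set


def bigrams(word: str) -> Set[str]:
    """Character bigrams of a word; one-character words fall back to the word itself."""
    w = word.lower()
    return {w[i:i + 2] for i in range(len(w) - 1)} or {w}


def word_similarity(a: str, b: str) -> float:
    """Jaccard overlap of character bigrams (assumed stand-in for a learned similarity)."""
    sa, sb = bigrams(a), bigrams(b)
    return len(sa & sb) / len(sa | sb)


def similarity_grid(response: List[str], prompt: List[str]) -> List[List[float]]:
    """grid[i][j] is the similarity of response word i to prompt word j."""
    return [[word_similarity(r, p) for p in prompt] for r in response]


def off_topic_indicator(grid: List[List[float]]) -> float:
    """One minus the average best match per response word (placeholder for a model score)."""
    best = [max(row) for row in grid]
    return 1.0 - sum(best) / len(best)


grid = similarity_grid("the cat sat".split(), "the cat".split())
for row in grid:
    print(row)
print(off_topic_indicator(grid))
```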

8.

Granted invention patent
Detection of plagiarized spoken responses using machine learning [In force]

Publication No.: US11417339B1

Publication date: 2022-08-16

Application No.: US16695348

Filing date: 2019-11-26

Applicant: Educational Testing Service

Inventors: Xinhao Wang, Keelan Evanini, Yao Qian, Klaus Zechner

IPC classification: G10L15/26, G10L15/197, G10L25/51, G10L15/16

Abstract: Data is received that encapsulates a spoken response to a test question. Thereafter, the received data is transcribed into a string of words. The string of words is then compared with at least one source string so that a similarity grid representation of the comparison can be generated that characterizes a level of similarity between the string of words and the at least one source string. The grid representation is then scored using at least one machine learning model. The score indicates a likelihood of the spoken response having been plagiarized. Data providing the encapsulated score can then be provided. Related apparatus, systems, techniques and articles are also described.
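One intuition for why a grid comparison helps with plagiarism is that a copied passage shows up as a long diagonal run of matches between the response and a source. The sketch below illustrates that idea only: it uses exact word matches and the longest diagonal run as a hand-written feature, which is an assumption standing in for the machine learning scoring the abstract describes. All function names are hypothetical.

```python
from typing import List


def match_grid(response: List[str], source: List[str]) -> List[List[int]]:
    """grid[i][j] is 1 when response word i equals source word j (case-insensitive)."""
    return [[1 if r.lower() == s.lower() else 0 for s in source] for r in response]


def longest_diagonal_run(grid: List[List[int]]) -> int:
    """Length of the longest run of 1s along any diagonal of the grid.

    This equals the longest sequence of consecutive words shared by both texts.
    """
    cols = len(grid[0]) if grid else 0
    best = 0
    prev = [0] * (cols + 1)
    for row in grid:
        cur = [0] * (cols + 1)
        for j, cell in enumerate(row):
            if cell:
                cur[j + 1] = prev[j] + 1
                best = max(best, cur[j + 1])
        prev = cur
    return best


def max_copied_run(response: List[str], sources: List[List[str]]) -> int:
    """Best diagonal run across all candidate source strings."""
    return max((longest_diagonal_run(match_grid(response, s)) for s in sources), default=0)


response = "i think that practice makes perfect every day".split()
sources = ["as they say practice makes perfect".split(), "time is money".split()]
print(max_copied_run(response, sources))
```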

9.

Granted invention patent
Native language identification with time delay deep neural networks trained separately on native and non-native english corpora [In force]

Publication No.: US10783873B1

Publication date: 2020-09-22

Application No.: US16221980

Filing date: 2018-12-17

Applicant: Educational Testing Service

Inventors: Yao Qian, Keelan Evanini, Patrick Lange, Robert A. Pugh, Rutuja Ubale

IPC classification: G10L15/00, G06N3/04, G10L15/16, G10L25/78, G06N3/08, G09B19/04

Abstract: Systems and methods for identifying a person's native language are presented. A native language identification system, comprising a plurality of artificial neural networks, such as time delay deep neural networks, is provided. Respective artificial neural networks of the plurality of artificial neural networks are trained as universal background models, using separate native language and non-native language corpora. The artificial neural networks may be used to perform voice activity detection and to extract sufficient statistics from the respective language corpora. The artificial neural networks may use the sufficient statistics to estimate respective T-matrices, which may in turn be used to extract respective i-vectors. The artificial neural networks may use i-vectors to generate a multilayer perceptron model, which may be used to identify a person's native language, based on an utterance by the person in his or her non-native language.

10.

Granted invention patent
Computer-implemented systems and methods for speaker recognition using a neural network [In force]

Publication No.: US10008209B1

Publication date: 2018-06-26

Application No.: US15273830

Filing date: 2016-09-23

Applicant: Educational Testing Service

Inventors: Yao Qian, Jidong Tao, David Suendermann-Oeft, Keelan Evanini, Alexei V. Ivanov, Vikram Ramanarayanan

IPC classification: G10L15/00, G10L17/18, G10L17/08, G10L17/20, G10L15/16

CPC classification: G10L17/18, G10L15/005, G10L15/16, G10L17/04, G10L17/08, G10L17/20

Abstract: Systems and methods are provided for voice authentication of a candidate speaker. Training data sets are accessed, where each training data set comprises data associated with a training speech sample of a speaker and a plurality of speaker metrics, where the plurality of speaker metrics include a native language of the speaker. The training data sets are used to train a neural network, where the data associated with each training speech sample is a training input to the neural network, and each of the plurality of speaker metrics is a training output to the neural network. Data associated with a speech sample is provided to the neural network to generate a vector that contains values for the plurality of speaker metrics, and the values contained in the vector are compared to values contained in a reference vector associated with a known person to determine whether the candidate speaker is the known person.
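The last step of the abstract above, comparing a candidate's vector with a known person's reference vector, can be sketched as follows. The abstract does not say how the comparison is made, so cosine similarity and the `threshold` value are assumptions for illustration, and the vectors are hypothetical stand-ins for the network's speaker-metric outputs.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def is_known_speaker(candidate: List[float], reference: List[float],
                     threshold: float = 0.8) -> bool:
    """Accept the candidate when the vectors are similar enough (assumed decision rule)."""
    return cosine_similarity(candidate, reference) >= threshold


# Hypothetical speaker-metric vectors.
reference = [0.9, 0.1, 0.4]
print(is_known_speaker([0.85, 0.15, 0.42], reference))
print(is_known_speaker([0.1, 0.9, 0.0], reference))
```

In practice the threshold would be tuned on held-out data to trade off false acceptances against false rejections.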
