Invention Grant
- Patent Title: Method and apparatus for building a language model
- Patent Title (中): 构建语言模型的方法和装置
- Application No.: US14181263
- Application Date: 2014-02-14
- Publication No.: US09396724B2
- Publication Date: 2016-07-19
- Inventor: Feng Rao, Li Lu, Bo Chen, Xiang Zhang, Shuai Yue, Lu Li
- Applicant: Tencent Technology (Shenzhen) Company Limited
- Applicant Address: CN Shenzhen, Guangdong Province
- Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
- Current Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
- Current Assignee Address: CN Shenzhen, Guangdong Province
- Agency: Morgan, Lewis & Bockius LLP
- Priority: CN201310207237 20130529
- Main IPC: G10L15/06
- IPC: G10L15/06; G10L15/183; G10L15/197

Abstract:
A method includes: acquiring data samples; performing categorized sentence mining in the acquired data samples to obtain categorized training samples for multiple categories; building a text classifier based on the categorized training samples; classifying the data samples using the text classifier to obtain a class vocabulary and a corpus for each category; mining the corpus for each category according to the class vocabulary for the category to obtain a respective set of high-frequency language templates; training on the templates for each category to obtain a template-based language model for the category; training on the corpus for each category to obtain a class-based language model for the category; training on the class vocabulary for each category to obtain a lexicon-based language model for the category; and building a speech decoder according to an acoustic model, the class-based language model and the lexicon-based language model for any given field, and the data samples.
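
The template-mining step in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how templates are extracted, so everything here is an assumption made for illustration: the function name `mine_templates`, the `<CLASS>` slot token, the `min_count` threshold, the whitespace tokenization, and the toy corpus. The idea shown is to replace each word found in a category's class vocabulary with a slot token, count the resulting patterns, and keep those that recur.

```python
from collections import Counter


def mine_templates(corpus, class_vocabulary, slot="<CLASS>", min_count=2):
    """Return (template, count) pairs that occur at least min_count times.

    Hypothetical helper: the patent abstract does not define this procedure.
    """
    counts = Counter()
    for sentence in corpus:
        # Assumption: whitespace tokenization; other languages may need a segmenter.
        tokens = sentence.lower().split()
        templated = [slot if tok in class_vocabulary else tok for tok in tokens]
        # Only sentences containing at least one vocabulary word yield a template.
        if slot in templated:
            counts[" ".join(templated)] += 1
    return [(tpl, n) for tpl, n in counts.most_common() if n >= min_count]


# Toy corpus and class vocabulary for a made-up "music" category.
corpus = [
    "play yesterday by beatles",
    "play imagine by lennon",
    "play yesterday now",
    "what time is it",
]
vocab = {"yesterday", "imagine", "beatles", "lennon"}

templates = mine_templates(corpus, vocab)
print(templates)
```

In this toy run, only the pattern shared by the first two sentences reaches the threshold. A template-based language model for the category would then be trained on such patterns rather than on the raw sentences.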
Public/Granted literature
- US20140358539A1 METHOD AND APPARATUS FOR BUILDING A LANGUAGE MODEL, Public/Granted day: 2014-12-04