Patent search: ap:("Jianfeng Gao" OR "William B. Dolan" OR "Hsiao-Wuen Hon" OR "Ming Zhou") AND inv:"Jianfeng Gao" (page 6)

51.

Granted invention patent
Context modeling architecture and framework (status: in force)

Publication number: US07783588B2

Publication date: 2010-08-24

Application number: US11253866

Filing date: 2005-10-19

Applicants: William D. Ramsey, Jianfeng Gao, Sanjeev Katariya

Inventors: William D. Ramsey, Jianfeng Gao, Sanjeev Katariya

IPC classification: G06F17/00, G06N5/02

CPC classification: G06N99/005, G06F9/453

Abstract: A context modeling architecture that includes a context representation portion, which is adapted to represent context as features, is provided. The features are specifiable at runtime of an application including the context representation portion.

52.

Granted invention patent
Automatic evaluation of summaries (status: in force)

Publication number: US07725442B2

Publication date: 2010-05-25

Application number: US11672038

Filing date: 2007-02-06

Applicants: Chin-Yew Lin, Jianfeng Gao, Guihong Cao, Jian-Yun Nie

Inventors: Chin-Yew Lin, Jianfeng Gao, Guihong Cao, Jian-Yun Nie

IPC classification: G06F7/00, G06F17/30

CPC classification: G06F17/30719

Abstract: A probability distribution for a reference summary of a document is determined. The probability distribution for the reference summary is then used to generate a score for a machine-generated summary of the document.
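The abstract of this record does not say which distribution or scoring function is used. As a minimal sketch only, the Python below assumes a smoothed unigram distribution estimated from the reference summary and scores a candidate by its average token log-probability under that distribution. The function names, the whitespace tokenization, and the add-alpha smoothing are illustrative assumptions, not details taken from the patent.

```python
import math
from collections import Counter


def unigram_distribution(text, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution of `text` over `vocab` (assumed model)."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def score_summary(reference, candidate, alpha=1.0):
    """Average log-probability of the candidate's tokens under the reference distribution."""
    cand_tokens = candidate.lower().split()
    if not cand_tokens:
        return float("-inf")
    # Shared vocabulary so that unseen candidate words still get smoothed mass.
    vocab = set(reference.lower().split()) | set(cand_tokens)
    dist = unigram_distribution(reference, vocab, alpha)
    return sum(math.log(dist[w]) for w in cand_tokens) / len(cand_tokens)


reference = "the cat sat on the mat"
close_score = score_summary(reference, "the cat sat")
far_score = score_summary(reference, "dogs bark loudly")
```

Under these assumptions a candidate that reuses the reference vocabulary scores higher (closer to zero) than one that shares no words with it.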

53.

Invention patent application
TRAINING A SEARCH RESULT RANKER WITH AUTOMATICALLY-GENERATED SAMPLES (status: in force)

Publication number: US20100082510A1

Publication date: 2010-04-01

Application number: US12243359

Filing date: 2008-10-01

Applicants: Jianfeng Gao, Kuansan Wang

Inventors: Jianfeng Gao, Kuansan Wang

IPC classification: G06F15/18, G06F7/06, G06F17/30

CPC classification: G06N99/005, G06F17/3053

Abstract: A search result ranker may be trained with automatically-generated samples. In an example embodiment, user interests are inferred from user interactions with search results for a particular query so as to determine respective relevance scores associated with respective query-identifier pairs of the search results. Query-identifier-relevance score triplets are formulated from the respective relevance scores associated with the respective query-identifier pairs. The query-identifier-relevance score triplets are submitted as training samples to a search result ranker. The search result ranker is trained as a learning machine with multiple training samples of the query-identifier-relevance score triplets.
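To make the triplet formulation concrete, here is a small Python sketch. It assumes that the interaction log is a list of (query, identifier, clicked) tuples and that click-through rate serves as the inferred relevance score. Both the log layout and the click-through heuristic are assumptions for illustration; the abstract does not specify how user interest is inferred.

```python
from collections import defaultdict


def build_triplets(interactions):
    """Turn (query, identifier, clicked) log rows into (query, identifier, relevance) triplets.

    Relevance is the fraction of impressions that were clicked (an assumed proxy).
    """
    shown = defaultdict(int)
    clicked_n = defaultdict(int)
    for query, identifier, clicked in interactions:
        shown[(query, identifier)] += 1
        if clicked:
            clicked_n[(query, identifier)] += 1
    return [(q, i, clicked_n[(q, i)] / n) for (q, i), n in sorted(shown.items())]


# Hypothetical log: two impressions of url_a (one click), one of url_b (no click).
log = [
    ("jaguar", "url_a", True),
    ("jaguar", "url_a", False),
    ("jaguar", "url_b", False),
]
triplets = build_triplets(log)
```

The resulting triplets could then be passed as training samples to whatever learning-to-rank model is in use; that training step is left out here because the abstract does not name a learner.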

54.

Invention patent application
RANKER SELECTION FOR STATISTICAL NATURAL LANGUAGE PROCESSING (status: in force)

Publication number: US20090125501A1

Publication date: 2009-05-14

Application number: US11938811

Filing date: 2007-11-13

Applicants: Jianfeng Gao, Galen Andrew, Mark Johnson, Kristina Toutanova

Inventors: Jianfeng Gao, Galen Andrew, Mark Johnson, Kristina Toutanova

IPC classification: G06F7/10

CPC classification: G06F17/2715

Abstract: Systems and methods for selecting a ranker for statistical natural language processing are provided. One disclosed system includes a computer program configured to be executed on a computing device, the computer program comprising a data store including reference performance data for a plurality of candidate rankers, the reference performance data being calculated based on a processing of test data by each of the plurality of candidate rankers. The system may further include a ranker selector configured to receive a statistical natural language processing task and a performance target, and determine a selected ranker from the plurality of candidate rankers based on the statistical natural language processing task, the performance target, and the reference performance data.
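The selector described in this abstract can be sketched as a lookup over stored benchmark rows. In the Python below, the record fields (accuracy, seconds per item), the ranker names, and the rule "fastest ranker that meets a minimum accuracy" are all hypothetical choices made for illustration; the abstract only states that the task, a performance target, and reference performance data drive the selection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferencePerformance:
    """One row of reference performance data (field names are assumptions)."""
    ranker: str
    task: str
    accuracy: float
    seconds_per_item: float


def select_ranker(store, task, min_accuracy):
    """Return the fastest ranker for `task` whose accuracy meets the target, or None."""
    eligible = [r for r in store if r.task == task and r.accuracy >= min_accuracy]
    if not eligible:
        return None
    return min(eligible, key=lambda r: r.seconds_per_item).ranker


# Hypothetical reference data for two tasks.
store = [
    ReferencePerformance("ranker_a", "parse_reranking", 0.91, 0.020),
    ReferencePerformance("ranker_b", "parse_reranking", 0.89, 0.004),
    ReferencePerformance("ranker_c", "parse_reranking", 0.93, 0.050),
    ReferencePerformance("ranker_a", "word_segmentation", 0.97, 0.010),
]
choice = select_ranker(store, "parse_reranking", min_accuracy=0.90)
```

A different performance target (for example a latency budget instead of an accuracy floor) would only change the filter and the key function.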

55.

Invention patent application
LIMITED-MEMORY QUASI-NEWTON OPTIMIZATION ALGORITHM FOR L1-REGULARIZED OBJECTIVES (status: in force)

Publication number: US20090106173A1

Publication date: 2009-04-23

Application number: US11874199

Filing date: 2007-10-17

Applicants: Galen Andrew, Jianfeng Gao

Inventors: Galen Andrew, Jianfeng Gao

IPC classification: G06F15/18

CPC classification: G06N99/005

Abstract: An algorithm is provided that employs modified methods developed for optimizing differentiable functions but that can also handle the special non-differentiabilities that occur with L1-regularization. The algorithm is a modification of the L-BFGS (limited-memory Broyden-Fletcher-Goldfarb-Shanno) quasi-Newton algorithm that handles the discontinuity of the gradient using a procedure that chooses a search direction at each iteration and modifies the line search procedure. The algorithm includes an iterative optimization procedure in which each iteration approximately minimizes the objective over a constrained region of the space on which the objective is differentiable (in the case of L1-regularization, a given orthant), models the second-order behavior of the objective by considering the loss component alone, uses a "line search" at each iteration that projects search points back onto the chosen orthant, and determines when to stop the line search.
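The orthant projection inside the line search is the part of this abstract that is easiest to show in code. The Python sketch below implements only that step: trial points are projected back onto the chosen orthant, and the step is halved until the L1-regularized objective decreases. It is not L-BFGS. The search direction is supplied by the caller, and the plain "any decrease" acceptance test is a simplifying assumption in place of whatever stopping rule the application actually claims.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def project_onto_orthant(point, orthant):
    """Zero every coordinate whose sign disagrees with the orthant's sign vector."""
    return [p if sign(p) == s else 0.0 for p, s in zip(point, orthant)]


def l1_objective(loss, x, c):
    """Smooth loss plus the L1 penalty c * ||x||_1."""
    return loss(x) + c * sum(abs(v) for v in x)


def projected_line_search(loss, x, direction, orthant, c,
                          step=1.0, shrink=0.5, max_tries=30):
    """Backtrack until the projected trial point lowers the regularized objective."""
    base = l1_objective(loss, x, c)
    for _ in range(max_tries):
        moved = [xi + step * di for xi, di in zip(x, direction)]
        trial = project_onto_orthant(moved, orthant)
        if l1_objective(loss, trial, c) < base:
            return trial
        step *= shrink
    return x  # no improving step found


# Toy quadratic loss; the second coordinate tries to cross zero and is clipped.
def toy_loss(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 0.1) ** 2


result = projected_line_search(toy_loss, [0.5, 0.2], [0.5, -1.0], [1, 1], c=0.1)
```

In the toy run the second coordinate would leave the positive orthant, so the projection sets it to exactly zero, which is how this style of method produces sparse solutions.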

56.

Invention patent application
Finite-state model for processing web queries (status: lapsed)

Publication number: US20080183673A1

Publication date: 2008-07-31

Application number: US11698011

Filing date: 2007-01-25

Applicants: Jianfeng Gao, Qi Yao, Ji-Rong Wen

Inventors: Jianfeng Gao, Qi Yao, Ji-Rong Wen

IPC classification: G06F17/30

CPC classification: G06F17/30864

Abstract: A method of creating an index of web queries is discussed. The method includes receiving a first query representative of one or more symbolic characters and assigning the first query to a first data structure. A first text string representative of the first query is created and assigned to a second data structure. The first and second data structures are stored on a tangible computer readable medium.

57.

Granted invention patent
System and method for joint optimization of language model performance and size (status: in force)

Publication number: US07275029B1

Publication date: 2007-09-25

Application number: US09607786

Filing date: 2000-06-30

Applicants: Jianfeng Gao, Kai-Fu Lee, Mingjing Li, Hai-Feng Wang, Dong-Feng Cai, Lee-Feng Chien

Inventors: Jianfeng Gao, Kai-Fu Lee, Mingjing Li, Hai-Feng Wang, Dong-Feng Cai, Lee-Feng Chien

IPC classification: G06F17/27

CPC classification: G06F17/2735, G06F17/274, G06F17/2818

Abstract: A method for the joint optimization of language model performance and size is presented, comprising developing a language model from a tuning set of information, segmenting at least a subset of a received textual corpus, calculating a perplexity value for each segment, and refining the language model with one or more segments of the received corpus based, at least in part, on the calculated perplexity value for the one or more segments.
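The perplexity-based selection step can be illustrated with a toy model. The Python sketch below assumes an add-alpha unigram model built from the tuning set, fixed-length segments, and a simple perplexity threshold for deciding which segments are kept for refinement. None of these specifics come from the abstract, which leaves the model order, the segmentation method, and the selection rule open.

```python
import math
from collections import Counter


def train_unigram(tokens, vocab_size, alpha=1.0):
    """Return a probability function for an add-alpha smoothed unigram model."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * vocab_size
    return lambda w: (counts[w] + alpha) / total


def perplexity(prob, tokens):
    """Perplexity of a non-empty token sequence under the probability function."""
    log_sum = sum(math.log(prob(w)) for w in tokens)
    return math.exp(-log_sum / len(tokens))


def select_segments(tuning_tokens, corpus_tokens, segment_len, max_perplexity):
    """Split the corpus into fixed-length segments and keep the low-perplexity ones."""
    vocab_size = len(set(tuning_tokens) | set(corpus_tokens))
    prob = train_unigram(tuning_tokens, vocab_size)
    segments = [corpus_tokens[i:i + segment_len]
                for i in range(0, len(corpus_tokens), segment_len)]
    return [s for s in segments if perplexity(prob, s) <= max_perplexity]


tuning = "a b a b a b".split()
corpus = "a b a b x y z w".split()
kept = select_segments(tuning, corpus, segment_len=4, max_perplexity=5.0)
```

The kept segments would then be added to the training data and the model re-estimated; repeating that loop is one plausible reading of "refining", stated here as an assumption.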

58.

Granted invention patent
Method and apparatus for distribution-based language model adaptation (status: in force)

Publication number: US07254529B2

Publication date: 2007-08-07

Application number: US11225543

Filing date: 2005-09-13

Applicants: Jianfeng Gao, Mingjing Li

Inventors: Jianfeng Gao, Mingjing Li

IPC classification: G06F17/27, G06F17/28, G10L15/00

CPC classification: G06F17/2715, G10L15/065, G10L15/1815

Abstract: A method and apparatus are provided for adapting a language model to a task-specific domain. Under the method and apparatus, the relative frequency of n-grams in a small training set (i.e., task-specific training data set) and the relative frequency of n-grams in a large training set (i.e., out-of-domain training data set) are used to weight a distribution count of n-grams in the large training set. The weighted distributions are then used to form a modified language model by identifying probabilities for n-grams from the weighted distributions.
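The abstract says that the two relative frequencies are used to weight counts from the large set, but it does not give the weighting function. The Python sketch below therefore uses a hypothetical weight, the ratio of smoothed in-domain relative frequency to out-of-domain relative frequency raised to a power `beta`, purely to show the shape of the computation. It also normalizes to joint n-gram probabilities rather than the conditional probabilities a real language model would use.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def adapted_probabilities(small_tokens, large_tokens, n=2, beta=0.5, alpha=1.0):
    """Reweight large-set n-gram counts toward the small (task-specific) set.

    The weight (rel_small / rel_large) ** beta is an illustrative assumption.
    """
    small = Counter(ngrams(small_tokens, n))
    large = Counter(ngrams(large_tokens, n))
    small_total = sum(small.values()) + alpha * len(large)
    large_total = sum(large.values())
    weighted = {}
    for gram, count in large.items():
        rel_small = (small[gram] + alpha) / small_total  # smoothed in-domain frequency
        rel_large = count / large_total                  # out-of-domain frequency
        weighted[gram] = count * (rel_small / rel_large) ** beta
    norm = sum(weighted.values())
    return {gram: w / norm for gram, w in weighted.items()}


small_set = "book a flight".split()
large_set = "book a flight read a book read a book".split()
probs = adapted_probabilities(small_set, large_set)
```

With these toy inputs, bigrams that also occur in the task-specific set gain probability mass relative to their raw frequency in the large set, which is the qualitative effect the abstract describes.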

59.

Granted invention patent
Method and apparatus for compressing asymmetric clustering language models (status: in force)

Publication number: US07231349B2

Publication date: 2007-06-12

Application number: US10448498

Filing date: 2003-05-30

Applicants: Mu Li, Jianfeng Gao

Inventors: Mu Li, Jianfeng Gao

IPC classification: G01L15/06

CPC classification: G10L15/197, G06F17/277, G10L15/285

Abstract: A method and data structure are provided for efficiently storing asymmetric clustering models. The models are stored by storing a first level record for a word identifier and two second level records, one for a word identifier and one for a cluster identifier. An index to the second level word record and an index to the second level cluster record are stored in the first level record. Many of the records in the data structure include both cluster sub-model parameters and word sub-model parameters.

60.

Invention patent application
Context modeling architecture and framework (status: in force)

Publication number: US20070112546A1

Publication date: 2007-05-17

Application number: US11253866

Filing date: 2005-10-19

Applicants: William Ramsey, Jianfeng Gao, Sanjeev Katariya

Inventors: William Ramsey, Jianfeng Gao, Sanjeev Katariya

IPC classification: G06F17/10

CPC classification: G06N99/005, G06F9/453

Abstract: A context modeling architecture that includes a context representation portion, which is adapted to represent context as features, is provided. The features are specifiable at runtime of an application including the context representation portion.
