Computing numeric representations of words in a high-dimensional space

    Publication No.: US09740680B1

    Publication date: 2017-08-22

    Application No.: US14715421

    Filing date: 2015-05-18

    Applicant: Google Inc.

    IPC classification: G10L15/00 G06F17/27 G10L15/06

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtaining trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.
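The steps named in this abstract (train a classifier together with an embedding function on word sequences, then map every vocabulary word to its vector) can be illustrated with a toy sketch. The abstract does not say what the classifier predicts, so this example assumes a skip-gram-style setup in which a softmax classifier predicts a neighboring word from the embedding of the current word. The function name `train_embeddings` and the parameters `dim`, `window`, `epochs`, and `lr` are hypothetical and chosen only for illustration.

```python
import math
import random


def train_embeddings(sequences, dim=8, window=1, epochs=50, lr=0.1, seed=0):
    """Jointly train an embedding table and a softmax classifier (toy sketch)."""
    rng = random.Random(seed)
    vocab = sorted({w for seq in sequences for w in seq})
    index = {w: i for i, w in enumerate(vocab)}
    size = len(vocab)

    # Embedding function parameters: one dim-sized vector per vocabulary word.
    emb = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(size)]
    # Classifier parameters: one weight vector per candidate output word.
    out = [[0.0] * dim for _ in range(size)]

    # Assumption: training examples are (word, neighboring word) pairs.
    pairs = []
    for seq in sequences:
        ids = [index[w] for w in seq]
        for i, center in enumerate(ids):
            for j in range(max(0, i - window), min(len(ids), i + window + 1)):
                if j != i:
                    pairs.append((center, ids[j]))

    for _ in range(epochs):
        rng.shuffle(pairs)
        for center, target in pairs:
            h = emb[center]
            scores = [sum(w * x for w, x in zip(out[k], h)) for k in range(size)]
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            grad_h = [0.0] * dim
            for k in range(size):
                # Gradient of the cross-entropy loss with respect to score k.
                err = exps[k] / total - (1.0 if k == target else 0.0)
                for d in range(dim):
                    grad_h[d] += err * out[k][d]
                    out[k][d] -= lr * err * h[d]
            for d in range(dim):
                h[d] -= lr * grad_h[d]

    # Associate each vocabulary word with its trained numeric representation.
    return {w: list(emb[index[w]]) for w in vocab}


corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
vectors = train_embeddings(corpus, dim=4)
```

After training, `vectors` maps each word in the toy vocabulary to a 4-dimensional list of floats. A real system would use a far larger corpus and an approximation to the full softmax.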

    Scoring Concept Terms Using a Deep Network
    4.
    Type: Invention application
    Legal status: Active

    Publication No.: US20140279773A1

    Publication date: 2014-09-18

    Application No.: US13802184

    Filing date: 2013-03-13

    Applicant: Google Inc.

    IPC classification: G06N3/08

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scoring concept terms using a deep network. One of the methods includes receiving an input comprising a plurality of features of a resource, wherein each feature is a value of a respective attribute of the resource; processing each of the features using a respective embedding function to generate one or more numeric values; processing the numeric values to generate an alternative representation of the features of the resource, wherein processing the floating point values comprises applying one or more non-linear transformations to the floating point values; and processing the alternative representation of the input to generate a respective relevance score for each concept term in a pre-determined set of concept terms, wherein each of the respective relevance scores measures a predicted relevance of the corresponding concept term to the resource.
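The forward pass described in this abstract can be sketched as follows: each resource attribute goes through its own embedding function, the resulting numbers pass through a non-linear layer, and a final layer produces one relevance score per concept term. Everything concrete here is an assumption made for illustration: the lookup-table embeddings, the single `tanh` layer, the sigmoid scores, the random untrained weights, and the helper names `make_lookup_embedding`, `dense`, and `score_concept_terms`.

```python
import math
import random


def make_lookup_embedding(values, dim, rng):
    """Return an embedding function backed by a (here random) lookup table."""
    table = {v: [rng.uniform(-1, 1) for _ in range(dim)] for v in values}
    zero = [0.0] * dim
    return lambda value: table.get(value, zero)  # unknown values map to zeros


def dense(inputs, weights, biases):
    """Affine layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def score_concept_terms(features, embedders, hidden, output, concept_terms):
    # 1. Embed each attribute value with its own embedding function.
    values = []
    for attribute in sorted(embedders):  # fixed order keeps the layout stable
        values.extend(embedders[attribute](features.get(attribute)))
    # 2. A non-linear transformation yields the alternative representation.
    alternative = [math.tanh(v) for v in dense(values, *hidden)]
    # 3. One relevance score in (0, 1) per concept term.
    logits = dense(alternative, *output)
    return {term: 1.0 / (1.0 + math.exp(-z))
            for term, z in zip(concept_terms, logits)}


rng = random.Random(0)
embedders = {
    "domain": make_lookup_embedding(["example.com", "example.org"], 2, rng),
    "language": make_lookup_embedding(["en", "fr"], 2, rng),
}
concept_terms = ["sports", "cooking", "travel"]
hidden = ([[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)], [0.0] * 3)
output = ([[rng.uniform(-1, 1) for _ in range(3)] for _ in concept_terms],
          [0.0] * len(concept_terms))

scores = score_concept_terms({"domain": "example.com", "language": "en"},
                             embedders, hidden, output, concept_terms)
```

With untrained weights the scores carry no meaning; the sketch only shows how data flows from resource features to per-term scores.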

    Generating labeled images
    5.
    Type: Granted invention patent
    Legal status: Active

    Publication No.: US09256807B1

    Publication date: 2016-02-09

    Application No.: US13803642

    Filing date: 2013-03-14

    Applicant: Google Inc.

    IPC classification: G06K9/62 G06F17/30

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating labeled images. One of the methods includes selecting a plurality of candidate videos from videos identified in a response to a search query derived from a label for an object category; selecting one or more initial frames from each of the candidate videos; detecting one or more initial images of objects in the object category in the initial frames; for each initial frame including an initial image of an object in the object category, tracking the object through surrounding frames to identify additional images of the object; and selecting one or more images from the one or more initial images and one or more additional images as database images of objects belonging to the object category.
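The pipeline in this abstract (search for videos, sample initial frames, detect the object, track it through surrounding frames, select images) can be outlined in a few lines. This is only a control-flow sketch: `search_videos`, `detect`, and `track` are hypothetical callables supplied by the caller, the sampling parameters are invented, and the final selection step is reduced to simple truncation where a real system would presumably apply some quality criterion.

```python
def generate_labeled_images(label, search_videos, detect, track,
                            max_videos=3, frame_step=10, radius=2,
                            max_images=20):
    """Collect (frame, bounding box, label) triples for one object category."""
    # Candidate videos come from a search query derived from the label.
    candidates = search_videos(label)[:max_videos]
    images = []
    for frames in candidates:
        # Initial frames: every frame_step-th frame of the video.
        for i in range(0, len(frames), frame_step):
            box = detect(frames[i], label)
            if box is None:
                continue  # no object of this category in the initial frame
            images.append((frames[i], box, label))
            # Track the detected object through the surrounding frames.
            start = max(0, i - radius)
            stop = min(len(frames), i + radius + 1)
            for j in range(start, stop):
                if j == i:
                    continue
                tracked = track(frames[i], box, frames[j])
                if tracked is not None:
                    images.append((frames[j], tracked, label))
    # Assumption: selection is plain truncation in this sketch.
    return images[:max_images]


# Stand-in components so the sketch runs without any video library.
def fake_search(label):
    return [[f"{label}-video0-frame{k}" for k in range(5)]]


def fake_detect(frame, label):
    return (0, 0, 10, 10) if frame.endswith("frame0") else None


def fake_track(source_frame, box, target_frame):
    return box


result = generate_labeled_images("cat", fake_search, fake_detect, fake_track)
```

Here `result` holds the one detected initial image plus the images tracked into the two following frames, each paired with the label `"cat"`.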

    Using embedding functions with a deep network
    6.
    Type: Granted invention patent
    Legal status: Active

    Publication No.: US09141916B1

    Publication date: 2015-09-22

    Application No.: US13803779

    Filing date: 2013-03-14

    Applicant: Google Inc.

    IPC classification: G06F15/18 G06N99/00

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using embedding functions with a deep network. One of the methods includes receiving an input comprising a plurality of features, wherein each of the features is of a different feature type; processing each of the features using a respective embedding function to generate one or more numeric values, wherein each of the embedding functions operates independently of each other embedding function, and wherein each of the embedding functions is used for features of a respective feature type; processing the numeric values using a deep network to generate a first alternative representation of the input, wherein the deep network is a machine learning model composed of a plurality of levels of non-linear operations; and processing the first alternative representation of the input using a logistic regression classifier to predict a label for the input.
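A minimal sketch of the architecture in this abstract: one independent embedding function per feature type, a stack of non-linear layers, and a logistic regression classifier on top. The feature types (`country`, `age`, `query`), the embedding tables, the ReLU layers, and the random untrained weights are all assumptions made for illustration and are not taken from the patent.

```python
import math
import random

rng = random.Random(0)

# One embedding function per feature type; none depends on the others.
COUNTRY_TABLE = {"US": [0.2, -0.1], "FR": [-0.3, 0.4]}
WORD_TABLE = {"deep": [0.5, 0.1], "network": [-0.2, 0.3]}


def embed_country(value):
    return COUNTRY_TABLE.get(value, [0.0, 0.0])


def embed_age(value):
    return [value / 100.0]


def embed_query(words):
    # Average the vectors of known words; unknown words are ignored.
    known = [WORD_TABLE[w] for w in words if w in WORD_TABLE]
    if not known:
        return [0.0, 0.0]
    return [sum(column) / len(known) for column in zip(*known)]


EMBEDDERS = {"age": embed_age, "country": embed_country, "query": embed_query}


def random_layer(n_in, n_out):
    return [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]


def relu_layer(x, weights):
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def predict_label(features, layers, lr_weights, lr_bias):
    # Embed every feature with the function for its type, then concatenate.
    x = []
    for feature_type in sorted(EMBEDDERS):
        x.extend(EMBEDDERS[feature_type](features[feature_type]))
    # Deep network: several levels of non-linear operations.
    for weights in layers:
        x = relu_layer(x, weights)
    # Logistic regression on the alternative representation.
    z = sum(w * v for w, v in zip(lr_weights, x)) + lr_bias
    probability = 1.0 / (1.0 + math.exp(-z))
    return (1 if probability >= 0.5 else 0), probability


layers = [random_layer(5, 4), random_layer(4, 3)]  # 1 + 2 + 2 embedded inputs
lr_weights = [rng.uniform(-1, 1) for _ in range(3)]
label, probability = predict_label(
    {"age": 30, "country": "US", "query": ["deep", "network"]},
    layers, lr_weights, 0.0)
```

Because the embedding functions are separate callables keyed by feature type, a new feature type can be added without touching the others; only the width of the first layer has to change.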

    Scoring concept terms using a deep network
    7.
    Type: Granted invention patent
    Legal status: Active

    Publication No.: US09141906B2

    Publication date: 2015-09-22

    Application No.: US13802184

    Filing date: 2013-03-13

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scoring concept terms using a deep network. One of the methods includes receiving an input comprising a plurality of features of a resource, wherein each feature is a value of a respective attribute of the resource; processing each of the features using a respective embedding function to generate one or more numeric values; processing the numeric values to generate an alternative representation of the features of the resource, wherein processing the floating point values comprises applying one or more non-linear transformations to the floating point values; and processing the alternative representation of the input to generate a respective relevance score for each concept term in a pre-determined set of concept terms, wherein each of the respective relevance scores measures a predicted relevance of the corresponding concept term to the resource.

    Computing numeric representations of words in a high-dimensional space
    8.
    Type: Granted invention patent
    Legal status: Active

    Publication No.: US09037464B1

    Publication date: 2015-05-19

    Application No.: US13841640

    Filing date: 2013-03-15

    Applicant: Google Inc.

    IPC classification: G10L15/00 G06F17/28

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtaining trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.

    Using embedding functions with a deep network

    Publication No.: US09514404B1

    Publication date: 2016-12-06

    Application No.: US14860497

    Filing date: 2015-09-21

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using embedding functions with a deep network. One of the methods includes receiving an input comprising a plurality of features, wherein each of the features is of a different feature type; processing each of the features using a respective embedding function to generate one or more numeric values, wherein each of the embedding functions operates independently of each other embedding function, and wherein each of the embedding functions is used for features of a respective feature type; processing the numeric values using a deep network to generate a first alternative representation of the input, wherein the deep network is a machine learning model composed of a plurality of levels of non-linear operations; and processing the first alternative representation of the input using a logistic regression classifier to predict a label for the input.