-
Publication No.: US09740680B1
Publication date: 2017-08-22
Application No.: US14715421
Filing date: 2015-05-18
Applicant: Google Inc.
Inventors: Tomas Mikolov , Kai Chen , Gregory S. Corrado , Jeffrey A. Dean
CPC classification: G06F17/2765 , G06F17/2785 , G06N99/005 , G10L15/06
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtaining trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.
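The steps in this abstract can be illustrated with a toy sketch. It is not the patented method: it assumes the embedding function is a lookup table, the classifier is a softmax over the vocabulary that predicts the next word, and training is plain SGD. The function name `train_embeddings`, the 4-dimensional vectors, and the sample sentences are all hypothetical.

```python
import math
import random

def train_embeddings(sequences, dim=4, epochs=200, lr=0.1, seed=0):
    """Jointly train a lookup-table embedding and a softmax next-word classifier."""
    rng = random.Random(seed)
    vocab = sorted({w for seq in sequences for w in seq})
    index = {w: i for i, w in enumerate(vocab)}
    size = len(vocab)
    # E holds the embedding parameters; W holds the classifier weights (assumed shapes).
    E = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(size)]
    W = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(size)]
    pairs = [(index[a], index[b]) for seq in sequences for a, b in zip(seq, seq[1:])]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for i, j in pairs:
            e = E[i][:]
            logits = [sum(W[k][d] * e[d] for d in range(dim)) for k in range(size)]
            top = max(logits)
            exps = [math.exp(z - top) for z in logits]
            norm = sum(exps)
            probs = [x / norm for x in exps]
            total += -math.log(probs[j])
            # Gradient of cross-entropy with respect to the logits: probs - one_hot(j).
            grad = probs[:]
            grad[j] -= 1.0
            grad_e = [sum(grad[k] * W[k][d] for k in range(size)) for d in range(dim)]
            for k in range(size):
                for d in range(dim):
                    W[k][d] -= lr * grad[k] * e[d]
            for d in range(dim):
                E[i][d] -= lr * grad_e[d]
        losses.append(total / len(pairs))
    # Associate each vocabulary word with its trained numeric representation.
    embeddings = {w: E[index[w]] for w in vocab}
    return embeddings, losses

sentences = [["the", "cat", "sat", "on", "the", "mat"],
             ["the", "dog", "sat", "on", "the", "rug"]]
embeddings, losses = train_embeddings(sentences)
```

The returned dictionary plays the role of the final "associating" step: every word in the vocabulary maps to one vector. A real system would use far more dimensions and data than this toy.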
-
Publication No.: US09721214B1
Publication date: 2017-08-01
Application No.: US15231534
Filing date: 2016-08-08
Applicant: Google Inc.
Inventors: Gregory S. Corrado , Kai Chen , Jeffrey A. Dean , Samy Bengio , Rajat Monga , Matthieu Devin
CPC classification: G06N99/005 , G06K9/6256 , G06K9/6269 , G06N3/063 , G06N3/08 , G06N5/025 , G06N7/005 , G06N7/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a model using parameter server shards. One of the methods includes receiving, at a parameter server shard configured to maintain values of a disjoint partition of the parameters of the model, a succession of respective requests for parameter values from each of a plurality of replicas of the model; in response to each request, downloading a current value of each requested parameter to the replica from which the request was received; receiving a succession of uploads, each upload including respective delta values for each of the parameters in the partition maintained by the shard; and updating values of the parameters in the partition maintained by the parameter server shard repeatedly based on the uploads of delta values to generate current parameter values.
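The request/upload cycle described in this abstract can be sketched in a few lines. This is a single-process illustration under stated assumptions, not the patented system: the class name `ParameterServerShard`, its `download` and `upload` methods, and the parameter names are hypothetical, and networking, concurrency, and the replicas' gradient computation are left out.

```python
class ParameterServerShard:
    """Holds one disjoint partition of the model parameters (hypothetical API)."""

    def __init__(self, partition):
        self._params = dict(partition)

    def download(self, names):
        # Serve the current value of each requested parameter to a replica.
        return {name: self._params[name] for name in names}

    def upload(self, deltas):
        # Apply delta values sent by a replica; a KeyError is raised for any
        # parameter that is not in this shard's partition.
        for name, delta in deltas.items():
            self._params[name] += delta


# Two shards with disjoint partitions of a three-parameter model.
shard_a = ParameterServerShard({"w0": 0.0, "w1": 0.0})
shard_b = ParameterServerShard({"w2": 1.0})

# Two replicas each fetch current values, then upload their own deltas.
replica_1_view = shard_a.download(["w0", "w1"])
replica_2_view = shard_a.download(["w0", "w1"])
shard_a.upload({"w0": 0.5, "w1": -0.25})   # from replica 1
shard_a.upload({"w0": 0.25, "w1": 0.0})    # from replica 2
shard_b.upload({"w2": -0.5})

current = shard_a.download(["w0", "w1"])
```

Because each shard owns its partition exclusively, uploads from different replicas accumulate on the shard without any coordination between shards.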
-
Publication No.: US09412065B1
Publication date: 2016-08-09
Application No.: US14817745
Filing date: 2015-08-04
Applicant: Google Inc.
Inventors: Gregory S. Corrado , Kai Chen , Jeffrey A. Dean , Samy Bengio , Rajat Monga , Matthieu Devin
CPC classification: G06N99/005 , G06K9/6256 , G06K9/6269 , G06N3/063 , G06N3/08 , G06N5/025 , G06N7/005 , G06N7/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a model using parameter server shards. One of the methods includes receiving, at a parameter server shard configured to maintain values of a disjoint partition of the parameters of the model, a succession of respective requests for parameter values from each of a plurality of replicas of the model; in response to each request, downloading a current value of each requested parameter to the replica from which the request was received; receiving a succession of uploads, each upload including respective delta values for each of the parameters in the partition maintained by the shard; and updating values of the parameters in the partition maintained by the parameter server shard repeatedly based on the uploads of delta values to generate current parameter values.
-
Publication No.: US20140279773A1
Publication date: 2014-09-18
Application No.: US13802184
Filing date: 2013-03-13
Applicant: Google Inc.
Inventors: Kai Chen , Xiaodan Song , Gregory S. Corrado , Kun Zhang , Jeffrey A. Dean , Bahman Rabii
IPC classification: G06N3/08
CPC classification: G06N3/084 , G06F17/30707 , G06F17/30864 , G06N3/04 , G06N3/0427 , G06N3/08 , G06Q30/02
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scoring concept terms using a deep network. One of the methods includes receiving an input comprising a plurality of features of a resource, wherein each feature is a value of a respective attribute of the resource; processing each of the features using a respective embedding function to generate one or more numeric values; processing the numeric values to generate an alternative representation of the features of the resource, wherein processing the floating point values comprises applying one or more non-linear transformations to the floating point values; and processing the alternative representation of the input to generate a respective relevance score for each concept term in a pre-determined set of concept terms, wherein each of the respective relevance scores measures a predicted relevance of the corresponding concept term to the resource.
-
Publication No.: US09256807B1
Publication date: 2016-02-09
Application No.: US13803642
Filing date: 2013-03-14
Applicant: Google Inc.
CPC classification: G06K9/627 , G06F17/30244 , G06F17/30247 , G06F17/30268 , G06F17/3079 , G06F17/30843 , G06K9/6215 , G06K9/6255 , G06K9/6256
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating labeled images. One of the methods includes selecting a plurality of candidate videos from videos identified in a response to a search query derived from a label for an object category; selecting one or more initial frames from each of the candidate videos; detecting one or more initial images of objects in the object category in the initial frames; for each initial frame including an initial image of an object in the object category, tracking the object through surrounding frames to identify additional images of the object; and selecting one or more images from the one or more initial images and one or more additional images as database images of objects belonging to the object category.
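The frame-selection, detection, and tracking steps in this abstract can be sketched as a small pipeline. This is an illustration under stated assumptions only: the candidate videos are taken as already retrieved by the search query, and the detector and tracker are passed in as callables. The names `collect_labeled_images`, `toy_detect`, and `toy_track` are hypothetical, and the toy versions work on strings rather than real image frames.

```python
def collect_labeled_images(candidate_videos, detect, track,
                           frames_per_video=2, max_images=100):
    """Gather initial detections plus tracked detections from each video."""
    selected = []
    for video_id, frames in enumerate(candidate_videos):
        if not frames:
            continue
        # Pick evenly spaced initial frames (one possible selection policy).
        step = max(1, len(frames) // frames_per_video)
        initial_indices = list(range(0, len(frames), step))[:frames_per_video]
        for idx in initial_indices:
            for box in detect(frames[idx]):
                selected.append((video_id, idx, box))
                # Follow the object through surrounding frames for more images.
                for other_idx, other_box in track(frames, idx, box):
                    selected.append((video_id, other_idx, other_box))
    return selected[:max_images]


def toy_detect(frame):
    # Stand-in detector: a "frame" is a string, and "cat" marks the object.
    return ["box"] if "cat" in frame else []


def toy_track(frames, idx, box):
    # Stand-in tracker: checks only the immediately adjacent frames.
    hits = []
    for j in (idx - 1, idx + 1):
        if 0 <= j < len(frames) and "cat" in frames[j]:
            hits.append((j, box))
    return hits


images = collect_labeled_images([["bg", "cat", "cat", "bg"]], toy_detect, toy_track)
```

Each returned tuple is (video index, frame index, box). Tracking contributes the image from frame 1 even though only frames 0 and 2 were selected as initial frames.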
-
Publication No.: US09141916B1
Publication date: 2015-09-22
Application No.: US13803779
Filing date: 2013-03-14
Applicant: Google Inc.
Inventors: Gregory S. Corrado , Kai Chen , Jeffrey A. Dean , Gary R. Holt , Julian P. Grady , Sharat Chikkerur , David W. Sculley
CPC classification: G06N3/08 , G06N3/04 , G06N3/0454 , G06N3/084
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using embedded function with a deep network. One of the methods includes receiving an input comprising a plurality of features, wherein each of the features is of a different feature type; processing each of the features using a respective embedding function to generate one or more numeric values, wherein each of the embedding functions operates independently of each other embedding function, and wherein each of the embedding functions is used for features of a respective feature type; processing the numeric values using a deep network to generate a first alternative representation of the input, wherein the deep network is a machine learning model composed of a plurality of levels of non-linear operations; and processing the first alternative representation of the input using a logistic regression classifier to predict a label for the input.
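The data flow in this abstract (independent per-type embedding functions, then several non-linear levels, then a logistic regression classifier) can be shown as a forward pass. This is a minimal sketch, not the patented model: the feature types, the hand-written weights, the tanh non-linearity, and the function names are all assumptions chosen for illustration, and no training is shown.

```python
import math

def dense_tanh(x, weights, biases):
    """One level of non-linear operations: affine map followed by tanh."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def predict_label_probability(features, embedders, hidden_layers,
                              clf_weights, clf_bias):
    # Each feature type has its own embedding function, applied independently.
    values = []
    for feature_type in sorted(features):
        values.extend(embedders[feature_type](features[feature_type]))
    # The deep network turns the numeric values into an alternative representation.
    hidden = values
    for weights, biases in hidden_layers:
        hidden = dense_tanh(hidden, weights, biases)
    # Logistic regression on top of the alternative representation.
    z = sum(w * h for w, h in zip(clf_weights, hidden)) + clf_bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical embedding functions: a lookup table and a simple scaling.
embedders = {
    "country": lambda v: {"US": [0.2, -0.1], "FR": [-0.3, 0.4]}[v],
    "age": lambda v: [v / 100.0],
}
# Two levels: 3 inputs -> 2 units -> 1 unit (weights are made up).
hidden_layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
p = predict_label_probability({"country": "US", "age": 30},
                              embedders, hidden_layers, [2.0], -0.5)
```

The value `p` is the predicted probability of the positive label. Sorting the feature types only fixes the order in which embedded values are concatenated, so the layer weights line up with the same inputs on every call.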
-
Publication No.: US09141906B2
Publication date: 2015-09-22
Application No.: US13802184
Filing date: 2013-03-13
Applicant: Google Inc.
Inventors: Kai Chen , Xiaodan Song , Gregory S. Corrado , Kun Zhang , Jeffrey A. Dean , Bahman Rabii
CPC classification: G06N3/084 , G06F17/30707 , G06F17/30864 , G06N3/04 , G06N3/0427 , G06N3/08 , G06Q30/02
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scoring concept terms using a deep network. One of the methods includes receiving an input comprising a plurality of features of a resource, wherein each feature is a value of a respective attribute of the resource; processing each of the features using a respective embedding function to generate one or more numeric values; processing the numeric values to generate an alternative representation of the features of the resource, wherein processing the floating point values comprises applying one or more non-linear transformations to the floating point values; and processing the alternative representation of the input to generate a respective relevance score for each concept term in a pre-determined set of concept terms, wherein each of the respective relevance scores measures a predicted relevance of the corresponding concept term to the resource.
-
Publication No.: US09037464B1
Publication date: 2015-05-19
Application No.: US13841640
Filing date: 2013-03-15
Applicant: Google Inc.
Inventors: Tomas Mikolov , Kai Chen , Gregory S. Corrado , Jeffrey A. Dean
CPC classification: G06F17/2765 , G06F17/2785 , G06N99/005 , G10L15/06
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtaining trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.
-
Publication No.: US09514405B2
Publication date: 2016-12-06
Application No.: US14860462
Filing date: 2015-09-21
Applicant: Google Inc.
Inventors: Kai Chen , Xiaodan Song , Gregory S. Corrado , Kun Zhang , Jeffrey A. Dean , Bahman Rabii
CPC classification: G06N3/084 , G06F17/30707 , G06F17/30864 , G06N3/04 , G06N3/0427 , G06N3/08 , G06Q30/02
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scoring concept terms using a deep network. One of the methods includes receiving an input comprising a plurality of features of a resource, wherein each feature is a value of a respective attribute of the resource; processing each of the features using a respective embedding function to generate one or more numeric values; processing the numeric values to generate an alternative representation of the features of the resource, wherein processing the floating point values comprises applying one or more non-linear transformations to the floating point values; and processing the alternative representation of the input to generate a respective relevance score for each concept term in a pre-determined set of concept terms, wherein each of the respective relevance scores measures a predicted relevance of the corresponding concept term to the resource.
-
Publication No.: US09514404B1
Publication date: 2016-12-06
Application No.: US14860497
Filing date: 2015-09-21
Applicant: Google Inc.
Inventors: Gregory S. Corrado , Kai Chen , Jeffrey A. Dean , Gary R. Holt , Julian P. Grady , Sharat Chikkerur , David W. Sculley, II
CPC classification: G06N3/08 , G06N3/04 , G06N3/0454 , G06N3/084
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using embedded function with a deep network. One of the methods includes receiving an input comprising a plurality of features, wherein each of the features is of a different feature type; processing each of the features using a respective embedding function to generate one or more numeric values, wherein each of the embedding functions operates independently of each other embedding function, and wherein each of the embedding functions is used for features of a respective feature type; processing the numeric values using a deep network to generate a first alternative representation of the input, wherein the deep network is a machine learning model composed of a plurality of levels of non-linear operations; and processing the first alternative representation of the input using a logistic regression classifier to predict a label for the input.