Food recognition using visual analysis and speech recognition
    1.
    Granted patent
    Food recognition using visual analysis and speech recognition (in force)

    Publication number: US08439683B2

    Publication date: 2013-05-14

    Application number: US12683124

    Filing date: 2010-01-06

    IPC classification: G09B19/00

    CPC classification: G09B19/0092

    Abstract: A method and system for analyzing at least one food item on a food plate is disclosed. A plurality of images of the food plate is received by an image capturing device. A description of the at least one food item on the food plate is received by a recognition device. The description is at least one of a voice description and a text description. At least one processor extracts a list of food items from the description; classifies and segments the at least one food item from the list using color and texture features derived from the plurality of images; and estimates the volume of the classified and segmented at least one food item. The processor is also configured to estimate the caloric content of the at least one food item.
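
The abstract does not say how caloric content is derived from the estimated volume. A minimal sketch of one plausible final step follows, assuming calories are computed as volume times density times energy per gram. The function names, the `nutrition_table` structure, and the units are all illustrative assumptions, not taken from the patent.

```python
def estimate_calories(volume_ml, density_g_per_ml, kcal_per_gram):
    """Convert an estimated volume to kilocalories (assumed formula)."""
    mass_g = volume_ml * density_g_per_ml
    return mass_g * kcal_per_gram


def plate_calories(items, nutrition_table):
    """Sum calories over all recognized items on the plate.

    items: list of (name, volume_ml) pairs from the volume estimation step.
    nutrition_table: hypothetical lookup, name -> (density_g_per_ml, kcal_per_gram).
    """
    total = 0.0
    for name, volume_ml in items:
        density, kcal_per_gram = nutrition_table[name]
        total += estimate_calories(volume_ml, density, kcal_per_gram)
    return total
```

The caller supplies the lookup table, so no nutritional values are hard-coded here.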

    METHOD FOR COMPUTING FOOD VOLUME IN A METHOD FOR ANALYZING FOOD
    2.
    Patent application
    METHOD FOR COMPUTING FOOD VOLUME IN A METHOD FOR ANALYZING FOOD (in force)

    Publication number: US20110182477A1

    Publication date: 2011-07-28

    Application number: US12758208

    Filing date: 2010-04-12

    IPC classification: G06K9/00

    Abstract: A computer-implemented method for estimating a volume of at least one food item on a food plate is disclosed. A first and second plurality of images are received from different positions above a food plate, wherein angular spacing between the positions of the first plurality of images is greater than angular spacing between the positions of the second plurality of images. A first set of poses of each of the first plurality of images is estimated. A second set of poses of each of the second plurality of images is estimated based on at least the first set of poses. A pair of images taken from each of the first and second plurality of images is rectified based on at least the first and second set of poses. A 3D point cloud is reconstructed based on at least the rectified pair of images. At least one surface of the at least one food item above the food plate is estimated based on at least the reconstructed 3D point cloud. The volume of the at least one food item is estimated based on the at least one surface.
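
The last step of the abstract (volume from an estimated surface) can be illustrated with a simple numerical integration. This sketch covers only that step and assumes the surface has already been resampled onto a regular grid of heights above the plate plane. The grid representation and the name `volume_from_height_map` are assumptions made for illustration.

```python
def volume_from_height_map(heights, cell_area):
    """Approximate the volume under a surface by a Riemann sum.

    heights: 2D list of surface heights above the plate plane, one per grid cell.
    cell_area: area of a single grid cell, in units consistent with the heights.
    Cells at or below the plate plane contribute nothing.
    """
    total = 0.0
    for row in heights:
        for h in row:
            total += max(h, 0.0) * cell_area
    return total
```

For example, a 2x2 grid with heights 1, 2, 0 and -0.5 and a cell area of 0.25 gives (1 + 2) * 0.25 = 0.75 volume units.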

    AUDIO BASED ROBOT CONTROL AND NAVIGATION
    3.
    Patent application
    AUDIO BASED ROBOT CONTROL AND NAVIGATION (in force)

    Publication number: US20110077813A1

    Publication date: 2011-03-31

    Application number: US12892048

    Filing date: 2010-09-28

    IPC classification: G05D1/00

    CPC classification: G05D1/0246 G05D1/0251

    Abstract: A computer-implemented method for unattended detection of a current terrain to be traversed by a mobile device is disclosed. Visual input of the current terrain is received for a plurality of positions. Audio input corresponding to the current terrain is received for the plurality of positions. The visual input is fused with the audio input using a classifier. The type of the current terrain is classified with the classifier. The classifier may also be employed to predict the type of terrain proximal to the current terrain. The classifier is constructed using an expectation-maximization (EM) method.

    Method and system for segmenting videos using face detection
    4.
    Granted patent
    Method and system for segmenting videos using face detection (lapsed)

    Publication number: US07555149B2

    Publication date: 2009-06-30

    Application number: US11258590

    Filing date: 2005-10-25

    IPC classification: G06K9/00

    Abstract: A method generates a summary of a video. Faces are detected in a plurality of frames of the video. The frames are classified according to a number of faces detected in each frame, and the video is partitioned into segments according to the classifications to produce a summary of the video. For each frame classified as having a single detected face, one or more characteristics of the face are determined. The frames are labeled according to the characteristics to produce labeled clusters, and the segments are partitioned into sub-segments according to the labeled clusters.
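
The first partitioning stage described above can be sketched as a run-length grouping of per-frame classes. The three class names and the input format (one detected face count per frame) are assumptions for illustration. The abstract only states that frames are classified by the number of faces detected.

```python
from itertools import groupby


def classify(face_count):
    """Map a per-frame face count to a coarse class (assumed class scheme)."""
    if face_count == 0:
        return "no_face"
    if face_count == 1:
        return "single_face"
    return "multiple_faces"


def partition(face_counts):
    """Split a video into segments of consecutive frames sharing one class.

    face_counts: list with the number of detected faces in each frame.
    Returns a list of (class_label, first_frame, last_frame) tuples.
    """
    segments = []
    start = 0
    for label, run in groupby(classify(c) for c in face_counts):
        length = sum(1 for _ in run)
        segments.append((label, start, start + length - 1))
        start += length
    return segments
```

The sub-segment stage would repeat the same grouping inside each `single_face` segment, using cluster labels derived from face characteristics instead of face counts.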

    Visual complexity measure for playing videos adaptively
    5.
    Granted patent
    Visual complexity measure for playing videos adaptively (lapsed)

    Publication number: US07406123B2

    Publication date: 2008-07-29

    Application number: US10616546

    Filing date: 2003-07-10

    IPC classification: H04N7/12

    Abstract: A method plays frames of a video adaptively according to a visual complexity of the video. First, a spatial frequency of pixels within frames of the video is measured, as well as a temporal velocity of corresponding pixels between frames of the video. The spatial frequency is multiplied by the temporal velocity to obtain a measure of the visual complexity of the frames of the video. The frames of the video are then played at a speed that corresponds to the visual complexity.
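
The complexity measure is stated directly in the abstract as a product. The mapping from complexity to playback speed is not. The sketch below assumes an inverse relationship (more complex content plays more slowly) with clamping. The parameters `reference_complexity`, `min_speed`, and `max_speed` are hypothetical.

```python
def visual_complexity(spatial_frequency, temporal_velocity):
    """Visual complexity as the product named in the abstract."""
    return spatial_frequency * temporal_velocity


def playback_speed(complexity, reference_complexity=1.0,
                   min_speed=0.5, max_speed=4.0):
    """Choose a playback speed inversely proportional to complexity.

    The inverse mapping and the clamping range are assumptions. Frames
    with zero (or negative) measured complexity play at max_speed.
    """
    if complexity <= 0:
        return max_speed
    speed = reference_complexity / complexity
    return max(min_speed, min(max_speed, speed))
```

With these defaults, a frame at the reference complexity plays at normal speed, and static or smooth content is capped at four times normal speed.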

    Method for representing and comparing multimedia content according to rank
    6.
    Granted patent
    Method for representing and comparing multimedia content according to rank (lapsed)

    Publication number: US07383504B1

    Publication date: 2008-06-03

    Application number: US09518937

    Filing date: 2000-03-06

    IPC classification: G06T11/20 G06K9/45 G06F3/00

    CPC classification: G06K9/00711 G06K9/6878

    Abstract: A method generates a representation of multimedia content by first segmenting the multimedia content spatially and temporally to extract objects. Feature extraction is applied to the objects to produce semantic and syntactic attributes, relations, and a containment set of content entities. The content entities are coded to produce directed acyclic graphs of the content entities, where each directed acyclic graph represents a particular interpretation of the multimedia content. Attributes of each content entity are measured, and the measured attributes are assigned to each corresponding content entity in the directed acyclic graphs to rank-order the multimedia content.

    Unsupervised learning of video structures in videos using hierarchical statistical models to detect events
    8.
    Granted patent
    Unsupervised learning of video structures in videos using hierarchical statistical models to detect events (lapsed)

    Publication number: US07313269B2

    Publication date: 2007-12-25

    Application number: US10734451

    Filing date: 2003-12-12

    IPC classification: G06K9/62

    Abstract: A method learns a structure of a video, in an unsupervised setting, to detect events in the video consistent with the structure. Sets of features are selected from the video. Based on the selected features, a hierarchical statistical model is updated, and an information gain of the hierarchical statistical model is evaluated. Redundant features are then filtered, and the hierarchical statistical model is updated based on the filtered features. A Bayesian information criterion is applied to each model and feature set pair, which can then be rank-ordered according to the criterion to detect the events in the video.
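
The final ranking step can be sketched with the standard Bayesian information criterion, BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of samples, and L the maximized likelihood. Under this convention a lower score is better. Whether the patent uses this sign convention is not stated in the abstract, and the candidate tuple layout below is an assumption.

```python
import math


def bic(log_likelihood, num_params, num_samples):
    """Standard BIC: k * ln(n) - 2 * ln(L). Lower is better."""
    return num_params * math.log(num_samples) - 2.0 * log_likelihood


def rank_candidates(candidates):
    """Rank model and feature-set pairs from best to worst BIC.

    candidates: list of (name, log_likelihood, num_params, num_samples),
    a hypothetical layout for one fitted model per feature set.
    """
    scored = [(bic(ll, k, n), name) for name, ll, k, n in candidates]
    return [name for _, name in sorted(scored)]
```

The penalty term k ln(n) is what keeps a model with many parameters from winning on likelihood alone.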

    System for tracking railcars in a railroad environment
    9.
    Patent application
    System for tracking railcars in a railroad environment (lapsed)

    Publication number: US20070146159A1

    Publication date: 2007-06-28

    Application number: US11316034

    Filing date: 2005-12-22

    IPC classification: H04Q5/22 G08B5/22

    Abstract: A system determines real-time locations of railcars in a railroad environment. Railcars are equipped with at least four RFID tags. An RFID reader at a fixed location at every track branch in the environment reads the RFID tags. Railcar locations are updated for the railcars by determining the branches on which the railcars are located.
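
The location update described above reduces to two lookups per tag read. A minimal sketch follows, assuming each reader is mapped to exactly one branch and each tag to one railcar. The class name, method name, and identifiers are hypothetical.

```python
class RailcarTracker:
    """Track which branch each railcar was last seen on (illustrative sketch)."""

    def __init__(self, reader_to_branch, tag_to_railcar):
        # Both mappings are assumed to be configured when tags and readers
        # are installed; several tags may map to the same railcar.
        self.reader_to_branch = dict(reader_to_branch)
        self.tag_to_railcar = dict(tag_to_railcar)
        self.locations = {}

    def on_tag_read(self, reader_id, tag_id):
        """Record a tag read and return the railcar it belongs to, if known."""
        railcar = self.tag_to_railcar.get(tag_id)
        branch = self.reader_to_branch.get(reader_id)
        if railcar is None or branch is None:
            return None  # unknown tag or reader: ignore the read
        self.locations[railcar] = branch
        return railcar
```

Because any of a railcar's tags resolves to the same railcar, a read of a single tag is enough to update its location.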

    Method for segmenting 3D objects from compressed videos
    10.
    Granted patent
    Method for segmenting 3D objects from compressed videos (lapsed)

    Publication number: US07142602B2

    Publication date: 2006-11-28

    Application number: US10442417

    Filing date: 2003-05-21

    IPC classification: H04B1/66

    Abstract: A method segments a video into objects, without user assistance. An MPEG compressed video is converted to a structure called pseudo spatial/temporal data using DCT coefficients and motion vectors. The compressed video is first parsed and the pseudo spatial/temporal data are formed. Seed macro-blocks are identified using, e.g., the DCT coefficients and changes in the motion vector of macro-blocks. A video volume is “grown” around each seed macro-block using the DCT coefficients and motion distance criteria. Self-descriptors are assigned to the volume, and mutual descriptors are assigned to pairs of similar volumes. These descriptors capture motion and spatial information of the volumes. Similarity scores are determined for each possible pair-wise combination of volumes. The pair of volumes that gives the largest score is combined iteratively. In the combining stage, volumes are classified and represented in a multi-resolution coarse-to-fine hierarchy of video objects.
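
The combining stage (repeatedly merging the most similar pair of volumes) resembles agglomerative clustering. The sketch below illustrates only that loop. It assumes each volume is summarized by a numeric descriptor and a size, uses negative Euclidean distance as the similarity score, and stops at a threshold. None of these choices come from the patent.

```python
import math
from itertools import combinations


def merge_volumes(volumes, min_similarity):
    """Iteratively merge the most similar pair of volumes.

    volumes: dict mapping integer id -> (descriptor tuple, size).
    min_similarity: stop once the best pair scores below this value.
    Returns (remaining volumes, list of (id_a, id_b, merged_id) records).
    """
    volumes = dict(volumes)
    merges = []
    next_id = max(volumes) + 1 if volumes else 0

    def score(pair):
        a, b = pair
        # Assumed similarity: negative Euclidean distance between descriptors.
        return -math.dist(volumes[a][0], volumes[b][0])

    while len(volumes) > 1:
        a, b = max(combinations(sorted(volumes), 2), key=score)
        if score((a, b)) < min_similarity:
            break
        (desc_a, size_a), (desc_b, size_b) = volumes.pop(a), volumes.pop(b)
        # The merged descriptor is the size-weighted mean of the two parents.
        merged = tuple((x * size_a + y * size_b) / (size_a + size_b)
                       for x, y in zip(desc_a, desc_b))
        volumes[next_id] = (merged, size_a + size_b)
        merges.append((a, b, next_id))
        next_id += 1
    return volumes, merges
```

The merge records, read in order, give a fine-to-coarse hierarchy: early merges join near-identical volumes, later ones join larger objects.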
