Method and system for scene-aware interaction

    Publication number: US11635299B2

    Publication date: 2023-04-25

    Application number: US16784103

    Application date: 2020-02-06

    Abstract: A navigation system for providing driving instructions to a driver of a vehicle traveling on a route is provided. The driving instructions are generated by executing a multimodal fusion method that comprises extracting features from sensor measurements, annotating the features with directions for the vehicle to follow the route with respect to objects sensed by the sensors, and encoding the annotated features with a multimodal attention neural network to produce encodings. The encodings are transformed into a common latent space, and the transformed encodings are fused using an attention mechanism producing an encoded representation of the scene. The method further comprises decoding the encoded representation with a sentence generation neural network to generate a driving instruction and submitting the driving instruction to an output device.
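
    For illustration, a minimal NumPy sketch of the "transform to a common latent space, then fuse with attention" step described in the abstract; it is not the patented network, and the projection matrices, the query vector, and the feature sizes are assumptions made for the example.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_modalities(features, projections, query):
    """Project per-modality feature vectors into a shared latent space and
    fuse them with dot-product attention over the modalities."""
    latent = np.stack([W @ f for W, f in zip(projections, features)])  # common latent space
    weights = softmax(latent @ query)                                  # attention over modalities
    return weights @ latent                                            # fused scene encoding

# Toy example: three sensing modalities with different feature sizes.
rng = np.random.default_rng(0)
feats = [rng.standard_normal(128), rng.standard_normal(64), rng.standard_normal(32)]
projs = [0.1 * rng.standard_normal((16, f.size)) for f in feats]
query = rng.standard_normal(16)
print(fuse_modalities(feats, projs, query).shape)  # (16,)
```

    The fused encoding would then be passed to a sentence-generation decoder, which the sketch omits.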

    System and method for Inter-Frame Predictive Compression for Point Clouds

    Publication number: US20190116357A1

    Publication date: 2019-04-18

    Application number: US15876522

    Application date: 2018-01-22

    Abstract: A point cloud encoder includes an input interface to accept a dynamic point cloud including a sequence of point cloud frames of a scene. A processor encodes blocks of a current point cloud frame to produce an encoded frame. To encode a current block of the current point cloud frame, a reference block similar to the current block according to a similarity metric is selected to serve as a reference for encoding the current block. Each point in the current block is paired to a point in the reference block based on the values of the paired points, and the current block is encoded based on a combination of an identification of the reference block and residuals between the values of the paired points, wherein the residuals are ordered according to the order of the values of the points in the reference block. A transmitter transmits the encoded frame over a communication channel.
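
    A minimal NumPy sketch of the reference-block selection, point pairing, and residual ordering described in the abstract; the centroid-distance similarity metric and the value ordering via lexicographic sort are illustrative assumptions, not the claimed method.

```python
import numpy as np

def encode_block(current, references):
    """Pick the most similar reference block, pair points by value order,
    and emit (reference id, residuals ordered as in the reference block)."""
    # Similarity metric: distance between block centroids (an illustrative choice).
    ref_id = int(np.argmin([np.linalg.norm(current.mean(0) - r.mean(0)) for r in references]))
    ref = references[ref_id]
    residuals = current[np.lexsort(current.T)] - ref[np.lexsort(ref.T)]
    return ref_id, residuals

def decode_block(ref_id, residuals, references):
    ref = references[ref_id]
    return ref[np.lexsort(ref.T)] + residuals   # residuals follow the reference order

# Toy example: one block of the current frame closely matches reference block 2.
rng = np.random.default_rng(1)
refs = [rng.uniform(size=(8, 3)) for _ in range(4)]
cur = refs[2] + rng.normal(scale=0.01, size=(8, 3))
ref_id, res = encode_block(cur, refs)
rec = decode_block(ref_id, res, refs)
print(ref_id, np.allclose(np.sort(rec, axis=0), np.sort(cur, axis=0)))  # 2 True
```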

    System and Method for Processing Images using Online Tensor Robust Principal Component Analysis
    Type: Invention application; Status: Published, under examination

    Publication number: US20170076180A1

    Publication date: 2017-03-16

    Application number: US14854634

    Application date: 2015-09-15

    CPC classification number: G06K9/40 G06K9/0063 G06K9/6247

    Abstract: A set of input images is acquired sequentially as image tensors. A low-tubal-rank tensor and a sparse tensor are initialized using the image tensor, wherein the low-tubal-rank tensor is a tensor product of a low-rank spanning tensor basis and corresponding tensor coefficients. For each image, the image tensor, the tensor coefficients, and the sparse tensor are updated iteratively using the image tensor and the low-rank spanning basis from the previous iteration. The spanning tensor basis is then updated using the tensor coefficients, the sparse tensor, and the low-tubal-rank tensor, wherein the low-tubal-rank tensor represents a set of output images and the sparse tensor represents a set of sparse images.
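
    A sketch of the matrix (vectorized-image) analogue of the online decomposition described in the abstract: each incoming image is split into a part spanned by a low-rank basis plus a sparse part, and the basis is refreshed from running statistics. The patented method operates on tensors via the t-product, which this sketch does not implement; the rank, regularization weights, and update rule are assumptions.

```python
import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

class OnlineRPCA:
    """Online robust PCA on vectorized images: each new image is split into
    basis @ coefficients (low-rank part) plus a sparse part, and the spanning
    basis is refreshed from running statistics."""

    def __init__(self, dim, rank, lam=0.1, ridge=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.basis = rng.standard_normal((dim, rank))
        self.lam, self.ridge = lam, ridge
        self.A = np.zeros((rank, rank))   # accumulated outer products of coefficients
        self.B = np.zeros((dim, rank))    # accumulated (image - sparse) times coefficients

    def update(self, x, inner_iters=10):
        r = np.zeros(self.basis.shape[1])
        s = np.zeros_like(x)
        for _ in range(inner_iters):
            r = np.linalg.lstsq(self.basis, x - s, rcond=None)[0]  # coefficients
            s = soft_threshold(x - self.basis @ r, self.lam)       # sparse image
        self.A += np.outer(r, r)
        self.B += np.outer(x - s, r)
        # Refresh the spanning basis from the running statistics (ridge-regularized).
        self.basis = self.B @ np.linalg.inv(self.A + self.ridge * np.eye(len(r)))
        return self.basis @ r, s

# Toy stream of vectorized images with a few large sparse corruptions per image.
rng = np.random.default_rng(2)
dim, rank = 400, 5
truth = rng.standard_normal((dim, rank))
model = OnlineRPCA(dim, rank)
for _ in range(200):
    x = truth @ rng.standard_normal(rank)
    x[rng.integers(0, dim, size=8)] += 5.0
    low_rank, sparse = model.update(x)
print(low_rank.shape, np.count_nonzero(sparse))  # (400,) and a handful of nonzeros
```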

    Method and system for segmenting an image based on motion vanishing points
    Type: Granted invention patent; Status: In force

    Publication number: US09430840B1

    Publication date: 2016-08-30

    Application number: US14806778

    Application date: 2015-07-23

    Abstract: A method segments an image of a scene acquired by a sensor by first obtaining motion vectors corresponding to the image and generating a motion vanishing point image, wherein each pixel in the motion vanishing point image represents the number of intersections of pairs of motion vectors at that pixel. In the motion vanishing point image, a representation point for each motion vector is generated, and distances between the motion vectors are determined based on the representation points. Then, a motion graph is constructed wherein each node represents a motion vector and each edge carries a weight based on the distance between the nodes it connects. Graph spectral clustering is performed on the motion graph to produce segments of the image.
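
    A sketch of the first step described in the abstract, accumulating the intersections of pairs of motion-vector lines into a motion vanishing point image; the later graph construction and spectral clustering are not shown, and the line parameterization is an assumption.

```python
import numpy as np

def line_intersection(p1, d1, p2, d2):
    """Intersection of two 2-D lines given as point + direction; None if (near) parallel."""
    A = np.column_stack([d1, -d2])
    if abs(np.linalg.det(A)) < 1e-9:
        return None
    t = np.linalg.solve(A, p2 - p1)
    return p1 + t[0] * d1

def vanishing_point_image(points, flows, shape):
    """Accumulate, per pixel, how many pairs of motion-vector lines intersect there."""
    img = np.zeros(shape, dtype=np.int32)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            q = line_intersection(points[i], flows[i], points[j], flows[j])
            if q is None:
                continue
            r, c = int(round(q[1])), int(round(q[0]))   # (x, y) -> (row, col)
            if 0 <= r < shape[0] and 0 <= c < shape[1]:
                img[r, c] += 1
    return img

# Toy example: flow vectors radiating from a focus of expansion at (x, y) = (60, 40).
rng = np.random.default_rng(3)
pts = rng.uniform([0, 0], [120, 80], size=(50, 2))
flows = pts - np.array([60.0, 40.0])
vp_img = vanishing_point_image(pts, flows, shape=(80, 120))
print(np.unravel_index(vp_img.argmax(), vp_img.shape))  # approximately (40, 60)
```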

    Method for improving compression efficiency of distributed source coding using intra-band information
    Type: Granted invention patent; Status: In force

    Publication number: US09307257B2

    Publication date: 2016-04-05

    Application number: US14102609

    Application date: 2013-12-11

    CPC classification number: H04N19/395 H04N19/34 H04N19/46

    Abstract: In a decoder, a desired image is estimated by first retrieving coding modes from an encoded side information image. For each bitplane in the encoded side information image, syndrome bits or parity bits are decoded to obtain an estimated bitplane of quantized transform coefficients of the desired image. A quantization and a transform are applied to a prediction residual obtained using the coding modes, wherein the decoding uses the quantized transform coefficients of the encoded side information image, and is based on previously decoded bitplanes in a causal neighborhood. The estimated bitplanes of quantized transform coefficients of the desired image are combined to produce combined bitplanes. Then, an inverse quantization, an inverse transform and a prediction based on the coding modes are applied to the combined bitplanes to recover the estimate of the desired image.
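
    A sketch of only the final stage described in the abstract: recombining already-decoded bitplanes into quantized transform coefficients, inverse-quantizing, inverse-transforming (a 2-D IDCT here), and adding the prediction. The syndrome/parity decoding itself is not implemented, and the bitplane order, sign handling, and quantization step are assumptions.

```python
import numpy as np
from scipy.fft import idct

def combine_bitplanes(bitplanes):
    """Stack decoded bitplanes (MSB first) back into quantized coefficient magnitudes."""
    coeffs = np.zeros_like(bitplanes[0], dtype=np.int64)
    for plane in bitplanes:
        coeffs = (coeffs << 1) | plane
    return coeffs

def reconstruct_block(bitplanes, signs, q_step, prediction):
    """Inverse-quantize the combined coefficients, apply a 2-D inverse DCT,
    and add the prediction to recover the block estimate."""
    coeffs = combine_bitplanes(bitplanes) * signs
    residual = idct(idct(coeffs * q_step, axis=0, norm='ortho'), axis=1, norm='ortho')
    return prediction + residual

# Toy example: three bitplanes (MSB to LSB) of an 8x8 block of quantized coefficients.
rng = np.random.default_rng(4)
q_true = rng.integers(0, 8, size=(8, 8))
planes = [(q_true >> b) & 1 for b in (2, 1, 0)]
signs = np.ones((8, 8), dtype=int)
block = reconstruct_block(planes, signs, q_step=4.0, prediction=np.full((8, 8), 128.0))
print(block.shape)  # (8, 8)
```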

    Block Copy Modes for Image and Video Coding
    Type: Invention application; Status: Published, under examination

    Publication number: US20150264383A1

    Publication date: 2015-09-17

    Application number: US14212943

    Application date: 2014-03-14

    CPC classification number: H04N19/423 H04N19/593

    Abstract: A method decodes blocks in pictures of a video in an encoded bitstream by storing previously decoded blocks in a buffer. The previously decoded blocks are displaced less than a predetermined range relative to a current block being decoded. Cached blocks are maintained in a cache. The cached blocks include a set of best matching previously decoded blocks that are displaced greater than the predetermined range relative to the current block. The bitstream is parsed to obtain a prediction indicator that determines whether the current block is predicted from the previously decoded blocks in the buffer or the cached blocks in the cache. Based on the prediction indicator, a prediction residual block is generated, and in a summation process, the prediction residual block is added to a reconstructed residual block to form a decoded block as output.
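
    A minimal sketch of the buffer-versus-cache prediction choice described in the abstract; the keying scheme, the cache eviction policy, and the block contents are assumptions made for the example.

```python
import numpy as np
from collections import OrderedDict

class BlockCopyDecoder:
    """Sketch of a decoder that predicts a block either from a buffer of nearby
    previously decoded blocks or from a cache of far-away best-match blocks."""

    def __init__(self, cache_size=16):
        self.buffer = {}            # displacement -> nearby previously decoded block
        self.cache = OrderedDict()  # identifier -> cached far-away block
        self.cache_size = cache_size

    def store(self, key, block, far=False):
        if far:
            self.cache[key] = block
            if len(self.cache) > self.cache_size:
                self.cache.popitem(last=False)   # evict the oldest cached block
        else:
            self.buffer[key] = block

    def decode_block(self, use_cache, key, reconstructed_residual):
        """Pick the predictor per the prediction indicator and add the residual."""
        predictor = self.cache[key] if use_cache else self.buffer[key]
        return predictor + reconstructed_residual

dec = BlockCopyDecoder()
dec.store((0, -16), np.full((8, 8), 100), far=False)   # short-range block in the buffer
dec.store("far_block", np.full((8, 8), 50), far=True)  # long-range block in the cache
out = dec.decode_block(use_cache=True, key="far_block",
                       reconstructed_residual=np.ones((8, 8)))
print(out[0, 0])  # 51
```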

    Method for Coding Videos and Pictures Using Independent Uniform Prediction Mode
    Type: Invention application; Status: Published, under examination

    Publication number: US20150264345A1

    Publication date: 2015-09-17

    Application number: US14207871

    Application date: 2014-03-13

    CPC classification number: H04N19/103 H04N19/176 H04N19/186 H04N19/50 H04N19/70

    Abstract: A method decodes a bitstream including compressed pictures of a video, wherein each picture includes one or more slices, each slice includes one or more blocks of pixels, and each pixel has a value corresponding to a color. For each slice, the method first obtains a reduced number of colors corresponding to the slice, wherein each color is represented as a color triplet and the reduced number of colors is less than or equal to the number of colors in the slice. Then, for each block, a prediction mode is determined, wherein an independent uniform prediction mode is included in a candidate set of prediction modes. For each block, a predictor block is generated, wherein all values of the predictor block have a uniform value according to a color index when the prediction mode is set to the independent uniform prediction mode. Lastly, the predictor block is added to a reconstructed residue block to form a decoded block as output.
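
    A minimal sketch of the independent uniform prediction mode described in the abstract: the predictor block is filled with a single palette color selected by a color index and added to the reconstructed residue. The palette layout and block shape are assumptions; the other prediction modes are omitted.

```python
import numpy as np

def uniform_predictor(palette, color_index, block_shape):
    """Predictor block whose samples all take the palette color at color_index."""
    return np.broadcast_to(palette[color_index], block_shape + (3,)).astype(float)

def decode_block(mode, palette, color_index, residual, block_shape=(8, 8)):
    """Add the predictor to the reconstructed residue to form the decoded block."""
    if mode == "independent_uniform":
        pred = uniform_predictor(palette, color_index, block_shape)
    else:
        pred = np.zeros(block_shape + (3,))  # other prediction modes omitted in this sketch
    return pred + residual

# Toy slice palette with a reduced number of colors (each color is an RGB triplet).
palette = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]])
residual = np.zeros((8, 8, 3))
block = decode_block("independent_uniform", palette, color_index=2, residual=residual)
print(block[0, 0])  # [0. 0. 255.]
```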

    Method for Video Background Subtraction Using Factorized Matrix Completion
    Type: Invention application; Status: Granted, in force

    Publication number: US20150130953A1

    Publication date: 2015-05-14

    Application number: US14078804

    Application date: 2013-11-13

    CPC classification number: H04N5/23267 G06T7/20 G06T7/254

    Abstract: A method processes a video acquired of a scene by first aligning a group of video images using compressed-domain motion information and then solving for a low-rank component and a sparse component of the video. A homography map is computed from the motion information to determine image alignment parameters. The video images are then warped using the homography map to share a similar camera perspective. A Newton root step is followed to traverse separately the Pareto curves of the low-rank component and the sparse component. The solving for the low-rank component and the sparse component is repeated alternately until a termination condition is reached. Then, the low-rank component and the sparse component are output. The low-rank component represents the background in the video, and the sparse component represents moving objects in the video.
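
    A simplified sketch of the alternating low-rank/sparse decomposition described in the abstract, applied to already-aligned frames; the homography-based alignment and the Newton step along the Pareto curves are not shown, and the fixed rank and threshold are assumptions.

```python
import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def background_subtraction(frames, rank=1, lam=0.05, iters=30):
    """Alternate between a truncated-SVD low-rank fit (background) and a
    soft-thresholded sparse fit (moving objects) on already-aligned frames."""
    M = np.stack([f.ravel() for f in frames], axis=1)   # pixels x frames
    S = np.zeros_like(M)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]        # low-rank background
        S = soft_threshold(M - L, lam)                  # sparse foreground
    return L, S

# Toy example: a static background with a small bright object that moves.
rng = np.random.default_rng(5)
background = rng.uniform(size=(32, 32))
frames = []
for t in range(20):
    f = background.copy()
    f[10 + t % 5, 5:9] = 1.0
    frames.append(f)
L, S = background_subtraction(frames)
print(L.shape, np.count_nonzero(S) > 0)  # (1024, 20) True
```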
