Patent search ap:("Google LLC") AND inv:"Hossein Talebi" Page 1

1.

Granted invention patent
Systems and techniques for retraining models for video quality assessment and for transcoding using the retrained models (Active)

Publication number: US12230024B2

Publication date: 2025-02-18

Application number: US17762289

Filing date: 2019-11-26

Applicant: Google LLC

Inventor: Yilin Wang, Hossein Talebi, Peyman Milanfar, Feng Yang, Balineedu Adsumilli

IPC: G06V10/98, G06N3/045, G06V10/82, G06V20/40

Abstract: A trained model is retrained for video quality assessment and used to identify sets of adaptive compression parameters for transcoding user generated video content. Using transfer learning, the model, which is initially trained for image object detection, is retrained for technical content assessment and then again retrained for video quality assessment. The model is then deployed into a transcoding pipeline and used for transcoding an input video stream of user generated content. The transcoding pipeline may be structured in one of several ways. In one example, a secondary pathway for video content analysis using the model is introduced into the pipeline, which does not interfere with the ultimate output of the transcoding should there be a network or other issue. In another example, the model is introduced as a library within the existing pipeline, which would maintain a single pathway, but ultimately is not expected to introduce significant latency.

2.

Invention patent application
Machine Learning Models Featuring Resolution-Flexible Multi-Axis Attention Blocks (Active)

Publication number: US20250069382A1

Publication date: 2025-02-27

Application number: US18726881

Filing date: 2023-01-05

Applicant: Google LLC

Inventor: Yinxiao Li, Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar

IPC: G06V10/82, G06V10/764, G06V10/77

Abstract: Provided are machine learning systems and models featuring resolution-flexible multi-axis attention blocks. In particular, the present disclosure provides example multi-axis MLP based architectures (example implementations of which can be generally referred to as MAXIM) that can serve as an efficient and flexible general-purpose vision backbone for image processing tasks. In some implementations, MAXIM can use a UNet-shaped hierarchical structure and supports long-range interactions enabled by spatially-gated MLPs. Specifically, some example implementations of MAXIM can contain two MLP-based building blocks: a multi-axis gated MLP that allows for efficient and scalable spatial mixing of local and global visual cues, and a cross-gating block, an alternative to cross-attention, which accounts for cross-feature mutual conditioning.

3.

Granted invention patent
Methods, systems, and media for determining perceptual quality indicators of video content items (Active)

Publication number: US12206914B2

Publication date: 2025-01-21

Application number: US18021636

Filing date: 2022-06-08

Applicant: Google LLC

Inventor: Yilin Wang, Balineedu Adsumilli, Junjie Ke, Hossein Talebi, Joong Yim, Neil Birkbeck, Peyman Milanfar, Feng Yang

IPC: H04N21/266, G06N3/045, H04N17/02, H04N19/154, H04N21/234, H04N21/434, H04N21/44, H04N21/466

Abstract: Methods, systems, and media for determining perceptual quality indicators of video content items are provided. In some embodiments, the method comprises: receiving a video content item; extracting a plurality of frames from the video content item; determining, using a first subnetwork of a deep neural network, a content quality indicator for each frame of the plurality of frames of the video content item; determining, using a second subnetwork of the deep neural network, a video distortion indicator for each frame of the plurality of frames of the video content item; determining, using a third subnetwork of the deep neural network, a compression sensitivity indicator for each frame of the plurality of frames of the video content item; generating a quality level for each frame of the plurality of frames of the video content item that concatenates the content quality indicator, the video distortion indicator, and the compression sensitivity indicator for that frame of the video content item; generating an overall quality level for video content item by aggregating the quality level of each frame of the plurality of frames; and causing a video recommendation to be presented based on the overall quality level of the video content item.
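The per-frame concatenation and aggregation steps in this abstract can be illustrated with a minimal Python sketch. The three indicator vectors stand in for the outputs of the three subnetworks; the function names and the element-wise mean used as the aggregator are assumptions for illustration, not details taken from the patent.

```python
from statistics import mean
from typing import List, Sequence


def frame_quality(content: Sequence[float],
                  distortion: Sequence[float],
                  compression: Sequence[float]) -> List[float]:
    """Concatenate the three per-frame indicators into one quality vector."""
    return [*content, *distortion, *compression]


def overall_quality(frame_levels: Sequence[Sequence[float]]) -> List[float]:
    """Aggregate per-frame quality vectors into one video-level vector.

    The element-wise mean is an assumed aggregator; the abstract only says
    the per-frame levels are aggregated.
    """
    if not frame_levels:
        raise ValueError("need at least one frame")
    return [mean(column) for column in zip(*frame_levels)]


# Two frames, each with one value per indicator (hypothetical numbers).
frames = [
    frame_quality([0.5], [0.25], [1.0]),
    frame_quality([1.5], [0.75], [0.0]),
]
video_level = overall_quality(frames)
```

A downstream recommendation step would then consume `video_level`; that step is outside the scope of this sketch.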

4.

Invention patent application
Multi-Axis Vision Transformer (Active)

Publication number: US20250022269A1

Publication date: 2025-01-16

Application number: US18902546

Filing date: 2024-09-30

Applicant: Google LLC

Inventor: Yinxiao Li, Feng Yang, Peyman Milanfar, Han Zhang, Zhengzhong Tu, Hossein Talebi

IPC: G06V10/82, G06V10/77

Abstract: Provided is an efficient and scalable attention model that can be referred to as multi-axis attention. Example implementations can include two aspects: blocked local and dilated global attention. These design choices allow global-local spatial interactions on arbitrary input resolutions with only linear complexity. The present disclosure also presents a new architectural element by effectively blending the proposed multi-axis attention model with convolutions. In addition, the present disclosure proposes a simple hierarchical vision backbone, example implementations of which can be referred to as MaxViT, by simply repeating the basic building block over multiple stages. Notably, MaxViT is able to “see” globally throughout the entire network, even in earlier, high-resolution stages.
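As a rough illustration of the "blocked local" and "dilated global" aspects named in this abstract, the Python sketch below groups pixel positions in the two ways attention could be restricted. It only computes the index groups, not attention itself. The function names, and the reading of "dilated global" as strided sampling over a fixed-size grid, are assumptions.

```python
from typing import List, Tuple

Position = Tuple[int, int]


def block_partition(h: int, w: int, p: int) -> List[List[Position]]:
    """Group positions into non-overlapping p x p windows (local mixing)."""
    if h % p or w % p:
        raise ValueError("h and w must be divisible by p")
    return [
        [(r, c) for r in range(br, br + p) for c in range(bc, bc + p)]
        for br in range(0, h, p)
        for bc in range(0, w, p)
    ]


def grid_partition(h: int, w: int, g: int) -> List[List[Position]]:
    """Group positions into g x g sets sampled at a fixed stride (global mixing)."""
    if h % g or w % g:
        raise ValueError("h and w must be divisible by g")
    stride_r, stride_c = h // g, w // g
    return [
        [(r, c) for r in range(off_r, h, stride_r) for c in range(off_c, w, stride_c)]
        for off_r in range(stride_r)
        for off_c in range(stride_c)
    ]


local_groups = block_partition(4, 4, 2)
global_groups = grid_partition(4, 4, 2)
```

Each group holds a fixed number of positions (`p * p` or `g * g`), so the number of groups, and with it the total attention cost under this scheme, grows linearly with the number of pixels.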

5.

Invention patent application
Systems and Techniques for Retraining Models for Video Quality Assessment and for Transcoding Using the Retrained Models (Active)

Publication number: US20220415039A1

Publication date: 2022-12-29

Application number: US17762289

Filing date: 2019-11-26

Applicant: Google LLC

Inventor: Yilin Wang, Hossein Talebi, Peyman Milanfar, Feng Yang, Balineedu Adsumilli

IPC: G06V10/98, G06V10/82, G06V20/40, G06N3/04

Abstract: A trained model is retrained for video quality assessment and used to identify sets of adaptive compression parameters for transcoding user generated video content. Using transfer learning, the model, which is initially trained for image object detection, is retrained for technical content assessment and then again retrained for video quality assessment. The model is then deployed into a transcoding pipeline and used for transcoding an input video stream of user generated content. The transcoding pipeline may be structured in one of several ways. In one example, a secondary pathway for video content analysis using the model is introduced into the pipeline, which does not interfere with the ultimate output of the transcoding should there be a network or other issue. In another example, the model is introduced as a library within the existing pipeline, which would maintain a single pathway, but ultimately is not expected to introduce significant latency.

6.

Invention patent application
IMAGE EXTENSION NEURAL NETWORKS (Active)

Publication number: US20220148299A1

Publication date: 2022-05-12

Application number: US17438687

Filing date: 2019-07-19

Applicant: Google LLC

Inventor: Mikael Pierre Bonnevie, Aaron Maschinot, Aaron Sarna, Shuchao Bi, Jingbin Wang, Michael Spencer Krainin, Wenchao Tong, Dilip Krishnan, Haifeng Gong, Ce Liu, Hossein Talebi, Raanan Sayag, Piotr Teterwak

IPC: G06V10/82, G06N3/04, G06T7/10

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating realistic extensions of images. In one aspect, a method comprises providing an input that comprises a provided image to a generative neural network having a plurality of generative neural network parameters. The generative neural network processes the input in accordance with trained values of the plurality of generative neural network parameters to generate an extended image. The extended image has (i) more rows, more columns, or both than the provided image, and (ii) is predicted to be a realistic extension of the provided image. The generative neural network is trained using an adversarial loss objective function.

7.

Granted invention patent
Image extension neural networks (Active)

Publication number: US12236676B2

Publication date: 2025-02-25

Application number: US17438687

Filing date: 2019-07-19

Applicant: Google LLC

Inventor: Mikael Pierre Bonnevie, Aaron Maschinot, Aaron Sarna, Shuchao Bi, Jingbin Wang, Michael Spencer Krainin, Wenchao Tong, Dilip Krishnan, Haifeng Gong, Ce Liu, Hossein Talebi, Raanan Sayag, Piotr Teterwak

IPC: G06K9/00, G06N3/045, G06T7/10, G06V10/82

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating realistic extensions of images. In one aspect, a method comprises providing an input that comprises a provided image to a generative neural network having a plurality of generative neural network parameters. The generative neural network processes the input in accordance with trained values of the plurality of generative neural network parameters to generate an extended image. The extended image has (i) more rows, more columns, or both than the provided image, and (ii) is predicted to be a realistic extension of the provided image. The generative neural network is trained using an adversarial loss objective function.

8.

Published invention application
MULTILAYER LAPLACIAN RESIZER FOR COMPUTER VISION SYSTEMS (Pending, published)

Publication number: US20240331091A1

Publication date: 2024-10-03

Application number: US18621434

Filing date: 2024-03-29

Applicant: Google LLC

Inventor: Hossein Talebi, Zhengzhong Tu, Peyman Milanfar

IPC: G06T5/20, G06T5/50

CPC classification number: G06T5/20, G06T5/50, G06T2207/20024, G06T2207/20212

Abstract: The technology provides an image resizer that is jointly trainable with neural network classification (recognition) models, and is designed to improve classification performance. Systems and method include applying an input image to a baseline resizer to obtain a default resized image, and applying the input image to a plurality of filters. Each respective filter in the plurality is configured to perform sub-band filtering on the input image to obtain a sub-band filtered result. This includes applying the sub-band filtered result to the baseline resizer to obtain a respective resized result, and also includes applying to the respective resized result a scaling parameter, a bias parameter, and a nonlinear function to obtain a respective filtered image. The process then combines the default resized image and the respective filtered images to generate a combined resized image.
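The data flow in this abstract (a baseline resize, plus per-filter branches that are filtered, resized, scaled, biased, passed through a nonlinearity, and then combined) can be sketched in Python on a 1-D signal. Everything concrete here is an assumption made for illustration: average pooling as the baseline resizer, `tanh` as the nonlinearity, summation as the combination rule, and the toy filter. In the patent the scale and bias would be learned jointly with the classifier; here they are plain numbers.

```python
import math
from typing import Callable, List, Sequence

Signal = List[float]


def average_pool(x: Sequence[float], factor: int = 2) -> Signal:
    """Assumed baseline resizer: downsample by averaging adjacent samples."""
    return [sum(x[i:i + factor]) / factor
            for i in range(0, len(x) - factor + 1, factor)]


def combined_resize(x: Sequence[float],
                    filters: Sequence[Callable[[Sequence[float]], Signal]],
                    scales: Sequence[float],
                    biases: Sequence[float],
                    resize: Callable[[Sequence[float]], Signal] = average_pool,
                    nonlinearity: Callable[[float], float] = math.tanh) -> Signal:
    """Add each filtered, resized, rescaled branch to the default resized signal."""
    out = resize(x)  # default resized result
    for sub_band, scale, bias in zip(filters, scales, biases):
        branch = resize(sub_band(x))  # filter first, then resize
        out = [o + nonlinearity(scale * v + bias) for o, v in zip(out, branch)]
    return out


def detail_band(x: Sequence[float]) -> Signal:
    """Toy sub-band filter: each sample minus the mean of the whole signal."""
    m = sum(x) / len(x)
    return [v - m for v in x]


signal = [1.0, 3.0, 5.0, 7.0]
baseline_only = combined_resize(signal, [detail_band], scales=[0.0], biases=[0.0])
with_detail = combined_resize(signal, [detail_band], scales=[0.5], biases=[0.0])
```

With a zero scale and bias the branch contributes `tanh(0) = 0`, so the result falls back to the baseline resize. This makes the baseline a natural starting point under these assumptions.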

9.

Published invention application
METHODS, SYSTEMS, AND MEDIA FOR DETERMINING PERCEPTUAL QUALITY INDICATORS OF VIDEO CONTENT ITEMS (Pending, published)

Publication number: US20230319327A1

Publication date: 2023-10-05

Application number: US18021636

Filing date: 2022-06-08

Applicant: Google LLC

Inventor: Yilin Wang, Balineedu Adsumilli, Junjie Ke, Hossein Talebi, Joong Yim, Neil Birkbeck, Peyman Milanfar, Feng Yang

IPC: H04N21/234, H04N19/154, H04N21/466

CPC classification number: H04N21/23418, H04N19/154, H04N21/4668

Abstract: Methods, systems, and media for determining perceptual quality indicators of video content items are provided. In some embodiments, the method comprises: receiving a video content item; extracting a plurality of frames from the video content item; determining, using a first subnetwork of a deep neural network, a content quality indicator for each frame of the plurality of frames of the video content item; determining, using a second subnetwork of the deep neural network, a video distortion indicator for each frame of the plurality of frames of the video content item; determining, using a third subnetwork of the deep neural network, a compression sensitivity indicator for each frame of the plurality of frames of the video content item; generating a quality level for each frame of the plurality of frames of the video content item that concatenates the content quality indicator, the video distortion indicator, and the compression sensitivity indicator for that frame of the video content item; generating an overall quality level for video content item by aggregating the quality level of each frame of the plurality of frames; and causing a video recommendation to be presented based on the overall quality level of the video content item.

10.

Invention patent application
GENERATING QUANTIZATION TABLES FOR IMAGE COMPRESSION (Active)

Publication number: US20230130410A1

Publication date: 2023-04-27

Application number: US17918170

Filing date: 2020-04-17

Applicant: Google LLC

Inventor: Xiyang Luo, Feng Yang, Hossein Talebi

IPC: G06T9/00, G06T3/40

Abstract: Methods, systems, and computer programs encoded on a computer storage medium, that relate to generating quantization tables that are used during digital image compression of a digital image. Multiple training images are obtained. A model can be trained using the training images to generate a quantization table that can be used during encoding of an input image. For each training image, a quantization table can be obtained using the model. Using the quantization table, an encoded digital image is obtained for the training image. Using the encoded digital image and the training image, an image quality loss and a compression loss can be determined. An overall loss of the model can be determined by combining the image quality loss and the compression loss for the training image. The model can be updated based on the overall loss.
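The loss construction in this abstract (an image quality loss and a compression loss combined into one overall loss) can be sketched in Python for a single block of transform coefficients. The specific choices are assumptions, not taken from the patent: mean squared error as the quality loss, mean absolute quantized level as a stand-in for compressed size, and a weighted sum with a hypothetical `rate_weight` as the combination.

```python
from typing import List, Sequence


def quantize(coeffs: Sequence[float], table: Sequence[float]) -> List[int]:
    """Divide each coefficient by its table entry and round to an integer level."""
    return [round(c / q) for c, q in zip(coeffs, table)]


def dequantize(levels: Sequence[int], table: Sequence[float]) -> List[float]:
    """Reconstruct coefficients from the quantized levels."""
    return [level * q for level, q in zip(levels, table)]


def overall_loss(coeffs: Sequence[float],
                 table: Sequence[float],
                 rate_weight: float = 1.0) -> float:
    """Combine an assumed quality loss (MSE) with an assumed rate proxy."""
    levels = quantize(coeffs, table)
    recon = dequantize(levels, table)
    quality_loss = sum((c - r) ** 2 for c, r in zip(coeffs, recon)) / len(coeffs)
    compression_loss = sum(abs(level) for level in levels) / len(levels)
    return quality_loss + rate_weight * compression_loss


# Hypothetical coefficients and a candidate table produced by the model.
loss = overall_loss([9.0, 4.0], [4.0, 4.0], rate_weight=2.0)
```

Plain rounding has zero gradient almost everywhere, so a trainable version would presumably swap in a differentiable approximation. The abstract does not say which one.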
