-
Publication No.: US20220343461A1
Publication Date: 2022-10-27
Application No.: US17240396
Filing Date: 2021-04-26
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventor: Ji LI, Huan YANG, Jianlong FU
Abstract: A system and method for rich content transformation are provided. The system and method allow rich content transformation to be split between a client device and a cloud-based server. The client device downsizes rich content and transmits the downsized rich content to the cloud-based server via a network. The cloud-based server calculates function parameters based on the downsized rich content using one or more machine learning models included in the server. The calculated function parameters are transmitted to the client device via the network. The client device then applies these function parameters to the original rich content on the client device to obtain the transformed rich content.
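The client/server split described in this abstract can be illustrated with a minimal sketch that simulates both sides in one process. The grayscale list-of-rows image, the 2x average-pool downsizing, and the gain/offset pair used as "function parameters" are all assumptions chosen for illustration; the abstract does not say which function or machine learning model is used.

```python
# Hypothetical sketch: every name and the gain/offset transform are assumptions.

def downsize(image, factor=2):
    """Client side: average-pool a grayscale image (list of rows) by `factor`."""
    h, w = len(image) // factor, len(image[0]) // factor
    return [
        [
            sum(image[y * factor + dy][x * factor + dx]
                for dy in range(factor) for dx in range(factor)) / factor ** 2
            for x in range(w)
        ]
        for y in range(h)
    ]

def server_compute_parameters(small):
    """Server side: stand-in for the ML model. Returns (gain, offset) that
    stretch the observed intensity range to [0, 1]."""
    pixels = [p for row in small for p in row]
    lo, hi = min(pixels), max(pixels)
    gain = 1.0 / (hi - lo) if hi > lo else 1.0
    return gain, -lo * gain

def client_apply(image, params):
    """Client side: apply the returned parameters to the full-resolution image."""
    gain, offset = params
    return [[min(1.0, max(0.0, gain * p + offset)) for p in row] for row in image]

full = [
    [0.2, 0.2, 0.4, 0.4],
    [0.2, 0.2, 0.4, 0.4],
    [0.6, 0.6, 0.8, 0.8],
    [0.6, 0.6, 0.8, 0.8],
]
small = downsize(full)                     # only the small image goes to the server
params = server_compute_parameters(small)  # only the parameters come back
result = client_apply(full, params)        # full-resolution work stays on the client
```

In this sketch only the 2x2 image and two numbers cross the network, while the 4x4 original never leaves the client.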
-
Publication No.: US20200160124A1
Publication Date: 2020-05-21
Application No.: US16631923
Filing Date: 2018-05-29
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventor: Jianlong FU, Tao MEI
Abstract: In accordance with implementations of the subject matter described herein, a solution for fine-grained image recognition is proposed. This solution includes extracting a global feature of an image using a first sub-network of a first learning network; determining a first attention region of the image based on the global feature using a second sub-network of the first learning network, the first attention region including a discriminative portion of an object in the image; extracting a first local feature of the first attention region using a first sub-network of a second learning network; and determining a category of the object in the image based at least in part on the first local feature. Through this solution, it is possible to localize an image region at a finer scale accurately such that a local feature at a fine scale can be obtained for object recognition.
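The coarse-to-fine flow in this abstract (global feature, attention region, local feature, category) might be sketched as follows. The mean-intensity "feature", the exhaustive window search, and the threshold classifier are hypothetical stand-ins for the learned sub-networks, which the abstract does not specify.

```python
# Hypothetical sketch: simple arithmetic replaces the learned sub-networks.

def extract_feature(image):
    """Stand-in feature extractor: mean intensity of the image."""
    return sum(sum(row) for row in image) / (len(image) * len(image[0]))

def propose_attention_region(image, feature, size=2):
    """Stand-in region proposal: pick the size x size window whose pixels
    deviate most from the global feature. Returns (top, left, size)."""
    best, best_score = (0, 0, size), float("-inf")
    for top in range(len(image) - size + 1):
        for left in range(len(image[0]) - size + 1):
            score = sum(abs(image[top + dy][left + dx] - feature)
                        for dy in range(size) for dx in range(size))
            if score > best_score:
                best, best_score = (top, left, size), score
    return best

def crop(image, region):
    """Cut the attention region out of the image."""
    top, left, size = region
    return [row[left:left + size] for row in image[top:top + size]]

def classify(local_feature, threshold=0.5):
    """Hypothetical classifier operating on the local feature."""
    return "class_a" if local_feature > threshold else "class_b"

image = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.9],
    [0.1, 0.1, 0.9, 0.9],
    [0.1, 0.1, 0.1, 0.1],
]
global_feat = extract_feature(image)                   # first network, first sub-network
region = propose_attention_region(image, global_feat)  # first network, second sub-network
patch = crop(image, region)
local_feat = extract_feature(patch)                    # second network, first sub-network
category = classify(local_feat)
```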
-
Publication No.: US20240185602A1
Publication Date: 2024-06-06
Application No.: US18278356
Filing Date: 2022-02-24
Applicant: Microsoft Technology Licensing, LLC
Inventor: Bei LIU, Jianlong FU
Abstract: According to implementations of the present disclosure, a solution for cross-modal processing is provided. In this solution, a set of visual features of a training image is extracted according to a visual feature extraction sub-model in a target model. Each visual feature corresponds to a pixel block in the training image. A set of visual semantic features corresponding to the set of visual features is determined based on a visual semantic dictionary. A set of text features of a training text corresponding to the training image is extracted according to a text feature extraction sub-model in the target model. Each text feature corresponds to at least one word in the training text. The target model is trained based on the set of visual semantic features and the set of text features to determine association information between an input text and an input image.
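One plausible reading of the dictionary step in this abstract is a nearest-neighbor lookup, sketched below. The abstract does not state how the visual semantic dictionary is queried, so the squared-distance lookup, the function names, and the toy vectors are assumptions made purely for illustration.

```python
# Hypothetical sketch: nearest-neighbor lookup is an assumed reading of the
# "visual semantic dictionary" step, not a method confirmed by the abstract.

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def to_visual_semantics(visual_features, dictionary):
    """Map each per-block visual feature to (index, vector) of its nearest
    dictionary entry."""
    result = []
    for feat in visual_features:
        idx = min(range(len(dictionary)),
                  key=lambda i: squared_distance(feat, dictionary[i]))
        result.append((idx, dictionary[idx]))
    return result

dictionary = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]        # toy dictionary entries
visual_features = [[0.9, 0.1], [0.1, 0.8], [0.05, 0.0]]  # one vector per pixel block
semantic = to_visual_semantics(visual_features, dictionary)
indices = [idx for idx, _ in semantic]
```

Under this reading, blocks that look alike share a dictionary entry, giving the model a discrete visual vocabulary to align with the text features.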
-
Publication No.: US20230281884A1
Publication Date: 2023-09-07
Application No.: US17684889
Filing Date: 2022-03-02
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ji LI, Fatima Zohra DAHA, Bei LIU, Huan YANG, Jianlong FU
IPC: G06T11/00, G06F3/0482, G06T7/00
CPC classification number: G06T11/00, G06F3/0482, G06T7/0002, G06T2200/24, G06T2207/30168, G06T2207/20081
Abstract: A method and system for transforming an input image via a plurality of image transformation stylizers includes receiving the input image; providing the input image, information about the plurality of image transformation stylizers and at least one of user data, history data, and contextual data to a trained machine-learning (ML) model for selecting a subset of the plurality of image transformation stylizers; receiving as an output from the ML model the subset of image transformation stylizers; executing the subset of the image transformation stylizers on the input image to generate a plurality of transformed output images; ranking the plurality of transformed output images based on at least one of the input image, the user data, the history data, and the contextual data; and providing the ranked plurality of transformed output images for display.
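The select, execute, and rank steps in this abstract can be sketched as a small pipeline. The tag-overlap selector stands in for the trained ML model, and the stylizer names, tags, and history counts are hypothetical data invented for the example.

```python
# Hypothetical sketch: a tag-overlap score replaces the trained ML model.

def select_stylizers(stylizers, user_prefs, k=2):
    """Stand-in selector: keep the k stylizers whose tags overlap most with
    the user's preferred tags."""
    scored = sorted(stylizers, key=lambda s: len(s["tags"] & user_prefs), reverse=True)
    return scored[:k]

def rank_outputs(outputs, history):
    """Order outputs by how often the user previously chose each stylizer."""
    return sorted(outputs, key=lambda o: history.get(o["stylizer"], 0), reverse=True)

stylizers = [
    {"name": "invert", "tags": {"bold"},
     "apply": lambda img: [[1.0 - p for p in row] for row in img]},
    {"name": "darken", "tags": {"moody", "bold"},
     "apply": lambda img: [[p * 0.5 for p in row] for row in img]},
    {"name": "identity", "tags": {"plain"},
     "apply": lambda img: [row[:] for row in img]},
]
user_prefs = {"bold", "moody"}         # user data
history = {"invert": 5, "darken": 2}   # history data
image = [[0.2, 0.8]]                   # input image as a list of rows

subset = select_stylizers(stylizers, user_prefs)
outputs = [{"stylizer": s["name"], "image": s["apply"](image)} for s in subset]
ranked = rank_outputs(outputs, history)
ranked_names = [o["stylizer"] for o in ranked]
```

Selection and ranking are kept as separate functions because the abstract treats them as distinct steps that may draw on different signals.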
-
Publication No.: US20230177643A1
Publication Date: 2023-06-08
Application No.: US17919723
Filing Date: 2021-04-20
Applicant: Microsoft Technology Licensing, LLC
Inventor: Huan YANG, Jianlong FU, Baining GUO
CPC classification number: G06T3/4053, G06T7/40, G06T3/4046, G06T2207/20081, G06T2207/20084, G06T2207/20021
Abstract: A solution for image processing is provided. In this solution, first information and second information are determined based on texture features of an input image and a reference image. The first information at least indicates, for a first pixel block in the input image, a second pixel block in the reference image most relevant to the first pixel block in terms of the texture features, and the second information at least indicates a relevance of the first pixel block to the second pixel block. A transferred feature map with a target resolution is determined based on the first information and the reference image. The input image is transformed into an output image with the target resolution based on the transferred feature map and the second information. The output image reflects a texture feature of the reference image.
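The two kinds of information in this abstract can be illustrated with a block-matching sketch: an index of the best-matching reference block (first information) and a relevance score for that match (second information). Cosine similarity, the toy feature vectors, and the function names are assumptions; the abstract does not name a similarity measure, and the upscaling to the target resolution is omitted here.

```python
import math

# Hypothetical sketch: cosine similarity is an assumed relevance measure.

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def match_blocks(input_feats, ref_feats):
    """For each input block, return the index of the most relevant reference
    block (first information) and its relevance score (second information)."""
    indices, relevance = [], []
    for q in input_feats:
        scores = [cosine(q, k) for k in ref_feats]
        best = max(range(len(scores)), key=scores.__getitem__)
        indices.append(best)
        relevance.append(scores[best])
    return indices, relevance

def gather(ref_values, indices):
    """Build the transferred feature map by picking reference features by index."""
    return [list(ref_values[idx]) for idx in indices]

def weight(transferred, relevance):
    """Scale each transferred block by how relevant its match was."""
    return [[r * v for v in block] for block, r in zip(transferred, relevance)]

input_feats = [[1.0, 0.0], [0.0, 2.0]]      # texture features of input blocks
ref_feats = [[0.0, 1.0], [3.0, 0.0]]        # texture features of reference blocks
ref_values = [[10.0, 20.0], [30.0, 40.0]]   # reference features to transfer

indices, relevance = match_blocks(input_feats, ref_feats)
transferred = gather(ref_values, indices)
weighted = weight(transferred, relevance)
```

Weighting by relevance lets poorly matched blocks contribute less reference texture than well matched ones.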
-