-
Publication number: US12045279B2
Publication date: 2024-07-23
Application number: US17538880
Application date: 2021-11-30
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ji Li, Adit Krishnan, Amit Srivastava, Han Hu, Qi Dai, Yixuan Wei, Yue Cao
CPC classification number: G06F16/5866, G06F16/51, G06F16/56, G06N20/00
Abstract: A system and method for retrieving one or more visual assets includes receiving a search query for the one or more visual assets, the search query including textual data, encoding the textual data into one or more text embedding representations via a trained text representation machine-learning (ML) model, transmitting the one or more text embedding representations to a matching and selection unit, providing visual embedding representations of one or more visual assets to the matching and selection unit, comparing, by the matching and selection unit, the one or more text embedding representations to the visual embedding representations to identify one or more visual asset search results, and providing the one or more visual asset search results for display.
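The comparison step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the matching and selection unit compares embeddings, so cosine similarity is an assumption here, as are the function names, the asset names, and the hand-written vectors that stand in for the output of the trained models.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_and_select(text_embedding, visual_embeddings, top_k=2):
    """Hypothetical matching and selection unit: rank assets by similarity to the query."""
    scored = [
        (asset_id, cosine_similarity(text_embedding, embedding))
        for asset_id, embedding in visual_embeddings.items()
    ]
    # Highest similarity first; keep only the top_k results.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up embeddings; a real system would obtain these from trained encoders.
visual_embeddings = {
    "beach.png": [0.9, 0.1, 0.0],
    "forest.png": [0.1, 0.9, 0.1],
    "city.png": [0.0, 0.2, 0.9],
}
query_embedding = [0.8, 0.2, 0.1]

results = match_and_select(query_embedding, visual_embeddings)
for asset_id, score in results:
    print(f"{asset_id}: {score:.3f}")
```

Because the text and visual embeddings live in a shared vector space, the visual embeddings can be computed ahead of time and only the query needs encoding at search time.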
-
Publication number: US11961261B2
Publication date: 2024-04-16
Application number: US17514836
Application date: 2021-10-29
Applicant: Microsoft Technology Licensing, LLC
CPC classification number: G06T7/97, G06N20/20, G06T3/40, G06T7/11, G06T2207/20081, G06T2207/20132
Abstract: A scheme for modifying an image is disclosed, which includes receiving a source image having a first image configuration; determining a second image configuration for a target image; providing the received source image to an AI engine trained to identify, based on a set of rules related to visual features, candidate regions from the source image; generating proposal images based on the candidate regions, respectively; determining, based on prior aesthetical evaluation data, an aesthetical value of each regional proposal image; and selecting, based on the determined aesthetical value of each regional proposal image, one of the regional proposal images as the target image; extracting, from the AI engine, the target image; and causing the target image to be displayed via a display of a user device.
-
Publication number: US11935154B2
Publication date: 2024-03-19
Application number: US17684889
Application date: 2022-03-02
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ji Li, Fatima Zohra Daha, Bei Liu, Huan Yang, Jianlong Fu
IPC: G06T7/00, G06F3/0482, G06T11/00
CPC classification number: G06T11/00, G06F3/0482, G06T7/0002, G06T2200/24, G06T2207/20081, G06T2207/30168
Abstract: A method and system for transforming an input image via a plurality of image transformation stylizers includes receiving the input image; providing the input image, information about the plurality of image transformation stylizers and at least one of user data, history data, and contextual data to a trained machine-learning (ML) model for selecting a subset of the plurality of image transformation stylizers; receiving as an output from the ML model the subset of image transformation stylizers; executing the subset of the image transformation stylizers on the input image to generate a plurality of transformed output images; ranking the plurality of transformed output images based on at least one of the input image, the user data, the history data, and the contextual data; and providing the ranked plurality of transformed output images for display.
-
Publication number: US11909922B2
Publication date: 2024-02-20
Application number: US18155918
Application date: 2023-01-18
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ji Li, Amit Srivastava, Derek Martin Johnson, Priyanka Vikram Sinha, Konstantin Seleskerov, Gencheng Wu
IPC: H04M3/56, G06N3/08, H04L12/18, H04L65/401, H04L65/403
CPC classification number: H04M3/568, G06N3/08, H04L12/1818, H04L12/1822, H04L65/403, H04L65/4015
Abstract: The present disclosure relates to processing operations configured to provide processing that automatically analyzes acoustic signals from attendees of a live presentation and automatically triggers corresponding reaction indications from results of analysis thereof. Exemplary reaction indications provide feedback for live presentations that can be presented in real-time (or near real-time) without requiring a user to manually take action to provide any feedback. As a non-limiting example, reaction indications may be presented in a form that is easy to visualize and understand such as emojis or icons. Another example of a reaction indication is a graphical user interface (GUI) notification that provides a predictive indication of user intent derived from analysis of acoustic signals. Further examples described herein extend to training and application of artificial intelligence (AI) processing, in real-time (or near real-time), that is configured to automatically analyze acoustic features of audio streams and automatically generate exemplary reaction indications.
-
Publication number: US11841911B2
Publication date: 2023-12-12
Application number: US17530982
Application date: 2021-11-19
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ji Li, Amit Srivastava, Adit Krishnan, Aman Malik
IPC: G06F17/00, G06F16/953, G06N20/00
CPC classification number: G06F16/953, G06N20/00
Abstract: A data processing system implements receiving query text for a search query for textual content recommendation. The query text includes one or more words indicating a type of textual content items being sought. The system implements analyzing the query text using a first machine learning (ML) model to obtain encoded query text, where the first ML model is trained to identify features within the query text and to generate the encoded query text by mapping the features to a hyper-dimensional latent space (HDLS). The system implements identifying one or more content items in a database of encoded content items mapped to the HDLS that satisfy the search query by comparing attributes of the encoded query text with attributes of the encoded content items to identify content items that are closest to the encoded query text within the HDLS, and causing the one or more content items to be displayed.
-
Publication number: US11556183B1
Publication date: 2023-01-17
Application number: US17490677
Application date: 2021-09-30
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ji Li, Mingxi Cheng, Fatima Zohra Daha, Amit Srivastava
IPC: G06F3/0482, G06F3/01, G06N3/04, G06V40/20, G06V40/10
Abstract: A method and system for generating training data for training a gesture detection machine-learning (ML) model includes receiving a request to generate training data for the gesture detection model, the training data being associated with a target gesture, retrieving data associated with an original gesture, the original gesture being a gesture made using a body part, retrieving skeleton data associated with the target gesture, the skeleton data displaying a skeleton representative of the body part and the skeleton displaying the target gesture, aligning a location of the body part in the data with a location of the skeleton in the skeleton data, providing the aligned data and the skeleton data to an ML model for generating a target data that displays the target gesture, receiving the target data as an output from the ML model, the target data preserving a visual feature of the data and displaying the target gesture, and providing the target data to the gesture detection ML model.
-
Publication number: US11270059B2
Publication date: 2022-03-08
Application number: US16552210
Application date: 2019-08-27
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ji Li, Xiaozhi Yu, Gregory Alexander DePaul, Youjun Liu, Amit Srivastava
IPC: G06F40/106, G06N20/00
Abstract: A textual user input is received and a plurality of different text-to-content models are run on the textual user input. A selection system attempts to identify a suggested content item, based upon the outputs of the text-to-content models. The selection system first attempts to generate a completed suggestion based on outputs from a single text-to-content model. It then attempts to mix the outputs of the text-to-content models to obtain a completed content suggestion.
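The two-stage selection described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not define what makes a suggestion "completed", so the sketch assumes a suggestion is complete when every required part is filled. The part names, model names, and the `select_suggestion` function are all hypothetical.

```python
# Hypothetical parts that a finished content suggestion must contain.
REQUIRED_PARTS = ("icon", "image")


def select_suggestion(model_outputs, required=REQUIRED_PARTS):
    """Return (suggestion, contributing_models), or (None, []) if nothing completes."""
    # Stage 1: try to build a complete suggestion from a single model's output.
    for name, output in model_outputs.items():
        if all(output.get(part) is not None for part in required):
            return {part: output[part] for part in required}, [name]

    # Stage 2: mix outputs, taking each part from the first model that supplies it.
    mixed, sources = {}, []
    for part in required:
        for name, output in model_outputs.items():
            if output.get(part) is not None:
                mixed[part] = output[part]
                if name not in sources:
                    sources.append(name)
                break
        else:
            # No model produced this part, so no completed suggestion exists.
            return None, []
    return mixed, sources


# Made-up outputs from two text-to-content models for one textual input.
outputs = {
    "text_to_icon": {"icon": "sun", "image": None},
    "text_to_image": {"image": "beach.jpg"},
}
suggestion, used_models = select_suggestion(outputs)
print(suggestion, used_models)
```

Trying single-model completion first keeps a suggestion internally consistent when one model can cover the whole input; mixing is the fallback when no single model can.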
-
Publication number: US12032922B2
Publication date: 2024-07-09
Application number: US17318170
Application date: 2021-05-12
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ji Li, Konstantin Seleskerov, Huey-Ru Tsai, Muin Barkatali Momin, Ramya Tridandapani, Sindhu Vigasini Jambunathan, Amit Srivastava, Derek Martin Johnson, Gencheng Wu, Sheng Zhao, Xinfeng Chen, Bohan Li
IPC: G06F40/58, G06F3/0481, G06F16/2457, G06F40/205, G10L13/02
CPC classification number: G06F40/58, G06F3/0481, G06F16/24578, G06F40/205, G10L13/02
Abstract: Automatic generation of intelligent content is created using a system of computers including a user device and a cloud-based component that processes the user information. The system performs a process that includes receiving an input document and parsing the input document to generate inputs for a natural language generation model using a text analysis model. The natural language generation model generates one or more candidate presentation scripts based on the inputs. A presentation script is selected from the candidate presentation scripts and displayed. A text-to-speech model may be used to generate a synthesized audio presentation of the presentation script. A final presentation may be generated that includes a visual display of the input document and the corresponding audio presentation in sync with the visual display.
-
Publication number: US11900052B2
Publication date: 2024-02-13
Application number: US17095603
Application date: 2020-11-11
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ji Li, Amit Srivastava, Mingxi Cheng
IPC: G06N3/02, G06N3/088, G06N5/04, G06F40/186, G06N20/00, G06F40/103, G06V10/40
CPC classification number: G06F40/186, G06F40/103, G06N3/02, G06N3/088, G06N5/04, G06N20/00, G06V10/40
Abstract: The present disclosure applies trained artificial intelligence (AI) processing adapted to automatically generating transformations of formatted templates. Pre-existing formatted templates (e.g., slide-based presentation templates) are leveraged by the trained AI processing to automatically generate a plurality of high-quality template transformations. In transforming a formatted template, the trained AI processing not only generates feature transformation of objects thereof but may also provide style transformations where attributes associated with a presentation theme may be modified for a formatted template or set of formatted templates. The trained AI processing is novel in that it is tailored for analysis of feature data of a specific type of formatted template. The trained AI processing converts a formatted template into a feature vector and utilizes conditioned generative modeling to generate one or more transformed templates using a representation of the feature data and feature data from one or more other formatted templates.
-
Publication number: US11507677B2
Publication date: 2022-11-22
Application number: US16276908
Application date: 2019-02-15
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ji Li, Youjun Liu, Amit Srivastava
Abstract: The present disclosure relates to processing operations that execute image classification training for domain-specific traffic, where training operations are entirely compliant with data privacy regulations and policies. Image classification model training, as described herein, is configured to classify meaningful image categories in domain-specific scenarios where there is unknown data traffic and strict data compliance requirements that result in privacy-limited image data sets. Iterative image classification training satisfies data compliance requirements through a combination of online image classification training and offline image classification training. This results in tuned image recognition classifiers that have improved accuracy and efficiency over general image recognition classifiers when working with domain-specific data traffic. One or more image recognition classifiers are independently trained and tuned to detect an image class for image classification. Training of independent image recognition classifiers is also utilized for training and tuning of deeper learning models for image classification.