-
Publication No.: US20230401831A1
Publication date: 2023-12-14
Application No.: US17837636
Filing date: 2022-06-10
Applicant: Microsoft Technology Licensing, LLC
Inventor: Adit KRISHNAN , Ji Li , Yixuan WEI , Xiaozhi YU , Han HU , Qi DAI
IPC: G06V10/776 , G06V10/774 , G06V10/82 , G06N3/04
CPC classification number: G06V10/776 , G06V10/7747 , G06V10/82 , G06N3/0454
Abstract: A data processing system implements a dynamic knowledge distillation process including dividing training data into a plurality of batches of samples and distilling a student model from a teacher model using iterative knowledge distillation. The process includes instantiating an instance of the teacher model and the student model in a memory of the data processing system and obtaining a respective batch of training data from the plurality of batches of samples in the memory. The process includes training the teacher and student models using each of the samples in the respective batch of the training data, evaluating the performance of the student model compared with the performance of the teacher model, and providing feedback to the student model to adjust the behavior of the student model based on the performance of the student model.
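The batch-wise loop described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the one-feature logistic models, the learning rate, the synthetic data, and the names `LogisticModel`, `make_batches`, and `distill` are all assumptions made for illustration. Here the "feedback" to the student is simply a gradient step toward the teacher's soft predictions, and the teacher-student gap is only recorded, not used to adapt training.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class LogisticModel:
    """Hypothetical one-feature logistic model standing in for teacher or student."""
    def __init__(self):
        self.w = 0.0
        self.b = 0.0

    def predict(self, x):
        return sigmoid(self.w * x + self.b)

    def step(self, xs, targets, lr):
        # One gradient step on mean cross-entropy; targets may be hard labels
        # (teacher) or soft probabilities produced by the teacher (student).
        n = len(xs)
        errors = [self.predict(x) - t for x, t in zip(xs, targets)]
        self.w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        self.b -= lr * sum(errors) / n

def make_batches(samples, batch_size):
    # Divide the training data into a plurality of batches.
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]

def distill(samples, batch_size=16, epochs=50, lr=0.5):
    teacher, student = LogisticModel(), LogisticModel()
    gaps = []
    for _ in range(epochs):
        for batch in make_batches(samples, batch_size):
            xs = [x for x, _ in batch]
            ys = [y for _, y in batch]
            teacher.step(xs, ys, lr)                  # teacher learns from labels
            soft = [teacher.predict(x) for x in xs]   # teacher's soft targets
            student.step(xs, soft, lr)                # feedback: move toward teacher
            # Evaluate the student against the teacher on this batch.
            gap = sum(abs(teacher.predict(x) - student.predict(x)) for x in xs) / len(xs)
            gaps.append(gap)
    return teacher, student, gaps

random.seed(0)
# Synthetic data (an assumption): label is 1 when the feature is positive.
samples = [(x, 1.0 if x > 0 else 0.0) for x in (random.uniform(-1, 1) for _ in range(128))]
teacher, student, gaps = distill(samples)
```

The recorded `gaps` list gives one teacher-student discrepancy per batch. A dynamic scheme in the spirit of the abstract could use that value to adjust the student's updates, which this sketch does not attempt.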
-
Publication No.: US20230161825A1
Publication date: 2023-05-25
Application No.: US17530982
Filing date: 2021-11-19
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ji Li , Amit Srivastava , Adit KRISHNAN , Aman MALIK
IPC: G06F16/953 , G06N20/00
CPC classification number: G06F16/953 , G06N20/00
Abstract: A data processing system implements receiving query text for a search query for textual content recommendation. The query text includes one or more words indicating a type of textual content items being sought. The system implements analyzing the query text using a first machine learning (ML) model to obtain encoded query text, where the first ML model is trained to identify features within the query text and to generate the encoded query text by mapping the features to a hyper-dimensional latent space (HDLS). The system implements identifying one or more content items in a database of encoded content items mapped to the HDLS that satisfy the search query by comparing attributes of the encoded query text with attributes of the encoded content items to identify content items that are closest to the encoded query text within the HDLS, and causing the one or more content items to be displayed.
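The retrieval step in this abstract (encode the query, then return the content items closest to it in the shared latent space) can be sketched as a nearest-neighbor search. This is only an illustration under stated assumptions: the normalized bag-of-words `encode` function is a stand-in for the trained ML encoder, cosine similarity is an assumed closeness measure, and the sample items and all function names are hypothetical.

```python
import math

def build_vocab(texts):
    # Assign each distinct word a dimension in the toy "latent space".
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def encode(text, vocab):
    # Stand-in encoder (assumption): unit-length bag-of-words vector.
    vec = [0.0] * len(vocab)
    for word in text.lower().split():
        if word in vocab:
            vec[vocab[word]] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def cosine(a, b):
    # Inputs are already unit length (or all zeros), so the dot product suffices.
    return sum(x * y for x, y in zip(a, b))

def search(query, items, vocab, encoded_items, k=2):
    # Rank pre-encoded items by similarity to the encoded query; keep the top k.
    q = encode(query, vocab)
    scored = sorted(zip(encoded_items, items),
                    key=lambda pair: cosine(q, pair[0]), reverse=True)
    return [item for _, item in scored[:k]]

items = [
    "quarterly sales report template",
    "birthday party invitation",
    "sales pitch presentation outline",
]
vocab = build_vocab(items)
encoded_items = [encode(t, vocab) for t in items]   # the "database of encoded content items"
results = search("sales report", items, vocab, encoded_items, k=2)
```

Encoding the items once ahead of time and only the query at search time mirrors the abstract's separation between a database of encoded content items and the per-query encoding step.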
-
Publication No.: US20220415366A1
Publication date: 2022-12-29
Application No.: US17868461
Filing date: 2022-07-19
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ji Li
IPC: G11B27/10 , G06F40/134 , G06N20/00 , G10L25/57 , G10L15/22 , G06K9/62 , G06V20/40 , G10L15/26 , G06F16/43 , G06F3/0482 , G06F40/279 , G06V30/413
Abstract: Systems and methods for providing summarization, indexing, and post-processing of a recorded document presentation are provided. The system accesses a structured document and recordings associated with a recorded presentation given using the structured document. The system analyzes, using machine-trained models, the structured document, audio and video recordings, and recording of operations performed during the presentation. The analyzing comprises generating a transcript of the audio recording, determining context of components of the structured document, and deriving context from the video recordings and recording of operations. Based on the analyzing, the system segments the recorded presentation into a plurality of segments and generates an index of the plurality of segments that is used for post-processing.
-
Publication No.: US11423207B1
Publication date: 2022-08-23
Application No.: US17355673
Filing date: 2021-06-23
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ji Li
IPC: G06F40/114 , G06F40/151 , G06F40/186 , G06F40/106 , G06N20/00 , G06K9/62 , G06V10/25 , G06V10/46 , G06V10/75 , G06V30/413 , G06V30/422
Abstract: Systems and methods for providing a machine learning-powered framework to transform overloaded text documents are provided. The system generates a plurality of candidate templates offline. During runtime, the system accesses a text document and analyzes the text document to identify segmentation data. The segmentation data can indicate a plurality of segments derived from the text document. The system then accesses a plurality of candidate templates, whereby each candidate template comprises a plurality of pages having a different background element that shares a common theme. The plurality of candidate templates are ranked based on at least the segmentation data. The network then generates multiple presentation pages for each of a predetermined number of top ranked candidate templates by incorporating each of the plurality of segments into a corresponding page of the plurality of pages for each of the top ranked candidate templates. The multiple presentation pages are presented for each of the top ranked candidate templates as a recommendation.
-
Publication No.: US20200265153A1
Publication date: 2020-08-20
Application No.: US16276908
Filing date: 2019-02-15
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ji Li , Youjun Liu , Amit Srivastava
Abstract: The present disclosure relates to processing operations that execute image classification training for domain-specific traffic, where training operations are entirely compliant with data privacy regulations and policies. Image classification model training, as described herein, is configured to classify meaningful image categories in domain-specific scenarios where there is unknown data traffic and strict data compliance requirements that result in privacy-limited image data sets. Iterative image classification training satisfies data compliance requirements through a combination of online image classification training and offline image classification training. This results in tuned image recognition classifiers that have improved accuracy and efficiency over general image recognition classifiers when working with domain-specific data traffic. One or more image recognition classifiers are independently trained and tuned to detect an image class for image classification. Training of independent image recognition classifiers is also utilized for training and tuning of deeper learning models for image classification.
-