TRAINING NEURAL NETWORKS BASED ON DUAL PIPELINE ARCHITECTURES

    Publication Number: US20220138524A1

    Publication Date: 2022-05-05

    Application Number: US17151007

    Application Date: 2021-01-15

    Abstract: Embodiments of the present disclosure include systems and methods for training neural networks based on dual pipeline architectures. In some embodiments, a first set of compute elements is configured to implement a first set of layers of a first instance of a neural network. A second set of compute elements is configured to implement a second set of layers of the first instance of the neural network. The second set of compute elements is further configured to implement a first set of layers of a second instance of the neural network. The first set of compute elements is further configured to implement a second set of layers of the second instance of the neural network. The first set of layers of the first instance of the neural network and the first set of layers of the second instance of the neural network are each configured to receive training data.
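
    To make the mirrored layer placement concrete, here is a minimal sketch in plain Python, assuming a toy eight-layer model split in half: one device group hosts the front half of instance A and the back half of instance B, the other group hosts the complementary halves, and both instances ingest training data at their own first stage. The layer count, toy layer functions, and group names are illustrative assumptions, not the patented implementation.

```python
# Minimal sketch of the dual-pipeline layer assignment described in the
# abstract. The layer computation and group names are illustrative only.

def make_layers(n):
    # Toy layers: each layer just adds its index to the activation.
    return [lambda x, i=i: x + i for i in range(n)]

NUM_LAYERS = 8
SPLIT = NUM_LAYERS // 2

instance_a = make_layers(NUM_LAYERS)
instance_b = make_layers(NUM_LAYERS)

# Mirrored placement: group 1 hosts the first half of instance A and the
# second half of instance B; group 2 hosts the complementary halves.
group1 = {"A_front": instance_a[:SPLIT], "B_back": instance_b[SPLIT:]}
group2 = {"A_back": instance_a[SPLIT:], "B_front": instance_b[:SPLIT]}

def run_stage(layers, x):
    for layer in layers:
        x = layer(x)
    return x

def dual_pipeline_forward(batch_a, batch_b):
    # Both instances receive training data at their own first set of layers,
    # which live on opposite device groups.
    a = run_stage(group1["A_front"], batch_a)   # instance A, stage 1 (group 1)
    b = run_stage(group2["B_front"], batch_b)   # instance B, stage 1 (group 2)
    a = run_stage(group2["A_back"], a)          # instance A, stage 2 (group 2)
    b = run_stage(group1["B_back"], b)          # instance B, stage 2 (group 1)
    return a, b

print(dual_pipeline_forward(0.0, 0.0))
```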

    INTEGRATED HARDWARE ARCHITECTURE AND DISTRIBUTION STRATEGY OPTIMIZATION FOR DEEP LEARNING MODELS

    Publication Number: US20250061533A1

    Publication Date: 2025-02-20

    Application Number: US18452162

    Application Date: 2023-08-18

    Abstract: A training optimization system implements algorithmic solutions to solve the conjoined problem of accelerator architecture search and model partitioning for distributed training. The system makes the multi-dimensional optimization space of architecture search and device placement tractable by reducing the number of accelerator architectures explored through area-based heuristics and employing a novel integer linear program (ILP), the size of which is dependent only on the number of operators. The ILP scheduling optimization also explores the partitioning of operators across cores, known as intra-operator parallelism. Despite the vast space, the ILP described herein requires significantly less time to perform the optimizations across all explored accelerator configurations. Based on the optimal backward and forward pass latencies, the system leverages a novel dynamic programming (DP) approach to determine the device placement and model partitioning scheme.
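
    The dynamic-programming step can be illustrated with a small sketch that partitions a linear chain of operators into contiguous stages, one per accelerator, so that the slowest stage is as fast as possible. The per-operator latencies, the device count, and the bottleneck objective below are hypothetical stand-ins for the backward and forward pass estimates the ILP would supply.

```python
# A minimal sketch, under assumptions, of a dynamic-programming pass that
# splits a linear chain of operators into contiguous stages (one per
# accelerator) so the slowest stage is as fast as possible.
import functools

op_latency = [4.0, 2.0, 7.0, 1.0, 3.0, 5.0, 2.0]  # hypothetical per-op cost
NUM_DEVICES = 3

prefix = [0.0]
for t in op_latency:
    prefix.append(prefix[-1] + t)

def stage_cost(i, j):
    # Total latency of operators i..j-1 placed on a single device.
    return prefix[j] - prefix[i]

@functools.lru_cache(maxsize=None)
def best_bottleneck(i, devices_left):
    # Minimal achievable bottleneck latency for operators i.. on the
    # remaining devices.
    n = len(op_latency)
    if devices_left == 1:
        return stage_cost(i, n)
    best = float("inf")
    # Leave at least one operator for each remaining device.
    for j in range(i + 1, n - devices_left + 2):
        best = min(best,
                   max(stage_cost(i, j), best_bottleneck(j, devices_left - 1)))
    return best

print("min bottleneck latency:", best_bottleneck(0, NUM_DEVICES))
```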

    MITIGATING COMMUNICATION BOTTLENECKS DURING PARAMETER EXCHANGE IN DATA-PARALLEL DNN TRAINING

    Publication Number: US20200160171A1

    Publication Date: 2020-05-21

    Application Number: US16276250

    Application Date: 2019-02-14

    Abstract: Technologies are disclosed herein for dynamically generating communication primitives for use in model parameter synchronization during data-parallel DNN training by packing directed spanning trees. An interconnect topology for communication between GPUs in a computing system is determined. A set of directed spanning trees is generated and packed for transmitting data between the GPUs using the interconnect topology. The directed spanning trees define the connections between GPUs that are to be utilized for the transmission and the amount of data to be transmitted on each connection. Program code is generated for implementing the data transfer defined by the directed spanning trees. When the program code is executed, the directed spanning trees are used to pipeline the transmission of chunks of data, such as model parameters used during data-parallel DNN training, between the GPUs. The program code can also determine an optimal chunk size for data to be transferred between the GPUs.
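
    As a rough illustration of the tree-packing idea, the sketch below builds two spanning trees (a fan-out tree and a hop-by-hop chain) over a hypothetical fully connected 4-GPU topology and assigns parameter chunks to them round-robin. The topology, tree construction, and chunk schedule are assumptions for illustration, not the disclosed packing algorithm or its generated program code.

```python
# A minimal sketch, under simplifying assumptions, of splitting broadcast
# traffic across multiple directed spanning trees of a GPU interconnect.
from collections import deque

# Hypothetical fully connected 4-GPU topology (directed adjacency lists).
topology = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}

def bfs_tree(root):
    """Breadth-first spanning tree: the root fans data out directly."""
    visited, edges, queue = {root}, [], deque([root])
    while queue:
        u = queue.popleft()
        for v in topology[u]:
            if v not in visited:
                visited.add(v)
                edges.append((u, v))
                queue.append(v)
    return edges

def dfs_tree(root):
    """Depth-first spanning tree: data is forwarded hop by hop."""
    visited, edges = {root}, []
    def visit(u):
        for v in topology[u]:
            if v not in visited:
                visited.add(v)
                edges.append((u, v))
                visit(v)
    visit(root)
    return edges

# Two trees that route traffic over largely different links.
trees = [bfs_tree(0), dfs_tree(0)]

# Pipeline the parameter buffer as fixed-size chunks, assigning each chunk
# to a tree round-robin so transfers on the two trees can overlap.
chunks = [f"params_chunk_{i}" for i in range(6)]
for i, chunk in enumerate(chunks):
    print(chunk, "->", trees[i % len(trees)])
```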

    AUTOMATIC LATENCY OPTIMIZATION FOR CPU-BASED DNN SERVING

    Publication Number: US20250060998A1

    Publication Date: 2025-02-20

    Application Number: US18452326

    Application Date: 2023-08-18

    Abstract: Systems and methods for optimizing thread allocation in a model serving system include estimating a batch size for inference requests. An optimal configuration is then determined that defines the number of inference instances, the number of threads per inference instance, and the sub-batch size per inference instance for processing a batch of inference requests of the estimated batch size using intra-operator parallelism, such that average per-batch latency is minimized. The optimal configuration is determined with reference to a plurality of predetermined model profiles that define single-inference average batch latencies for different combinations of thread counts and batch sizes; the predetermined model profiles are used as input to a dynamic programming algorithm that identifies configurations that minimize the average per-batch latency.
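
    A simplified version of the configuration search might look like the sketch below, which scans candidate combinations of instance count, threads per instance, and sub-batch size against a hypothetical profile table. The thread budget, profile latencies, and brute-force scan are invented for illustration; the described system instead feeds such profiles into a dynamic programming algorithm.

```python
# A minimal sketch, under assumptions, of choosing (instances, threads per
# instance, sub-batch size) from profiled latencies. The profile table and
# thread budget below are hypothetical.
import math

TOTAL_THREADS = 8
BATCH_SIZE = 16

# Hypothetical profile: (threads, batch) -> single-inference avg latency (ms).
profile = {
    (1, 4): 40.0, (1, 8): 75.0, (1, 16): 150.0,
    (2, 4): 22.0, (2, 8): 42.0, (2, 16): 80.0,
    (4, 4): 13.0, (4, 8): 24.0, (4, 16): 46.0,
    (8, 4):  9.0, (8, 8): 16.0, (8, 16): 30.0,
}

best = None
for instances in range(1, TOTAL_THREADS + 1):
    threads = TOTAL_THREADS // instances           # threads per instance
    sub_batch = math.ceil(BATCH_SIZE / instances)  # sub-batch per instance
    latency = profile.get((threads, sub_batch))
    if latency is None:
        continue  # no profile entry for this configuration
    # Instances run concurrently on disjoint cores, so per-batch latency is
    # approximately the latency of one instance on its sub-batch.
    if best is None or latency < best[0]:
        best = (latency, instances, threads, sub_batch)

print("best config (latency ms, instances, threads, sub-batch):", best)
```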

    SELECTIVE DATA STRUCTURE ENCODING FOR DEEP NEURAL NETWORK TRAINING

    Publication Number: US20220414457A1

    Publication Date: 2022-12-29

    Application Number: US17362751

    Application Date: 2021-06-29

    Abstract: Methods, systems, apparatuses, and computer-readable storage mediums described herein are directed to techniques for efficient data encoding for neural network training. In particular, the embodiments described herein train a DNN based on a selective encoding (e.g., compressing) of data structures that are generated during training. For example, multiple training sessions may be performed where, in each training session, a different set of data structures generated by various operators of the DNN is encoded. Memory allocation information generated based on each training session is analyzed to determine which combination of encoded data structures results in a reduction of the memory required to train the DNN.
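
    The selection loop can be sketched as trying each combination of encoded data structures and keeping the one with the smallest memory footprint. The structure names, sizes, compression ratios, fixed encoder overhead, and exhaustive subset search below are illustrative assumptions standing in for the memory allocation analysis performed across real training sessions.

```python
# A minimal sketch, under assumptions, of the selection loop: try encoding
# different subsets of intermediate data structures, estimate the memory each
# trial would need, and keep the cheapest combination.
from itertools import combinations

# Hypothetical intermediate structures (MB) and per-structure encoded ratios.
structures = {"conv1_act": 512, "conv2_act": 768, "relu_mask": 128, "fc_act": 256}
encode_ratio = {"conv1_act": 0.30, "conv2_act": 0.25, "relu_mask": 0.05, "fc_act": 0.50}
ENCODE_OVERHEAD_MB = 16  # assumed fixed cost of encoding one structure

def session_memory(encoded_set):
    """Training memory (MB) if the given structures are stored encoded."""
    total = 0.0
    for name, size in structures.items():
        if name in encoded_set:
            total += size * encode_ratio[name] + ENCODE_OVERHEAD_MB
        else:
            total += size
    return total

best_combo, best_mem = None, float("inf")
names = list(structures)
for k in range(len(names) + 1):
    for combo in combinations(names, k):
        mem = session_memory(set(combo))
        if mem < best_mem:
            best_combo, best_mem = combo, mem

print(f"best encoding choice: {best_combo} -> {best_mem:.1f} MB")
```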
