Minjie Wang's Homepage

About Me

I have joined Amazon AI Lab Shanghai as an applied scientist. I will continue to work on:

Building scalable distributed systems for machine learning algorithms.
Researching new machine learning models for enormous graph data.
Engaging with real customers with machine learning solutions.
Maintaining open source projects and communities.

I am actively hiring interns, so if you are interested, please email your resume to me.

I obtained my Ph.D. in Computer Science from NYU in 2020, advised by Professor Jinyang Li. My research interests lie at the intersection of machine learning and systems, including system design for large-scale machine learning and applying machine learning techniques to systems challenges. Before joining NYU, I received my M.S. and B.S. degrees from Shanghai Jiao Tong University, advised by Prof. Minyi Guo. During my master's studies, I spent two years at Microsoft Research Asia, advised by Zheng Zhang, where I began my journey in machine learning systems. I love contributing to open source projects and co-founded DMLC -- a lovely community of brilliant open source hackers.

Projects

Deep Graph Library

A library for deep learning on graphs

New

Graph Neural Networks (GNNs) are a recent breakthrough in machine learning on graphs and have achieved significant improvements on many conventional graph-related problems in domains such as social networks, chemical molecules and recommender systems. Deep Graph Library (DGL) is a new package specialized for deep learning on graphs, built atop existing deep learning frameworks (e.g. PyTorch/MXNet). For more details, please visit:

Please also follow our Twitter account for more updates.

Auto-parallelization of deep neural network training

Training deep neural networks can easily take days or weeks even with the latest GPUs. To accelerate DNN training, this line of work tries to partition the workload across multiple devices automatically. We designed a system called Tofu that uses a high-level description language and a heuristic search algorithm to support training very large DNN models on multiple devices. See our paper for more details (accepted by EuroSys 2019):

Supporting Very Large Models using Automatic Dataflow Graph Partitioning

Apache MXNet

MXNet is the first serious project in the DMLC community. It is a tensor-based framework for deep learning models, combining many earlier efforts such as my Minerva project (the M), CXXNet (the X), and ps-lite. As one of the project co-founders, I led the team working on the dataflow engine. MXNet is Amazon's deep learning framework of choice. It has also been accepted as an incubating project by the Apache Software Foundation. For more details, please:

Visit the project website: https://mxnet.apache.org
Read our paper (accepted by NIPS LearningSys Workshop 2015)

MinPy: A numpy interface with mixed backend

In 2016, we started to build MinPy -- a deep learning framework with a NumPy-compatible interface, GPU support and dynamic auto-differentiation. MinPy stands for "Mixed NumPy", because the system automatically chooses between NumPy and MXNet operators for best efficiency. MinPy was invited to NVIDIA GTC 2016 and 2017 for the instructor-led tutorial and poster session.

Github repository: https://github.com/dmlc/minpy

Minerva: A scalable and efficient training platform for deep learning

Minerva was my very first attempt at building a deep learning system. It was built during my internship at Microsoft Research Asia, advised by Zheng Zhang. It was mentioned in NVIDIA's keynote talk at CES 2016 as an example of "the engine of modern AI".

Github repository: https://github.com/dmlc/minerva
Read our paper (accepted by NIPS'14 workshop).

Publications

Supporting Very Large Models using Automatic Dataflow Graph Partitioning
Minjie Wang, Chien-chin Huang, Jinyang Li. EuroSys'19

Unifying Data, Model and Hybrid Parallelism in Deep Learning via Tensor Tiling
Minjie Wang, Chien-chin Huang, Jinyang Li. arXiv

MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems
Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, Zheng Zhang. NIPS'15, LearningSys Workshop (authors are in alphabetical order)

Minerva: A scalable and highly efficient training platform for deep learning
Minjie Wang, Tianjun Xiao, Jianpeng Li, Jiaxing Zhang, Chuntao Hong, Zheng Zhang. NIPS'14 workshop, Distributed Machine Learning and Matrix Computations

A scalable and topology configurable protocol for distributed parameter synchronization
Minjie Wang, Hucheng Zhou, Minyi Guo, Zheng Zhang. APSys'14

Impression Store: Compressive Sensing-based Storage for Big Data Analytics
Jiaxing Zhang, Ying Yan, Liang Jeff Chen, Minjie Wang, Thomas Moscibroda, Zheng Zhang. HotCloud'14

Work Experience

Applied Scientist Intern - Amazon (2018 - Present)

Collaborate with the Amazon team on the DGL project.

Software Development Intern - Google (self-driving car) (Summer 2016)

Improved pedestrian/vehicle detection for self-driving cars using Faster R-CNN. Developed a new Region Proposal Network for predicting vehicle bounding boxes with heading.

Research Intern - Microsoft Research Asia (2012 - 2014)

Studied distributed systems and built my first deep learning system -- Minerva.

Awards

Dean's Dissertation Fellowship (2018)

Up to two recipients for each department per year.

Jacob T. Schwartz Ph.D. Fellowship (2017)

Awarded for outstanding performance as an NYU Ph.D. student. Curious about who Jacob Schwartz is? Here is the link. He was the founder of the NYU CS Department.

NVIDIA Graduate Fellowship (2016)

Awarded for my contribution to open source deep learning frameworks. Find out more

Minjie Wang

Applied Scientist, Amazon AI Lab, Shanghai