Shigang Li (李士刚)
Postdoctoral Researcher, SPCL, Department of Computer Science, ETH Zurich

Research interests:   parallel and distributed computing,   parallel and distributed deep learning,   machine learning systems
Emails:   shigangli.cs@gmail.com;   shigang.li@inf.ethz.ch
Links:     [Google Scholar]   [ResearchGate]   [ORCID]

Brief Biography

News

Publications

  • [PPoPP'2020]   Shigang Li, Tal Ben-Nun, Salvatore Di Girolamo, Dan Alistarh, and Torsten Hoefler. Taming unbalanced training workloads in deep learning with partial collective operations. In Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 45-61. 2020. (Acceptance rate: 23%, 28/121; best paper nomination, 5/28) [PDF][Slides]
  • [IPDPS'2020]   Hang Cao, Liang Yuan, He Zhang, Baodong Wu, Shigang Li, Pengqi Lu, Yunquan Zhang, Yongjun Xu, and Minghua Zhang. A Highly Efficient Dynamical Core of Atmospheric General Circulation Model based on Leap-Format. In 2020 IEEE International Parallel and Distributed Processing Symposium, pp. 95-104. IEEE, 2020.
  • [arXiv]   Peter Grönquist, Tal Ben-Nun, Nikoli Dryden, Peter Dueben, Luca Lavarini, Shigang Li, and Torsten Hoefler. Predicting weather uncertainty with deep convnets. arXiv preprint arXiv:1911.00630 (2019).
  • [SC'2019]   Kun Li, Honghui Shang, Yunquan Zhang, Shigang Li, Baodong Wu, Dong Wang, Libo Zhang, Fang Li, Dexun Chen, and Zhiqiang Wei. OpenKMC: a KMC design for hundred-billion-atom simulation using millions of cores on Sunway TaihuLight. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, p. 68. ACM, 2019. (Acceptance rate: 22.7%, 78/344)
  • [ICTAI'2019]   Daning Cheng, Hanping Zhang, Fen Xia, Shigang Li, and Yunquan Zhang. Using Gradient based multikernel Gaussian Process and Meta-acquisition function to Accelerate SMBO. In 2019 IEEE 31st International Conference on Tools with Artificial Intelligence, pp. 440-447. IEEE, 2019.
  • [JSUPERCOMPUT'2019]   Kun Li, Shigang Li*, Shan Huang, Yifeng Chen, and Yunquan Zhang. FastNBL: fast neighbor lists establishment for molecular dynamics simulation based on bitwise operations. The Journal of Supercomputing (2019): 1-20.
  • [ISPA'2019]   Kun Li, Shigang Li, Bei Wang, Yifeng Chen, and Yunquan Zhang. swMD: Performance Optimizations for Molecular Dynamics Simulation on Sunway TaihuLight. In 2019 IEEE International Symposium on Parallel & Distributed Processing with Applications, pp. 511-518. IEEE, 2019.
  • [ICPADS'2018]   Baodong Wu, Shigang Li*, Hang Cao, Yunquan Zhang, He Zhang, Junmin Xiao, and Minghua Zhang. AGCM3D: A Highly Scalable Finite-Difference Dynamical Core of Atmospheric General Circulation Model Based on 3D Decomposition. In 2018 IEEE 24th International Conference on Parallel and Distributed Systems, pp. 355-364. IEEE, 2018. (Corresponding Author) [Slides]
  • [JPDC'2018]   Zhihao Li, Haipeng Jia, Yunquan Zhang, Shice Liu, Shigang Li, Xiao Wang, and Hao Zhang. Efficient parallel optimizations of a high-performance SIFT on GPUs. Journal of Parallel and Distributed Computing 124 (2019): 78-91.
  • [TPDS'2018]   Shigang Li, Yunquan Zhang, and Torsten Hoefler. Cache-oblivious MPI all-to-all communications based on Morton order. IEEE Transactions on Parallel and Distributed Systems 29, no. 3 (2017): 542-555. [PDF][Slides]
  • [ICPP'2018]   Shigang Li, Baodong Wu, Yunquan Zhang, Xianmeng Wang, Jianjiang Li, Changjun Hu, Jue Wang, Yangde Feng, and Ningming Nie. Massively scaling the metal microscopic damage simulation on Sunway TaihuLight supercomputer. In Proceedings of the 47th International Conference on Parallel Processing, pp. 1-11. 2018. [PDF][Slides]
  • [ICPP'2018]   Junmin Xiao, Shigang Li, Baodong Wu, He Zhang, Kun Li, Erlin Yao, Yunquan Zhang, and Guangming Tan. Communication-avoiding for dynamical core of atmospheric general circulation model. In Proceedings of the 47th International Conference on Parallel Processing, pp. 1-10. 2018.
  • [PPoPP'2017]   Shigang Li, Yunquan Zhang, and Torsten Hoefler. Cache-oblivious MPI all-to-all communications on many-core architectures. Poster, ACM SIGPLAN Notices 52, no. 8 (2017): 445-446.
  • [CPC'2017]   Changjun Hu, Xianmeng Wang, Jianjiang Li, Xinfu He, Shigang Li, Yangde Feng, Shaofeng Yang, and He Bai. Kernel optimization for short-range molecular dynamics. Computer Physics Communications 211 (2017): 31-40.
  • [CPC'2017]   Baodong Wu, Shigang Li*, Yunquan Zhang, and Ningming Nie. Hybrid-optimization strategy for the communication of large-scale Kinetic Monte Carlo simulation. Computer Physics Communications 211 (2017): 113-123. (Corresponding Author)
  • [TACO'2016]   Yunquan Zhang, Shigang Li*, Shengen Yan*, and Huiyang Zhou. A cross-platform SpMV framework on many-core architectures. ACM Transactions on Architecture and Code Optimization (TACO) 13, no. 4 (2016): 1-25. (Corresponding Author) [PDF]
  • [PIEEE'2016]   Yunquan Zhang, Ting Cao, Shigang Li, Xinhui Tian, Liang Yuan, Haipeng Jia, and Athanasios V. Vasilakos. Parallel processing systems for big data: a survey. Proceedings of the IEEE 104, no. 11 (2016): 2114-2136.
  • [SCI CHINA INFORM SCI'2015]   Shigang Li, ChangJun Hu, JunChao Zhang, and YunQuan Zhang. Automatic tuning of sparse matrix-vector multiplication on multicore clusters. Science China Information Sciences 58, no. 9 (2015): 1-14.
  • [HPCC'2015]   Shigang Li, Yunquan Zhang, Chunyang Xiang, and Lei Shi. Fast convolution operations on many-core architectures. In 2015 IEEE 17th International Conference on High Performance Computing and Communications, pp. 316-323. IEEE, 2015. [Slides]
  • [CCGrid'2015]   Xiaomin Zhu, Junchao Zhang, Kazutomo Yoshii, Shigang Li, Yunquan Zhang, and Pavan Balaji. Analyzing MPI-3.0 process-level shared memory: A case study with stencil computations. In 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Workshop, pp. 1099-1106. IEEE, 2015.
  • [CLUSTER COMPUT'2014]   Shigang Li, Torsten Hoefler, Chungjin Hu, and Marc Snir. Improved MPI collectives for MPI processes in shared address spaces. Cluster Computing 17, no. 4 (2014): 1139-1155.
  • [HPDC'2013]   Shigang Li, Torsten Hoefler, and Marc Snir. NUMA-aware shared-memory collective communication for MPI. In Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, pp. 85-96. 2013. (Acceptance rate: 15%, 20/131; best paper nomination, 3/20)
  • [PDP'2013]   Shigang Li, Jingyuan Hu, Xin Cheng, and Chongchong Zhao. Asynchronous work stealing on distributed memory systems. In 2013 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, pp. 198-202. IEEE, 2013.
  • [ICA3PP'2011]   Shigang Li, Shucai Yao, Haohu He, Lili Sun, Yi Chen, and Yunfeng Peng. Extending synchronization constructs in OpenMP to exploit pipeline parallelism on heterogeneous multi-core. In International Conference on Algorithms and Architectures for Parallel Processing, Workshop, pp. 54-63. Springer, Berlin, Heidelberg, 2011.
  • [ICCS'2011]   Yunfeng Peng, Changjun Hu, Chongchong Zhao, Shigang Li, and Shucai Yao. Management of Non-functional Attributes of Parallel Components. Procedia Computer Science 4 (2011): 461-470.
  • [ICA3PP'2010]   Qian Cao, Changjun Hu, Haohu He, Xiang Huang, and Shigang Li. Support for OpenMP tasks on cell architecture. In International Conference on Algorithms and Architectures for Parallel Processing, Workshop, pp. 308-317. Springer, Berlin, Heidelberg, 2010.

Projects

  • Project Leader — MPI Model Extension and Performance Optimization for Many-Core Clusters, National Natural Science Foundation of China
  • Project Leader — MPI Communication Optimization for Irregular Parallel Algorithms, State Key Laboratory of Computer Architecture Foundation
  • Technical Principal — High Performance Deep Learning Library Development on CPU and GPU Architectures, IT Company Foundation
  • Technical Principal — Large-Scale Deep Learning Training System on Heterogeneous Parallel Machines, IT Company Foundation

Teaching

Services

  • Associate editor of Cluster Computing (CLUS) - Springer
  • Reviewer of IEEE Transactions on Parallel and Distributed Systems (TPDS)
  • Reviewer of IEEE Transactions on Services Computing (TSC)
  • Reviewer of IEEE Transactions on Big Data (TBD)
  • Reviewer of Journal of Parallel and Distributed Computing (JPDC) - Elsevier
  • Reviewer of Journal of Supercomputing - Springer
  • Reviewer of Concurrency and Computation: Practice and Experience
  • Reviewer of IEEE Transactions on Circuits and Systems II: Express Briefs
  • Reviewer of Mobile Networks and Applications - Springer
  • Publications Chair, IISWC 2020
  • Workshop Co-chair, ICS 2018
  • Program Committee Member, IEEE Cluster 2021
  • Program Committee Member, ICPADS 2018
  • Program Committee Member, IPDPS 2017, 2018
  • Program Committee Member, HPC Asia 2018, 2019, 2020, 2021
  • Program Committee Member, ICPP 2017
  • Program Committee Member, HPC China 2016, 2017, 2018, 2019
  • Program Committee Member, SBAC-PAD 2016, 2020
  • Program Committee Member, HP3C 2018, 2019, 2020
  • Program Committee Member, INFOCOMP 2020