# Kill every process whose command line contains XXX.py (XXX.py is a placeholder script name)
kill $(ps aux | grep 'XXX\.py' | grep -v grep | awk '{print $2}')
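
The one-liner above can be wrapped in a small function. This is a minimal sketch, assuming `pgrep` (from procps or the BSD equivalent) is installed; `kill_matching` is a hypothetical helper name, not part of any tool. `pgrep -f` matches an extended regex against the full command line and never lists itself, so the `grep -v grep` step is unnecessary, and the function returns 1 instead of calling `kill` with no arguments when nothing matches.

```shell
#!/usr/bin/env bash
# kill_matching is a hypothetical helper name used for illustration.
# Usage: kill_matching PATTERN [SIGNAL]   (SIGNAL defaults to TERM)
kill_matching() {
    local pattern="$1"
    local signal="${2:-TERM}"
    local pid
    local found=1
    # pgrep -f matches PATTERN (an extended regex) against each full
    # command line and prints one matching PID per line.
    for pid in $(pgrep -f -- "$pattern"); do
        # Skip the calling shell in case its own command line matches.
        [ "$pid" -eq "$$" ] && continue
        # Count the match only if the signal was actually delivered.
        kill -s "$signal" "$pid" 2>/dev/null && found=0
    done
    [ "$found" -eq 0 ] || echo "no process matched: $pattern" >&2
    return "$found"
}
```

Typical use would be `kill_matching 'XXX\.py'`, or `kill_matching 'XXX\.py' KILL` for processes that ignore SIGTERM. The dot is escaped because the pattern is a regex.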
https://leimao.github.io/blog/PyTorch-Distributed-Training