DAIG

Brief Introduction: DAIG (Distributed AI Grid) is a distributed artificial intelligence training system that can run on a variety of devices with different performance and resources. The goal of DAIG is to speed up the training of deep learning models by sharing computing resources such as GPUs or CPUs over a network (even when the machines are not in the same subnet or physical location), either voluntarily or in return for some financial reward. Our research showed that DAIG can reduce training time with only a small gap in the training result. However, when the participating machines differ severely in performance (VRAM, CUDA cores, etc.), DAIG did not produce a remarkable improvement, because of network overhead or the aggregation strategy. More details are given in the paper and at the GitHub links.
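To make the aggregation step more concrete, here is a minimal sketch of one common approach: a sample-weighted average of the parameters returned by each worker, in the spirit of federated averaging. This is not taken from the DAIG code base. The function name `aggregate`, the `Weights` type, and the idea of weighting by processed sample count are all assumptions made for illustration, and the strategy DAIG actually uses may differ.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def aggregate(updates: List[Weights], sample_counts: List[int]) -> Weights:
    """Average worker parameters, weighted by samples each worker processed.

    Assumption for illustration only: every worker returns the same
    parameter names with the same lengths.
    """
    if not updates or len(updates) != len(sample_counts):
        raise ValueError("need one sample count per worker update")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    result: Weights = {}
    for name in updates[0]:
        length = len(updates[0][name])
        # Each element is the weighted mean of that element across workers.
        result[name] = [
            sum(u[name][i] * n for u, n in zip(updates, sample_counts)) / total
            for i in range(length)
        ]
    return result


# A fast worker that processed 300 samples and a slow one that processed 100.
fast = {"layer.w": [1.0, 2.0]}
slow = {"layer.w": [3.0, 6.0]}
merged = aggregate([fast, slow], [300, 100])
print(merged)  # {'layer.w': [1.5, 3.0]}
```

With this kind of weighting, a slow device contributes less to the merged model, but the server still has to wait for it or drop its update, which is one way network overhead and the aggregation strategy can limit the gain on very uneven hardware.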
Main Role:
- Research and development of the distributed learning system.
- Development of the back-end server.
- Development of the front-end client.
Related Links:
- GitHub link (front-end software): https://github.com/ASWCS-Life/DAIG_front
- GitHub link (back-end server): https://github.com/ASWCS-Life/DAIG_back