
I have access to an HPC node with 3 GPUs and a maximum of 38 CPUs. I have a transformer model that I currently run on a single GPU, and I want to utilize all the GPUs and CPUs. I have seen a couple of tutorials on DataParallel and DistributedDataParallel, but they only mention how to use multiple GPUs.

My questions are:

  1. Should I use DataParallel or DistributedDataParallel?
  2. How do I adapt my code to run on the GPUs and CPUs simultaneously? A tutorial link would be appreciated.
  3. How do I get the device IDs?
Fhunmie

1 Answer

  1. I used DistributedDataParallel. According to the PyTorch documentation, DataParallel is usually slower than DistributedDataParallel, so DistributedDataParallel is recommended; it also works for both single- and multi-machine training (see the minimal sketch after this list).

  2. Tutorial: Comparison between DataParallel and DistributedDataParallel

  3. Another tutorial: Multi-GPU Examples

  4. Solution: LITDataScience's answer to "How to find the nvidia GPU IDs for pytorch cuda run setup?" (a short device-ID sketch also follows at the end of this answer).
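
Here is a minimal single-node DistributedDataParallel sketch, assuming one process per GPU. `MyModel` and `MyDataset` are placeholders for your own model and dataset, and the batch size, worker count, and epoch count are only illustrative. The `num_workers` argument of the `DataLoader` is how the spare CPU cores are typically put to work (for data loading) alongside the GPUs:

```python
# Minimal single-node DDP sketch; MyModel/MyDataset are placeholders,
# hyperparameters are illustrative only.
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler

def train(rank, world_size):
    # One process per GPU: rank identifies this process, world_size = number of GPUs.
    os.environ.setdefault("MASTER_ADDR", "localhost")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    model = MyModel().to(rank)                  # placeholder: your transformer
    model = DDP(model, device_ids=[rank])

    dataset = MyDataset()                       # placeholder: your dataset
    sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank)
    # num_workers > 0 uses extra CPU cores in each process for data loading.
    loader = DataLoader(dataset, batch_size=32, sampler=sampler, num_workers=8)

    optimizer = torch.optim.Adam(model.parameters())
    for epoch in range(10):
        sampler.set_epoch(epoch)                # reshuffle shards each epoch
        for x, y in loader:
            x, y = x.to(rank), y.to(rank)
            loss = torch.nn.functional.cross_entropy(model(x), y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    world_size = torch.cuda.device_count()      # 3 on this node
    mp.spawn(train, args=(world_size,), nprocs=world_size)
```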
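
For the device IDs specifically, a short sketch of what PyTorch exposes (assuming CUDA is visible on the node):

```python
# List the CUDA device IDs and names PyTorch can see.
import torch

print(torch.cuda.is_available())             # True if CUDA GPUs are visible
print(torch.cuda.device_count())             # e.g. 3 on this node
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))  # integer device ID and GPU model name
```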

Lynn