I am thinking about buying a new GPU, primarily for training different Pyro models.
For standard deep learning there is a blog post about which graphics card to get: https://timdettmers.com/2019/04/03/which-gpu-for-deep-learning/.
Under ‘What Makes One GPU Faster Than Another?’, the author states that for deep learning we should prioritize GPU features according to the kind of network we are going to use:
- Convolutional networks and Transformers: Tensor Cores; FLOPs; Memory Bandwidth; 16-bit capability
- Recurrent networks: Memory Bandwidth; 16-bit capability; Tensor Cores; FLOPs
Is there a general rule or recommendation for Pyro? What is the bottleneck for message passing, and for inference algorithms such as MCMC or SVI?
In which cases is memory a limiting factor?
I’ve noticed that most of the examples don’t consume much memory, except for Gaussian processes / GPyTorch.
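
To make the Gaussian process case concrete, here is a rough back-of-the-envelope sketch. It assumes the dominant cost is one dense n x n kernel matrix; that is my assumption, the function name is made up for illustration, and it ignores Cholesky factors, gradients, and any memory-saving tricks the library may apply.

```python
def kernel_matrix_bytes(n_points: int, bytes_per_element: int = 4) -> int:
    """Bytes needed to store one dense n x n kernel matrix.

    Hypothetical helper for illustration only; 4 bytes per element
    corresponds to float32, 8 bytes to float64.
    """
    return n_points * n_points * bytes_per_element


# Memory grows quadratically with the number of training points.
for n in (1_000, 10_000, 50_000):
    gb = kernel_matrix_bytes(n) / 1e9
    print(f"n = {n:>6}: {gb:.3f} GB per float32 kernel matrix")
```

Under that assumption, 50,000 points in float32 already need about 10 GB for a single matrix, which would explain why memory becomes a limit for exact GPs well before it does for the other examples.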