Hi all!
I’m building a pharmacokinetic model that’s basically a discretized first-order ODE where c_t = K c_{t-1}.
It works great, but I can’t seem to get the program to 1) meaningfully use the GPU or 2) use more than ~1.5 CPUs. I have to assume the issue is some kind of blocking/non-parallelizable part, but I can’t figure out what that would be.
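To make the recurrence concrete, here is a minimal NumPy sketch of what c_t = K c_{t-1} looks like as a loop. This is not the real model: the `simulate` helper, the 2x2 `K`, and the initial state `c0` are made-up placeholders for illustration only.

```python
import numpy as np

def simulate(K, c0, n_steps):
    """Iterate c_t = K @ c_{t-1}; returns all states, shape (n_steps + 1, n)."""
    c = np.empty((n_steps + 1, c0.shape[0]))
    c[0] = c0
    for t in range(1, n_steps + 1):
        # Step t needs the result of step t-1, so this loop runs sequentially.
        c[t] = K @ c[t - 1]
    return c

# Hypothetical two-compartment transfer matrix and initial concentrations.
K = np.array([[0.9, 0.0],
              [0.1, 0.8]])
c0 = np.array([1.0, 0.0])
traj = simulate(K, c0, n_steps=3)
```

The loop over t is the part I suspect can't be parallelized, since each step depends on the one before it.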
A stripped-down version of the model is here. (Inputs: Y.npy, M.npy.)
I have already tried:
- Making it explicitly an ODE. (This made it way slower.)
- Using TensorBoard to try to figure out what was slow. The output was super vague, just saying it was sleeping or blocking on some kind of I/O. (I unfortunately lost the results of that.)
- Getting rid of some of the data I’m storing (like the
Does anyone have any recommendations?
Cheers!