Training networks in parallel on multiple GPUs

I need to average the network parameters after each epoch (federated learning). I tried parallelizing the training loop with parallel for loops from a Python library, but the param store doesn't seem to get updated globally. Any help would be highly appreciated. Thanks!
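
For concreteness, here is a minimal sketch of the pattern I'm after, in plain PyTorch rather than my actual code (`Net`, `train_one_epoch`, and `federated_average` are placeholder names). My understanding is that in-place parameter updates inside a child process never reach the parent's copy, so each worker would have to return its trained `state_dict` explicitly and the parent would average them:

```python
import copy
import torch
import torch.nn as nn
import torch.multiprocessing as mp

# Hypothetical toy model standing in for the real network
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 2)

    def forward(self, x):
        return self.fc(x)

def train_one_epoch(rank, global_state, result_queue):
    # Each worker builds its own local copy from the shared global weights
    torch.manual_seed(rank)
    model = Net()
    model.load_state_dict(global_state)
    device = torch.device(f"cuda:{rank}" if torch.cuda.is_available() else "cpu")
    model.to(device)
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(100):  # stand-in for one real local epoch
        x = torch.randn(32, 10, device=device)
        y = torch.randint(0, 2, (32,), device=device)
        loss = nn.functional.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Send the trained weights back explicitly; updates made inside a
    # child process do not propagate to the parent's parameters.
    result_queue.put({k: v.cpu() for k, v in model.state_dict().items()})

def federated_average(states):
    # Element-wise mean of the workers' parameter tensors
    avg = copy.deepcopy(states[0])
    for key in avg:
        avg[key] = torch.stack([s[key].float() for s in states]).mean(dim=0)
    return avg

if __name__ == "__main__":
    mp.set_start_method("spawn", force=True)  # required for CUDA in subprocesses
    n_workers = 2
    global_state = Net().state_dict()
    for epoch in range(3):
        queue = mp.Queue()
        procs = [mp.Process(target=train_one_epoch, args=(rank, global_state, queue))
                 for rank in range(n_workers)]
        for p in procs:
            p.start()
        # Drain the queue before joining to avoid blocking on large tensors
        states = [queue.get() for _ in range(n_workers)]
        for p in procs:
            p.join()
        global_state = federated_average(states)
```

Is something along these lines the right approach, or is there a way to make the param store itself shared across processes?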