PyTorch - Using more GPUs and increasing batch size makes training slower in DistributedDataParallel
I am trying to implement StyleGAN2. My code works well when I train on a single GPU. I would like to speed up training by utilizing 8 GPUs with DistributedDataParallel....
wanghinc
Votes: 0
Answers: 0
Staging environment for distributed system
I am working on a distributed system that consists of several core components that help build data pipelines, which are used to run data processing jobs on large numbers of files. In order to run the j...
M.Alsioufi
Votes: 0
Answers: 1
Cassandra counter - consistency
I have a question about counters. I have searched the web but have not been able to find an answer.
Let's suppose we're working with a COUNTER across multiple datacenters, and let's suppose we're working usin...
Carlo De Vita
Votes: 0
Answers: 0
Distributed system design for quota on API
I am designing an API which can be hit only a defined number of times based on the subscription plan. Below are the plans per account:
10M hits per year - $100
100M hits per year - $300
1G hits per ye...
Vikas Adyar
Votes: 0
Answers: 1