1 year ago
#351077
Arnd
How to avoid Celery tasks interfering with each other?
Scenario
I have a Celery task which can be triggered via API. The task involves
- deleting all of a user's old database entries,
- calculating results, and
- storing the results as new database entries.
In the development environment this works well: tasks are processed one at a time and the database is updated continuously, even when the task is triggered multiple times.
In production, however, triggering the task multiple times leads to errors that essentially say the expected database entries do not exist.
Code
The Django API looks something like this:
class MyApi(APIView):
    def post(self, request, format=None):
        # delete all entries associated with the user
        user = User.objects.get(username=request.user)
        old_entries = Entry.objects.filter(user=user)
        old_entries.delete()
        # create new entries using a Celery task
        # (.delay() is called on the task itself, not on its return value)
        tasks.create_entries.delay(user)
        # return a success message
        return HttpResponse("Success")
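Note that the deletion (in the view) and the creation (in the task) are split across two places, so they are not atomic. One way to narrow that window is to let the task itself replace a user's entries in a single step. A minimal sketch of that idea, using a plain dict as a stand-in for the Entry table (the name `replace_entries` is hypothetical, not part of the code above):

```python
# Sketch: swap a user's old entries for new ones in one step,
# instead of deleting in the view and recreating in a task.
# A dict keyed by username stands in for the Entry table.

def replace_entries(db, username, results):
    """Delete the user's old entries and store the fresh ones
    in a single assignment, so a run never observes a half-state."""
    db[username] = list(results)
    return db[username]

db = {}
replace_entries(db, "some_user", [1, 2, 3])
replace_entries(db, "some_user", [4, 5])  # a second run simply overwrites
```

In the real code this would mean moving the `old_entries.delete()` into the task (ideally inside a transaction), so each task run is self-contained.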
And the task does something like this:
# tasks.py
@shared_task
def create_entries(user):
    # assume some smart calculations happen here and yield multiple results
    results = calculate_smart_stuff(user)
    # create an Entry instance for each result
    for result in results:
        new_entry = Entry(user=user, data=result)
        new_entry.save()
    # return the results (they must be JSON-serializable for the result backend)
    return results
The celery worker is started automatically in a container like so:
celery -A core worker
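If parallelism does turn out to be the culprit, one quick check (assuming the default prefork pool, which spawns one process per CPU core) is to pin the production worker to a single process with the `--concurrency` flag, mirroring the single-worker behaviour on Windows. This is a diagnostic, not a fix for the underlying race:

```shell
# run the worker with exactly one process, so tasks are
# processed strictly one at a time (at the cost of throughput)
celery -A core worker --concurrency=1
```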
Hypothesis
I figure the main differences between the local setup and the server are
- Calculation / Communication speed
- Number of workers (locally limited to 1 worker, because I am using Windows)
So I think the problem is that the workers are processing tasks in parallel: while a first task is still creating database entries, a second one is triggered and deletes the entries the first one just created. For obvious reasons the first task cannot complete.
Question
How can I avoid this using Celery?
- I could prevent the user from triggering a new task in the frontend, but I want to protect the backend as well.
- I could use a named worker, and assign this task to this worker only. But this will limit my performance when multiple users are using the API at the same time (I think?)
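A common pattern for protecting the backend here is a per-user lock taken before the task does its work, typically via an atomic set-if-absent operation in a shared store (for example Redis `SET NX`, or Django's `cache.add`). A minimal self-contained sketch of the idea, using an in-memory dict guarded by a `threading.Lock` as a stand-in for the shared store (`acquire_user_lock`/`release_user_lock` are hypothetical names, not Celery APIs):

```python
import threading

# In production this dict would be a shared store such as Redis;
# here a module-level dict plus a Lock emulates atomic set-if-absent.
_locks = {}
_guard = threading.Lock()

def acquire_user_lock(username):
    """Return True if the per-user lock was obtained,
    False if a task for this user is already running."""
    with _guard:
        if username in _locks:
            return False
        _locks[username] = True
        return True

def release_user_lock(username):
    with _guard:
        _locks.pop(username, None)

# Usage inside the task: skip (or retry later) if a run is in flight.
if acquire_user_lock("some_user"):
    try:
        pass  # calculate and store entries here
    finally:
        release_user_lock("some_user")
```

Unlike a dedicated worker, this only serializes tasks for the *same* user, so tasks for different users can still run in parallel.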
I would be grateful for any ideas that could point me in the right direction.
python
django-rest-framework
celery
worker