
How to avoid Celery tasks interfering with each other?

Scenario

I have a Celery task that can be triggered via an API. The task involves:

  • Deleting all of a user's old database entries,
  • Calculating results, and
  • Storing the results as new database entries.

In a development environment this works pretty well. Tasks are processed one by one and the database is updated continuously, even if the task is triggered multiple times.

In production, however, triggering the task multiple times leads to errors saying that database entries are not available.

Code

The Django API looks something like this:

from django.contrib.auth.models import User
from django.http import HttpResponse
from rest_framework.views import APIView

from . import tasks
from .models import Entry


class MyApi(APIView):
    def post(self, request, format=None):

        # delete all entries associated with the user
        user = User.objects.get(username=request.user)
        old_entries = Entry.objects.filter(user=user)
        old_entries.delete()

        # create new entries using a celery task
        # (.delay() is called on the task itself, not on its return value)
        tasks.create_entries.delay(user)

        # return a success message
        return HttpResponse("Success")
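
One detail worth flagging in the snippet above: a full User instance is handed to .delay(). Celery serializes task arguments (JSON by default since Celery 4), so depending on the configured serializer this can behave differently between environments. A common, more robust pattern (just a sketch, not necessarily what this project does) is to pass only the primary key and let the task re-fetch the user:

# sketch: pass only the primary key; the task then does User.objects.get(pk=user_id)
tasks.create_entries.delay(user.pk)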

And the task does something like this:

# tasks.py
from celery import shared_task

from .models import Entry


@shared_task
def create_entries(user):

    # just assume I am doing smart calculations here and receive multiple results
    results = calculate_smart_stuff(user)

    # create an Entry instance for each result, linked to the user
    for result in results:
        new_entry = Entry(user=user, data=result)
        new_entry.save()

    # return the results so they end up in the task result backend
    return results.__dict__

The Celery worker is started automatically in a container like so:

celery -A core worker
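
For context, without a --concurrency option the default prefork pool starts one worker process per CPU core, so a multi-core production host will run several of these tasks at the same time. The concurrency can also be set explicitly (the number below is just an example):

celery -A core worker --concurrency=4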

Hypothesis

I figure the main differences between the local setup and the server are

  • Calculation / Communication speed
  • Number of workers (locally limited to 1 worker, because I am using Windows)

So I think the problem is that the workers process tasks in parallel: while a first task is still creating database entries, a second one is triggered and deletes the entries the first one has just created. For obvious reasons the first task cannot complete.
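
If that hypothesis is right, it should also be reproducible locally by giving the Windows worker more than one execution slot, for example with the thread pool (available since Celery 4.4; this only reproduces the race, it is not a fix):

celery -A core worker --pool=threads --concurrency=4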

Question

How can I avoid this using Celery?

  • I could prevent the user from triggering a new task in the frontend, but I want to protect the backend as well.
  • I could use a dedicated (named) worker and assign this task to it only (see the sketch after this list). But this would limit performance when multiple users are using the API at the same time (I think?)
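
For completeness, here is a minimal sketch of what the second option could look like, assuming the usual namespace='CELERY' Django configuration; the module path myapp.tasks and the queue name entries are placeholders:

# settings.py: route only this task to its own queue
CELERY_TASK_ROUTES = {
    "myapp.tasks.create_entries": {"queue": "entries"},
}

# run a dedicated worker that consumes only that queue, one task at a time
celery -A core worker -Q entries --concurrency=1

This would serialize all create_entries runs globally, not just runs for the same user, which is exactly the performance concern above.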

I would be grateful for any ideas that could point me in the right direction.

Tags: python, django-rest-framework, celery, worker
