Django development: building a distributed (multi-node) task queue with Celery

Today I'll show how to use Celery and RabbitMQ to build a task queue with two nodes in a Django project (one master node and one child node: the master node publishes tasks, and the child node receives and executes them; the same setup extends naturally to three or more nodes). Celery and RabbitMQ themselves will not be introduced in detail here.

1. Basic environment of the project:
Two Ubuntu 18.04 virtual machines, Python 3.6.5, Django 2.0.4, Celery 3.1.26post2

2. Master node Django project structure:
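Based on the paths referenced in the steps below (`proj/settings.py`, `proj/celery.py`, `proj/__init__.py`, and the `celery1` app), the layout is roughly the following; the exact nesting of `celery1` may differ, but `CELERY_IMPORTS = ("proj.celery1.tasks", )` suggests it lives under `proj`:

```text
proj/
├── manage.py
└── proj/
    ├── __init__.py
    ├── settings.py
    ├── celery.py
    ├── urls.py
    └── celery1/
        ├── __init__.py
        ├── tasks.py
        └── views.py
```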

3. Celery configuration in settings.py:

import djcelery
# Queue and Exchange are RabbitMQ concepts, not covered in detail here
from kombu import Queue, Exchange

djcelery.setup_loader()
BROKER_URL = 'amqp://test:test@192.168.43.6:5672/testhost'
CELERY_RESULT_BACKEND = 'amqp://test:test@192.168.43.6:5672/testhost'

CELERY_TASK_RESULT_EXPIRES = 3600
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
# CELERY_ACCEPT_CONTENT = ['json', 'pickle', 'msgpack', 'yaml']

CELERY_DEFAULT_EXCHANGE = 'train'
CELERY_DEFAULT_EXCHANGE_TYPE = 'direct'

CELERY_IMPORTS = ("proj.celery1.tasks", )

CELERY_QUEUES = (
    Queue('train', routing_key='train'),
    Queue('predict', routing_key='predict'),
)
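The broker URL encodes the RabbitMQ credentials, host, port, and virtual host. As a quick sanity check, the standard library can pull the pieces apart (a standalone sketch; it does not talk to the broker):

```python
from urllib.parse import urlsplit

# Anatomy of the AMQP URL used above: amqp://user:password@host:port/vhost
url = urlsplit('amqp://test:test@192.168.43.6:5672/testhost')

print(url.username)          # user configured in RabbitMQ -> test
print(url.password)          # -> test
print(url.hostname)          # -> 192.168.43.6
print(url.port)              # -> 5672, the default AMQP port
print(url.path.lstrip('/'))  # virtual host -> testhost
```

The user, password, and virtual host must all exist on the RabbitMQ server, and the user needs permissions on that vhost, or the connection will be refused.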

4. Configuration in celery.py:

# coding:utf8
from __future__ import absolute_import

import os

from celery import Celery
from django.conf import settings

# set the default Django settings module for the 'celery' program.

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')

app = Celery('proj')

# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings')
# app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
app.autodiscover_tasks(settings.INSTALLED_APPS)


@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))

5. Configuration in proj/__init__.py:

from __future__ import absolute_import
from .celery import app as celery_app

6. celery1/tasks.py (tasks defined on the master node are never executed there; they only need to exist so the master can publish them — execution happens on the child nodes):

from __future__ import absolute_import
from celery import task


@task
def do_train(x, y):
    return x + y

7. celery1/views.py:

from rest_framework.views import APIView
from rest_framework.response import Response

from .tasks import do_train


class Test1View(APIView):
    def get(self, request):
        try:
            # queue and routing_key are RabbitMQ concepts: they control
            # which queue the task is sent to, and the child node consumes
            # tasks from the corresponding queue
            ret = do_train.apply_async(args=[4, 2], queue="train", routing_key="train")
            # Block until the result is available
            data = ret.get()
        except Exception as e:
            return Response(dict(msg=str(e), code=10001))
        return Response(dict(msg="OK", code=10000, data=data))
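With a direct exchange, a message is delivered to the queue whose binding key exactly equals the message's routing_key. A minimal pure-Python model of that matching rule (illustration only; the real routing happens inside RabbitMQ, not in your code):

```python
# Simplified model of direct-exchange routing: queue name -> binding key.
# These mirror the two queues declared in CELERY_QUEUES above.
bindings = {
    "train": "train",
    "predict": "predict",
}

def route(routing_key):
    """Return the queues a direct exchange would deliver a message to."""
    return [queue for queue, key in bindings.items() if key == routing_key]

print(route("train"))    # -> ['train']  (consumed by the child node worker)
print(route("unknown"))  # -> []  (no matching binding: the message is dropped)
```

This is why the `queue` and `routing_key` arguments to `apply_async` must match a queue declared on the worker side, or the task will never be picked up.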

8. Child node directory structure:
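Based on steps 9–11 and the worker command in step 12, the child node layout is roughly:

```text
celery1/
├── __init__.py
├── celery.py
├── config.py
└── tasks.py
```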

9. Child node celery1/celery.py:

from __future__ import absolute_import

from celery import Celery

app = Celery('myapp',
             # These must match the broker/backend settings on the master node
             broker='amqp://test:test@192.168.43.6:5672/testhost',
             backend='amqp://test:test@192.168.43.6:5672/testhost',
             # include registers the task modules for this worker
             include=['celery1.tasks'])

app.config_from_object('celery1.config')

if __name__ == '__main__':
    app.start()

10. Child node celery1/config.py:

from __future__ import absolute_import

from kombu import Queue, Exchange

CELERY_TASK_RESULT_EXPIRES = 3600
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_ACCEPT_CONTENT = ['json', 'pickle', 'msgpack', 'yaml']

CELERY_DEFAULT_EXCHANGE = 'train'
# see the RabbitMQ documentation for the available exchange types
CELERY_DEFAULT_EXCHANGE_TYPE = 'direct'

CELERY_QUEUES = (
    Queue('train', exchange=Exchange('train', type='direct'), routing_key='train'),
)

11. Child node celery1/tasks.py (this is the task that actually gets executed; each node's tasks can differ):

from __future__ import absolute_import

import time

from celery1.celery import app


@app.task
def do_train(x, y):
    """Train task: sleep to simulate work, then return the result."""
    time.sleep(3)
    return dict(data=str(x + y), msg="train")
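Without a worker or broker involved, the task body can be sanity-checked by calling a plain function with the same logic (a standalone replica of the task above, with the sleep shortened for the example):

```python
import time

def do_train(x, y):
    # same logic as the Celery task, minus the broker round trip
    time.sleep(0.1)  # shortened from 3 s
    return dict(data=str(x + y), msg="train")

print(do_train(4, 2))  # -> {'data': '6', 'msg': 'train'}
```

This is the exact dictionary that shows up in the `data` field of the JSON response in step 14.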

12. Start Celery on the child node:
Here celery1 is the project name, and -Q train means the worker consumes tasks from the train queue

celery -A celery1 worker -l info -Q train

13. Start the Django project on the master node:

python manage.py runserver

14. Use Postman to request the corresponding view

Request URL: http://127.0.0.1:8000/api/v1/cellery1/test/
The returned result is:
{
    "msg": "OK",
    "code": 10000,
    "data": {
        "data": "6",
        "msg": "train"
    }
}

15. Problems encountered:
1) Celery queue error: AttributeError: 'str' object has no attribute 'items'
Solution: downgrade the redis library from 3.0 to 2.10: pip install redis==2.10
Reference: https://stackoverflow.com/que...

That's all for today. If you have any questions, feel free to discuss.

Tags: Linux Celery Django JSON RabbitMQ

Posted on Mon, 02 Dec 2019 19:43:06 -0500 by !jazz