Celery distributed task processing

1. What is Celery

Celery is a simple, flexible, and reliable distributed system for processing large volumes of messages.

It is an asynchronous task queue focused on real-time processing.

It also supports task scheduling.

Celery architecture


The architecture of Celery consists of three parts: the message broker, the task execution units (workers), and the task result store.

Message broker

Celery does not provide message transport itself, but it integrates easily with third-party message brokers, including RabbitMQ, Redis, and others.

Task execution unit

Worker is the task execution unit provided by Celery. Workers run concurrently on the nodes of a distributed system.

Task result storage

The task result store is used to hold the results of tasks executed by workers. Celery supports several result backends, including AMQP, Redis, and others.
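
To make the three parts concrete, here is a minimal sketch of how they map onto a Celery app (the Redis URLs and database numbers are illustrative assumptions):

from celery import Celery

app = Celery(
    'demo',
    broker='redis://127.0.0.1:6379/0',   # message broker: queues task messages
    backend='redis://127.0.0.1:6379/1',  # result store: holds task return values
)

@app.task
def ping():
    return 'pong'

The worker (the task execution unit) is a separate process started from the command line, as shown in the sections below.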

Version support

Celery version 4.0 runs on
        Python (2.7, 3.4, 3.5)
        PyPy (5.4, 5.5)
    This is the last version to support Python 2.7; from the next version (Celery 5.x) onward, Python 3.5 or newer is required.

    If you're running an older version of Python, you need to be running an older version of Celery:

        Python 2.6: Celery series 3.1 or earlier.
        Python 2.5: Celery series 3.0 or earlier.
        Python 2.4: Celery series 2.2 or earlier.

    Celery is a project with minimal funding, so Microsoft Windows is not supported. Please don't open any issues related to that platform.

2. Usage scenarios

Asynchronous tasks: submit time-consuming operations to Celery for asynchronous execution, such as sending SMS/email, pushing messages, and audio/video processing

Scheduled tasks: run something on a regular schedule, such as daily data statistics

3. Installation and configuration of Celery

pip install celery

Message broker: RabbitMQ/Redis

app = Celery('task_name', backend='xxx', broker='xxx')
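
Both backend and broker take transport URLs. For Redis the format is redis://:password@host:port/db_number; a sketch with placeholder values (the password and database numbers are just examples):

app = Celery('my_tasks',
             broker='redis://:123456@127.0.0.1:6379/2',
             backend='redis://:123456@127.0.0.1:6379/1')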

4. Celery executes asynchronous tasks

Basic use

Create project celerytest

Creating py files: celery_app_task.py

import celery

# broker = 'redis://127.0.0.1:6379/2'  # if Redis has no password
backend = 'redis://:123456@127.0.0.1:6379/1'
broker = 'redis://:123456@127.0.0.1:6379/2'

cel = celery.Celery('test', backend=backend, broker=broker)

@cel.task
def add(x, y):
    return x + y


Create py file: add_task.py to submit a task

from celery_app_task import add

result = add.delay(4, 5)
print(result.id)
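
delay() returns an AsyncResult. Besides saving its id and polling it later (as result.py does below), you can block for the outcome directly; a sketch, assuming a worker is already running (started in the next step):

print(result.get(timeout=10))  # block until the result arrives, or raise after 10 seconds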

Create a py file: run.py, to execute the task, or use the command: celery worker -A celery_app_task -l info

Note: under Windows: celery worker -A celery_app_task -l info -P eventlet

from celery_app_task import cel

if __name__ == '__main__':
    cel.worker_main()
    # cel.worker_main(argv=['--loglevel=info'])
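
Note that from Celery 5 onward the command-line argument order changed; the equivalent worker invocation there is celery -A celery_app_task worker -l info.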

Create py file: result.py to view the task execution results

from celery.result import AsyncResult
from celery_app_task import cel

# 'async' is a reserved word in Python 3.7+, so use a different name
async_result = AsyncResult(id="e919d97d-2938-4d0f-9265-fd8237dc2aa3", app=cel)

if async_result.successful():
    result = async_result.get()
    print(result)
    # async_result.forget()  # Delete the result from the backend
elif async_result.failed():
    print('Execution failed')
elif async_result.status == 'PENDING':
    print('Task waiting to be executed')
elif async_result.status == 'RETRY':
    print('Retrying after task exception')
elif async_result.status == 'STARTED':
    print('Task execution has started')
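
The status strings compared above are defined as constants in celery.states, which avoids typos in the raw strings; a sketch:

from celery import states

if async_result.status == states.PENDING:
    print('Task waiting to be executed')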

Execute add_task.py to add a task and get the task ID

Execute run.py, or run the command: celery worker -A celery_app_task -l info

Execute result.py to check the task status and get the result

Multitask structure

pro_cel
    ├── celery_task     # celery-related package
    │   ├── celery.py   # celery connection and configuration; the file must have this name
    │   ├── tasks1.py   # task functions
    │   └── tasks2.py   # task functions
    ├── check_result.py # check task results
    └── send_task.py    # trigger tasks

celery.py

from celery import Celery

cel = Celery('celery_demo',
             broker='redis://127.0.0.1:6379/1',
             backend='redis://127.0.0.1:6379/2',
             # List the two task modules below; Celery looks for task
             # functions in the corresponding py files
             include=['celery_task.tasks1',
                      'celery_task.tasks2'
                      ])

# Time zone
cel.conf.timezone = 'Asia/Shanghai'
# Do not use UTC; interpret times in the timezone set above
cel.conf.enable_utc = False

tasks1.py

import time
from celery_task.celery import cel

@cel.task
def test_celery(res):
    time.sleep(5)
    return "test_celery task result: %s" % res

tasks2.py

import time
from celery_task.celery import cel

@cel.task
def test_celery2(res):
    time.sleep(5)
    return "test_celery2 task result: %s" % res

check_result.py

from celery.result import AsyncResult
from celery_task.celery import cel

# 'async' is a reserved word in Python 3.7+, so use a different name
async_result = AsyncResult(id="08eb2778-24e1-44e4-a54b-56990b3519ef", app=cel)

if async_result.successful():
    result = async_result.get()
    print(result)
    # async_result.forget()  # Delete the result; it is not deleted from the backend automatically
    # async_result.revoke(terminate=True)   # Terminate the task even if it has already started
    # async_result.revoke(terminate=False)  # Revoke only if the task has not started yet
elif async_result.failed():
    print('Execution failed')
elif async_result.status == 'PENDING':
    print('Task waiting to be executed')
elif async_result.status == 'RETRY':
    print('Retrying after task exception')
elif async_result.status == 'STARTED':
    print('Task execution has started')

send_task.py

from celery_task.tasks1 import test_celery
from celery_task.tasks2 import test_celery2

# Immediately tell celery to execute the test_celery task, passing in one argument
result = test_celery.delay('first task')
print(result.id)
result = test_celery2.delay('second task')
print(result.id)

Add tasks (execute send_task.py), start a worker: celery worker -A celery_task -l info -P eventlet, then check the task results (execute check_result.py)

5. Celery executes scheduled tasks

Set a time at which Celery should execute a task

add_task.py

from celery_app_task import add
from datetime import datetime, timedelta

# Method 1: pass an absolute time (converted to UTC) as eta
# v1 = datetime(2019, 2, 13, 18, 19, 56)
# print(v1)
# v2 = datetime.utcfromtimestamp(v1.timestamp())
# print(v2)
# result = add.apply_async(args=[1, 3], eta=v2)
# print(result.id)

# Method 2: current time plus an offset
ctime = datetime.now()
# Convert local time to UTC, which eta expects by default
utc_ctime = datetime.utcfromtimestamp(ctime.timestamp())
time_delay = timedelta(seconds=10)
task_time = utc_ctime + time_delay

# Use apply_async and set the execution time
result = add.apply_async(args=[4, 3], eta=task_time)
print(result.id)
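
For a simple relative delay, Celery's countdown option is an equivalent shortcut that avoids the datetime arithmetic above; a sketch:

result = add.apply_async(args=[4, 3], countdown=10)  # run the task 10 seconds from now
print(result.id)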

Periodic tasks, similar to crontab

In the multitask structure, celery.py is modified as follows

from datetime import timedelta
from celery import Celery
from celery.schedules import crontab

cel = Celery('tasks', broker='redis://127.0.0.1:6379/1', backend='redis://127.0.0.1:6379/2', include=[
    'celery_task.tasks1',
    'celery_task.tasks2',
])
cel.conf.timezone = 'Asia/Shanghai'
cel.conf.enable_utc = False

cel.conf.beat_schedule = {
    # The schedule name can be anything
    'add-every-10-seconds': {
        # Execute the test_celery function in tasks1
        'task': 'celery_task.tasks1.test_celery',
        # Every 2 seconds
        # 'schedule': 1.0,
        # 'schedule': crontab(minute="*/1"),
        'schedule': timedelta(seconds=2),
        # Arguments passed to the task
        'args': ('test',)
    },
    # 'add-every-12-seconds': {
    #     'task': 'celery_task.tasks1.test_celery',
    #     # Runs at 8:42 on April 11 every year
    #     'schedule': crontab(minute=42, hour=8, day_of_month=11, month_of_year=4),
    #     'args': (16, 16)
    # },
}
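
For reference, a few more crontab patterns from Celery's schedule syntax (which task they drive is up to you):

# 'schedule': crontab()                                   # every minute
# 'schedule': crontab(hour=7, minute=30)                  # every day at 7:30
# 'schedule': crontab(hour=7, minute=30, day_of_week=1)   # every Monday at 7:30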

Start beat: celery beat -A celery_task -l info

Start a worker to execute the tasks: celery worker -A celery_task -l info -P eventlet

6. Using Celery in Django

Create celeryconfig.py in the project directory

import djcelery
djcelery.setup_loader()
CELERY_IMPORTS = (
    'app01.tasks',
)
# Can prevent deadlocks in some situations
CELERYD_FORCE_EXECV = True
# Set the number of concurrent workers
CELERYD_CONCURRENCY = 4
# Allow retries
CELERY_ACKS_LATE = True
# Each worker executes at most 100 tasks before being recycled, to prevent memory leaks
CELERYD_MAX_TASKS_PER_CHILD = 100
# Task timeout, in seconds
CELERYD_TASK_TIME_LIMIT = 12 * 30
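
These uppercase CELERYD_*/CELERY_* names are the old-style settings that django-celery expects; standalone Celery 4+ renamed them to lowercase equivalents. For reference:

# CELERYD_CONCURRENCY          ->  worker_concurrency
# CELERY_ACKS_LATE             ->  task_acks_late
# CELERYD_MAX_TASKS_PER_CHILD  ->  worker_max_tasks_per_child
# CELERYD_TASK_TIME_LIMIT      ->  task_time_limit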

Create tasks.py in the app01 directory

from celery import task

@task
def add(a, b):
    with open('a.text', 'a', encoding='utf-8') as f:
        f.write('a')
    print(a + b)

View function views.py

from django.shortcuts import render, HttpResponse
from app01.tasks import add
from datetime import datetime, timedelta

def test(request):
    # result = add.delay(2, 3)  # execute immediately
    ctime = datetime.now()
    # Convert local time to UTC, which eta expects by default
    utc_ctime = datetime.utcfromtimestamp(ctime.timestamp())
    time_delay = timedelta(seconds=5)
    task_time = utc_ctime + time_delay
    result = add.apply_async(args=[4, 3], eta=task_time)
    print(result.id)
    return HttpResponse('ok')
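
To reach this view you also need a URL route; a minimal sketch in the Django 1.x style of the django-celery era (the URL pattern here is an assumption):

from django.conf.urls import url
from app01 import views

urlpatterns = [
    url(r'^test/$', views.test),
]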

settings.py

INSTALLED_APPS = [
    # ...
    'djcelery',
    'app01',
]

from djagocele import celeryconfig
BROKER_BACKEND = 'redis'
BROKER_URL = 'redis://127.0.0.1:6379/1'
CELERY_RESULT_BACKEND = 'redis://127.0.0.1:6379/2'
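
With django-celery installed, the worker can then be started through manage.py (a sketch; this relies on djcelery's management commands being registered via INSTALLED_APPS):

python manage.py celery worker --loglevel=info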
