Methods for managing Airflow

@[toc]

### Process management tool supervisor

> 1. Install the process management tool supervisor to manage the airflow processes

    easy_install supervisor    # this method is not suitable for Python 3 installations (many problems will occur)
    echo_supervisord_conf > /etc/supervisord.conf

> 2. Edit the file supervisord.conf and add the startup commands

    vi /etc/supervisord.conf

    [program:airflow_web]
    command=/usr/bin/airflow webserver -p 8080

    [program:airflow_worker]
    command=/usr/bin/airflow worker

    [program:airflow_scheduler]
    command=/usr/bin/airflow scheduler

> 3. Start the supervisor service

    /usr/bin/supervisord -c /etc/supervisord.conf

> 4. The airflow services can now be managed with supervisorctl (a few more handy commands are sketched below)

    supervisorctl start airflow_web
    supervisorctl stop airflow_web
    supervisorctl restart airflow_web
    supervisorctl stop all
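
A few additional supervisorctl commands are handy once the programs are defined; a short sketch (the -c path simply matches the config file created above):

```bash
# Show the state of all supervised airflow processes
supervisorctl -c /etc/supervisord.conf status
# Follow the webserver output (program name matches the [program:airflow_web] section)
supervisorctl -c /etc/supervisord.conf tail -f airflow_web
# After editing /etc/supervisord.conf, pick up changes without a full restart
supervisorctl -c /etc/supervisord.conf reread
supervisorctl -c /etc/supervisord.conf update
```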

### Process management tool systemd
> 1. vim /etc/sysconfig/airflow   # systemd reads this file; it usually defines Airflow's environment variables

    AIRFLOW_CONFIG=/root/airflow/airflow.cfg
    AIRFLOW_HOME=/root/airflow

> 2. vim /usr/lib/systemd/system/airflow-webserver.service   # the service unit managed by systemctl
> Other services can be defined in the same way

    [Unit]
    Description=Airflow webserver daemon
    After=network.target postgresql.service mysql.service redis.service
    Wants=postgresql.service mysql.service redis.service

    [Service]
    EnvironmentFile=/etc/sysconfig/airflow
    User=root
    Group=root
    Type=simple
    ExecStart=/bin/bash -c "export PATH=${PATH}:/usr/local/python3/bin/ ; /usr/local/python3/bin/airflow webserver -p 8080 --pid /root/airflow/service/webserver.pid -A /root/airflow/service/webserver.out -E /root/airflow/service/webserver.err -l /root/airflow/service/webserver.log"

    KillMode=process
    Restart=on-failure
    RestartSec=5s
    PrivateTmp=true

    [Install]
    WantedBy=multi-user.target

> 3. systemctl daemon-reload   # load the new unit file
> 4. systemctl status airflow-webserver.service   # check the service status; from here on the service can be managed with systemctl (see the sketch below)
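
Once the unit file is in place, the usual systemctl workflow applies; a short sketch (the unit name matches the file created above, and enabling it on boot is optional):

```bash
systemctl daemon-reload                        # reload unit files after editing
systemctl enable airflow-webserver.service     # optional: start automatically on boot
systemctl start airflow-webserver.service
systemctl status airflow-webserver.service
journalctl -u airflow-webserver.service -f     # follow the service log
```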
    
### Using scripts to manage airflow
```bash
    #!/bin/bash
    #=== Print usage information ===
    usage(){
        echo -e "\n A tool used for managing airflow services\n Usage: airflow.sh {webserver|worker|scheduler|flower|all} {start|stop|status}"
    }
    #=== This is the function about airflow webserver service ===
    webserver_status(){
        echo -e "\e[36m  Checking service status, please wait ... \e[0m"
        sleep  3
        Status=`ps -elf| grep "airflow[ -]webserver" |wc -l`
        if [ $Status -eq 0 ] ;then
            echo -e "\e[31m webserver is stop !!! \e[0m"
        else 
            echo -e "\e[32m webserver is running... \e[0m"
        fi
    }
    webserver_start(){
        echo  -e "\e[36m Starting airflow webserver ... \e[0m"
        sleep 1
        nohup /usr/local/python3/bin/airflow  webserver >> /root/airflow/service/webserver.log 2>&1 &
        webserver_status
    }
    webserver_stop(){
        echo  -e "\e[36m Stopping airflow webserver ... \e[0m"
        sleep 1
        /usr/bin/kill -9 `ps -elf| grep "airflow[ -]webserver" | grep -v grep |awk -F" " '{ print $4 }'`
        rm -rf /root/airflow/airflow-webserver.pid
        webserver_status
    }
    #=== This is the function about airflow scheduler service ===
    scheduler_status(){
        echo -e "\e[36m  Checking service status, please wait ... \e[0m"
        sleep  3
        Status=`ps -elf| grep "airflow[ -]scheduler" |wc -l`
        if [ $Status -eq 0 ] ;then
            echo -e "\e[31m scheduler is stop !!! \e[0m"
        else 
            echo -e "\e[32m scheduler is running... \e[0m"
        fi
    }
    scheduler_start(){
        echo  -e "\e[36m Starting airflow scheduler ... \e[0m"
        sleep 1
        nohup /usr/local/python3/bin/airflow  scheduler >> /root/airflow/service/scheduler.log 2>&1 &
        scheduler_status
    }
    scheduler_stop(){
        echo  -e "\e[36m Stopping airflow scheduler ... \e[0m"
        sleep 1
        /usr/bin/kill -9 `ps -elf| grep "airflow[ -]scheduler" | grep -v grep |awk -F" " '{ print $4 }'`
        rm -rf /root/airflow/airflow-scheduler.pid
        scheduler_status
    }
    #=== This is the function about airflow flower service ===
    flower_status(){
        echo -e "\e[36m  Checking service status, please wait ... \e[0m"
        sleep  3
        Status=`netstat  -anputl| grep 5555 | grep LISTEN | awk -F" " '{ print $7 }' | awk -F"/" '{ print $1 }' |wc -l`
        if [ $Status -eq 0 ] ;then
            echo -e "\e[31m flower is stop !!! \e[0m"
        else 
            echo -e "\e[32m flower is running... \e[0m"
        fi
    }
    flower_start(){
        echo  -e "\e[36m Starting airflow flower ... \e[0m"
        sleep 1
        nohup /usr/local/python3/bin/airflow  flower >> /root/airflow/service/flower.log 2>&1 &
        flower_status
    }
    flower_stop(){
        echo  -e "\e[36m Stopping airflow flower ... \e[0m"
        sleep 1
        /usr/bin/kill -9 `netstat  -anputl| grep 5555 | grep LISTEN | awk -F" " '{ print $7 }' | awk -F"/" '{ print $1 }'`
        rm -rf /root/airflow/airflow-flower.pid
        flower_status
    }
    #=== This is the function about airflow worker service ===
    worker_status(){
        echo -e "\e[36m  Checking service status, please wait ... \e[0m"
        sleep  3
        Status=`ps -elf| grep "airflow serve_logs" | grep -v grep | wc -l`
        celeryStatus=`ps -elf| grep celery |grep -v grep | wc -l`
        if [ $Status -eq 0 ] ;then
            if [ $celeryStatus -eq 0 ]; then
                echo -e "\e[31m worker is stop !!! \e[0m"
            else
               echo -e "\e[32m worker is running... \e[0m"
            fi
        else 
            echo -e "\e[32m worker is running... \e[0m"
        fi
    }
    worker_start(){
        echo  -e "\e[36m Starting airflow worker ... \e[0m"
        sleep 1
        nohup /usr/local/python3/bin/airflow  worker >> /root/airflow/service/worker.log 2>&1 &
        worker_status
    }
    worker_stop(){
        echo  -e "\e[36m Stopping airflow worker ... \e[0m"
        sleep 1
        /usr/bin/kill -9 `ps -elf| grep "airflow serve_logs" | grep -v grep |awk -F" " '{ print $4 }'`
        /usr/bin/kill -9 `ps -elf| grep celery |grep -v grep |awk -F" " '{ print $4 }'`
        rm -rf /root/airflow/airflow-worker.pid
        worker_status
    }
    
    #=== This is the startup option for the airflow service ===
    case "$2" in
      start)
        case "$1" in
          webserver)
            webserver_start
            ;;
          worker)
            worker_start
            ;;
          scheduler)
            scheduler_start
            ;;
          flower)
            flower_start
            ;;
          all)
            webserver_start
            scheduler_start
            flower_start
            worker_start
            ;;
          *)
            usage
            exit 2
          esac
        ;;
      stop)
        case "$1" in
          webserver)
            webserver_stop
            ;;
          worker)
            worker_stop
            ;;
          scheduler)
            scheduler_stop
            ;;
          flower)
            flower_stop
            ;;
          all)
            worker_stop
            flower_stop
            scheduler_stop
            webserver_stop
            ;;
          *)
            usage
            exit 3
          esac
        ;;
      status)
        case "$1" in
          webserver)
            webserver_status
            ;;
          worker)
            worker_status
            ;;
          scheduler)
            scheduler_status
            ;;
          flower)
            flower_status
            ;;
          all)
            webserver_status
            scheduler_status
            flower_status
            worker_status
            ;;
          *)
            usage
            exit 4
          esac
        ;;
      *)
        usage
        exit 1
    esac
```
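
Assuming the script above is saved as airflow.sh (the name is only an example), it takes the service as the first argument and the action as the second:

```bash
chmod +x airflow.sh
./airflow.sh webserver start
./airflow.sh scheduler status
./airflow.sh all stop
```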
### Modifications for obtaining log information

1. Go into incubator-airflow/airflow/www/
2. Modify views.py: add the following code inside class Airflow(BaseView)

    @expose('/logs')
    @login_required
    @wwwutils.action_logging
    def logs(self):
        BASE_LOG_FOLDER = os.path.expanduser(
            conf.get('core', 'BASE_LOG_FOLDER'))
        dag_id = request.args.get('dag_id')
        task_id = request.args.get('task_id')
        execution_date = request.args.get('execution_date')
        dag = dagbag.get_dag(dag_id)
        log_relative = "{dag_id}/{task_id}/{execution_date}".format(
            **locals())
        loc = os.path.join(BASE_LOG_FOLDER, log_relative)
        loc = loc.format(**locals())
        log = ""
        TI = models.TaskInstance
        session = Session()
        dttm = dateutil.parser.parse(execution_date)
        ti = session.query(TI).filter(
            TI.dag_id == dag_id, TI.task_id == task_id,
            TI.execution_date == dttm).first()
        dttm = dateutil.parser.parse(execution_date)
        form = DateTimeForm(data={'execution_date': dttm})
        if ti:
            host = ti.hostname
            log_loaded = False
    
            if os.path.exists(loc):
                try:
                    f = open(loc)
                    log += "".join(f.readlines())
                    f.close()
                    log_loaded = True
                except:
                    log = "*** Failed to load local log file: {0}.\n".format(loc)
            else:
                WORKER_LOG_SERVER_PORT = \
                    conf.get('celery', 'WORKER_LOG_SERVER_PORT')
                url = os.path.join(
                    "http://{host}:{WORKER_LOG_SERVER_PORT}/log", log_relative
                ).format(**locals())
                log += "*** Log file isn't local.\n"
                log += "*** Fetching here: {url}\n".format(**locals())
                try:
                    import requests
                    timeout = None  # No timeout
                    try:
                        timeout = conf.getint('webserver', 'log_fetch_timeout_sec')
                    except (AirflowConfigException, ValueError):
                        pass
    
                    response = requests.get(url, timeout=timeout)
                    response.raise_for_status()
                    log += '\n' + response.text
                    log_loaded = True
                except:
                    log += "*** Failed to fetch log file from worker.\n".format(
                        **locals())
    
            if not log_loaded:
                # load remote logs
                remote_log_base = conf.get('core', 'REMOTE_BASE_LOG_FOLDER')
                remote_log = os.path.join(remote_log_base, log_relative)
                log += '\n*** Reading remote logs...\n'
    
                # S3
                if remote_log.startswith('s3:/'):
                    log += log_utils.S3Log().read(remote_log, return_error=True)
    
                # GCS
                elif remote_log.startswith('gs:/'):
                    log += log_utils.GCSLog().read(remote_log, return_error=True)
    
                # unsupported
                elif remote_log:
                    log += '*** Unsupported remote log location.'
    
            session.commit()
            session.close()
    
        if PY2 and not isinstance(log, unicode):
            log = log.decode('utf-8')
    
        title = "Log"
    
        return wwwutils.json_response(log)
> 3. Restart the service, then access a URL of the following form:

    http://localhost:8085/admin/airflow/logs?task_id=run_after_loop&dag_id=example_bash_operator&execution_date=2018-01-11

> This returns the log of that task for execution_date = 2018-01-11.
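
Since the endpoint returns JSON (wwwutils.json_response), it can be checked from the command line; a small sketch using the same example values as the URL above:

```bash
# Fetch the task log as JSON and pretty-print it
curl -s "http://localhost:8085/admin/airflow/logs?task_id=run_after_loop&dag_id=example_bash_operator&execution_date=2018-01-11" \
  | python -m json.tool
```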
    
### Delete a DAG
> Airflow does not officially expose a direct API for deleting a DAG, and a complete deletion involves multiple metadata tables, so the SQL to delete a DAG is as follows

    set @dag_id = 'BAD_DAG';
    delete from airflow.xcom where dag_id = @dag_id;
    delete from airflow.task_instance where dag_id = @dag_id;
    delete from airflow.sla_miss where dag_id = @dag_id;
    delete from airflow.log where dag_id = @dag_id;
    delete from airflow.job where dag_id = @dag_id;
    delete from airflow.dag_run where dag_id = @dag_id;
    delete from airflow.dag where dag_id = @dag_id;
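
A minimal sketch of running this deletion in a single MySQL session (the credentials and the dag_id 'BAD_DAG' are assumptions; keep the statements in one invocation so the @dag_id user variable persists):

```bash
mysql -u root -p -e "
SET @dag_id = 'BAD_DAG';
DELETE FROM airflow.xcom          WHERE dag_id = @dag_id;
DELETE FROM airflow.task_instance WHERE dag_id = @dag_id;
DELETE FROM airflow.sla_miss      WHERE dag_id = @dag_id;
DELETE FROM airflow.log           WHERE dag_id = @dag_id;
DELETE FROM airflow.job           WHERE dag_id = @dag_id;
DELETE FROM airflow.dag_run       WHERE dag_id = @dag_id;
DELETE FROM airflow.dag           WHERE dag_id = @dag_id;"
```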

### Cluster management script
#### Online script for cluster services
```bash
    #!/usr/bin/env bash
    function usage() {
        echo -e "\n A tool used for starting airflow services
    Usage: 200.sh {webserver|worker|scheduler|flower}
    "
    }
    
    PORT=8081
    ROLE=webserver
    ENV_ARGS=""
    check_alive() {
        PID=`netstat -nlpt | grep $PORT | awk '{print $7}' | awk -F "/" '{print $1}'`
        [ -n "$PID" ] && return 0 || return 1
    }
    
    check_scheduler_alive() {
        PIDS=`ps -ef | grep "/usr/local/bin/airflow scheduler" | grep "python" | awk '{print $2}'`
        [ -n "$PIDS" ] && return 0 || return 1
    }
    
    function get_host_ip(){
        local host=$(ifconfig | grep "inet " | grep "\-\->" | awk '{print $2}' | tail -1)
        if [[ -z "$host" ]]; then
            host=$(ifconfig | grep "inet " | grep "broadcast" | awk '{print $2}' | tail -1)
        fi
        echo "${host}"
    }
    
    start_service() {
        if [ $ROLE = 'scheduler' ];then
            check_scheduler_alive
        else
            check_alive
        fi
        if [ $? -ne 0 ];then
            nohup airflow $ROLE $ENV_ARGS > $BASE_LOG_DIR/$ROLE/$ROLE.log 2>&1 &
            sleep 5
            if [ $ROLE = 'scheduler' ];then
                check_scheduler_alive
            else
                check_alive
            fi
            if [ $? -ne 0 ];then
                echo "service start error"
                exit 1
            else
                echo "service start success"
                exit 0
            fi
        else
            echo "service already started"
            exit 0
        fi
    }
    
    function main() {
        if [ -z "${POOL}" ]; then
            echo "the environment variable POOL cannot be empty"
            exit 1
        fi
        source /data0/hcp/sbin/init-hcp.sh
        case "$1" in
            webserver)
                echo "starting airflow webserver"
                ROLE=webserver
                PORT=8081
                start_service
                ;;
            worker)
                echo "starting airflow worker"
                ROLE=worker
                PORT=8793
                local host_ip=$(get_host_ip)
                ENV_ARGS="-cn ${host_ip}@${host_ip}"
                start_service
                ;;
            flower)
                echo "starting airflow flower"
                ROLE=flower
                PORT=5555
                start_service
                ;;
            scheduler)
                echo "starting airflow scheduler"
                ROLE=scheduler
                start_service
                ;;     
            *)
                usage
                exit 1
        esac
    }
    
    main "$@"
```
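
Example usage of the online script (the file name 200.sh comes from its own usage message; the POOL value is an assumption required by the check in main()):

```bash
export POOL=my_pool        # must be non-empty, per the check in main()
./200.sh webserver
./200.sh scheduler
./200.sh worker
```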
#### Cluster service offline script
```bash
    #!/usr/bin/env bash
    function usage() {
        echo -e "\n A tool used for stopping airflow services
    Usage: 200.sh {webserver|worker|scheduler|flower}
    "
    }
    
    function get_host_ip(){
        local host=$(ifconfig | grep "inet " | grep "\-\->" | awk '{print $2}' | tail -1)
        if [[ -z "$host" ]]; then
            host=$(ifconfig | grep "inet " | grep "broadcast" | awk '{print $2}' | tail -1)
        fi
        echo "${host}"
    }
    
    function main() {
        if [ -z "${POOL}" ]; then
            echo "the environment variable POOL cannot be empty"
            exit 1
        fi
        source /data0/hcp/sbin/init-hcp.sh
        case "$1" in
            webserver)
                echo "stopping airflow webserver"
                cat $AIRFLOW_HOME/airflow-webserver.pid | xargs kill -9
                ;;
            worker)
                echo "stopping airflow worker"
                PORT=8793
                PID=`netstat -nlpt | grep $PORT | awk '{print $7}' | awk -F "/" '{print $1}'`
                kill -9 $PID
                local host_ip=$(get_host_ip)
                ps -ef | grep celeryd | grep ${host_ip}@${host_ip} | awk '{print $2}' | xargs kill -9
                ;;
            flower)
                echo "stopping airflow flower"
                PORT=5555
                PID=`netstat -nlpt | grep $PORT | awk '{print $7}' | awk -F "/" '{print $1}'`
                kill -9 $PID
                ;;
            scheduler)
                echo "stopping airflow scheduler"
                PID=`ps -ef | grep "/usr/local/bin/airflow scheduler" | grep "python" | awk '{print $2}'`
                kill -9 $PID
                ;;     
            *)
                usage
                exit 1
        esac
    }
    
    main "$@"
```
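
And the corresponding offline usage, under the same assumptions:

```bash
export POOL=my_pool
./200.sh worker        # kills the log server on port 8793 and the matching celery processes
./200.sh webserver
```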
### Modifying the Airflow time zone

By default, Airflow uses UTC; in China's time zone, local time is UTC + 8 hours. The steps below switch Airflow to China's time zone throughout by modifying the Airflow source code. They target Airflow 1.10.0; other versions are similar and can follow the same approach.

1. Modify airflow.cfg in the AIRFLOW_HOME directory and set
   default_timezone = Asia/Shanghai

2. Go to the installation location of the airflow package, i.e. the site-packages directory; the files modified below are given relative to it.
   This is where I installed the airflow package (for your reference):
   cd /usr/local/python3/lib/python3.6/site-packages/airflow

3. Modify utils/timezone.py

    # Add the following under the line (line 27) utc = pendulum.timezone('UTC')
    from airflow import configuration as conf
    try:
        tz = conf.get("core", "default_timezone")
        if tz == "system":
            utc = pendulum.local_timezone()
        else:
            utc = pendulum.timezone(tz)
    except Exception:
        pass

    # Modify the utcnow() function (around line 69)
    # Original code: d = dt.datetime.utcnow()
    # Change it to:  d = dt.datetime.now()
4. Modify utils/sqlalchemy.py

    # Add the following under the line (line 37) utc = pendulum.timezone('UTC')
    from airflow import configuration as conf
    try:
        tz = conf.get("core", "default_timezone")
        if tz == "system":
            utc = pendulum.local_timezone()
        else:
            utc = pendulum.timezone(tz)
    except Exception:
        pass

    # Also comment out cursor.execute("SET time_zone = '+00:00'") in utils/sqlalchemy.py (line 124)

5. Modify www/templates/admin/master.html (line 31)

    Change   var UTCseconds = (x.getTime() + x.getTimezoneOffset()*60*1000);
    to       var UTCseconds = x.getTime();

    Change   "timeFormat":"H:i:s %UTC%",
    to       "timeFormat":"H:i:s",

6. Finally, restart the airflow webserver (a quick check is sketched below).
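
As a quick sanity check (just a sketch, assuming the same Python install path as above), the patched timezone helper should now report local Asia/Shanghai time instead of UTC:

```bash
# Should print a timestamp in local (UTC+8) wall-clock time after the changes above
/usr/local/python3/bin/python3 -c "from airflow.utils import timezone; print(timezone.utcnow())"
```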
