Detailed Airflow Installation Walkthrough
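Airflow is a data workflow management tool open-sourced by Airbnb. For usage, see the official documentation at http://pythonhosted.org/airflow/ . This post shares the installation process and the pitfalls I ran into along the way.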
Installing airflow

To avoid affecting other programs, I did not want to replace the existing Python 2.6.6. The goal is for 2.6 and 2.7 to coexist, with pip, virtualenv and related tools installed only under Python 2.7.

To install a standalone Python 2.7, pass a different directory as the prefix to configure. make install then installs into that prefix directory instead of the default location under /usr/local/bin.

1. Download the Python 2.7.11 source from https://www.python.org/downloads/source/

2. Build and install from source:

    su - root
    cd /usr/local/
    tar -zxvf Python-2.7.11.tgz
    mv Python-2.7.11 python27
    cd python27
    ./configure --prefix=/usr/local/python27    # change to your own path
    make
    make install

3. Install setuptools. It must be installed under python27; the server cannot reach the internet, so the source tarball was downloaded separately.

    tar zvxf setuptools-23.1.0.tar.gz
    cd setuptools-23.1.0/
    /usr/local/python27/bin/python setup.py install

4. Install pip. It must also be installed under python27, again from a downloaded source tarball because the server is offline. (The PyPI index can be pointed at the Douban mirror.)

    tar zvxf pip-8.1.2.tar.gz
    cd pip-8.1.2/
    /usr/local/python27/bin/python setup.py install

5. Install virtualenv. For other installation methods, see the official site https://virtualenv.pypa.io/en/latest/index.html

    tar zvxf virtualenv-15.0.2.tar.gz
    cd virtualenv-15.0.2/
    /usr/local/python27/bin/python setup.py install

   virtualenv also has to be installed once under Python 2.6; otherwise creating a Python 2.7 virtualenv from Python 2.6 fails.

6. Running the virtualenv command requires network access, so a proxy is still needed. I used ccproxy (download: http://www.ccproxy.com/). Set these environment variables on the Linux host:

    export https_proxy=xxx.xxx.xxx.xxx:808
    export http_proxy=xxx.xxx.xxx.xxx:808

7. Create an isolated environment with virtualenv:

    virtualenv --python=/usr/local/python27/bin/python airflowenv

   After running source airflowenv/bin/activate, the shell uses Python 2.7.

8. Install MySQL (not covered here).

9. As root, install mysql-devel:

    yum install mysql-devel

10. Install mysql-python. Download MySQL-python-1.2.5.zip from the official Python package site (PyPI), unpack it, then:

    source airflowenv/bin/activate
    cd MySQL-python-1.2.5
    python setup.py install

11. Install gevent:

    source airflowenv/bin/activate
    pip install gevent

12. Install airflow:

    source airflowenv/bin/activate
    export AIRFLOW_HOME=~/airflow    # change to your own path
    pip install airflow
    # initialize the database
    airflow initdb

13. Edit $AIRFLOW_HOME/airflow.cfg with vi. Add the MySQL connection and set the executor; adjust the other parameters as needed.

    executor = LocalExecutor
    sql_alchemy_conn = mysql://username:password@ip:port/dbname

14. Run airflow initdb again. This time the tables are created in MySQL.

15. Install supervisor and use it to start airflow. If airflow dies, supervisor restarts it automatically.

    source airflowenv/bin/activate
    pip install supervisor

   Edit supervisord.conf to specify the program to start and the log output path:

    [program:airflow_scheduler]
    command=/xxx/airflowenv/bin/airflow scheduler
    stdout_logfile=/tmp/airflow_scheduler.log

   Start it with:

    supervisord -c /xxx/xxx/airflow/supervisord.conf

Problems encountered during installation

1. airflow initdb fails:

    (airflowenv)root@127.0.0.1:/xxx/xxx/airflowenv/bin$ airflow initdb
    Traceback (most recent call last):
      File "/xxx/xxx/airflowenv/bin/airflow", line 4, in <module>
        from airflow import configuration
      File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/airflow/__init__.py", line 31, in <module>
        from airflow.models import DAG
      File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/airflow/models.py", line 56, in <module>
        from airflow import settings, utils
      File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/airflow/settings.py", line 76, in <module>
        engine = create_engine(SQL_ALCHEMY_CONN, **engine_args)
      File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/sqlalchemy/engine/__init__.py", line 386, in create_engine
        return strategy.create(*args, **kwargs)
      File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/sqlalchemy/engine/strategies.py", line 75, in create
        dbapi = dialect_cls.dbapi(**dbapi_args)
      File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 92, in dbapi
        return __import__('MySQLdb')
    ImportError: No module named MySQLdb

   The mysql-python module is missing. Download MySQL-python-1.2.5.zip, unpack it, then:

    cd MySQL-python-1.2.5
    python setup.py install

2. Installing mysql-python fails because the _mysql.c extension cannot be compiled:

    _mysql.c:36:23: error: my_config.h: No such file or directory
    _mysql.c:38:19: error: mysql.h: No such file or directory
    _mysql.c:39:26: error: mysqld_error.h: No such file or directory
    _mysql.c:40:20: error: errmsg.h: No such file or directory

   The mysql-devel package is missing on the Linux host. Install it with yum install mysql-devel, or download the mysql-devel rpm manually and install it yourself.

3. Starting the webserver with airflow webserver -p 8080 fails:

    Error: class uri 'gevent' invalid or not found:

    [Traceback (most recent call last):
      File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/gunicorn/util.py", line 140, in load_class
        mod = import_module('.'.join(components))
      File "/xxx/xxx/software/python27/lib/python2.7/importlib/__init__.py", line 37, in import_module
        __import__(name)
      File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/gunicorn/workers/ggevent.py", line 22, in <module>
        raise RuntimeError("You need gevent installed to use this worker.")
    RuntimeError: You need gevent installed to use this worker.
    ]

   Install gevent with pip:

    pip install gevent

(Editor: 李大同)
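Two of the failures above (the missing MySQLdb module and the missing gevent module) only show up when airflow initdb or airflow webserver is run. A quick check inside the activated virtualenv can surface them earlier. The following is a minimal sketch, not part of the original procedure: the helper name missing_modules is hypothetical, and the module list is an assumption based only on the two errors described in this post.

```python
import importlib


def missing_modules(names):
    """Return the names that cannot be imported, preserving input order."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is not installed in the current interpreter.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed list: the two modules whose absence caused the errors above.
    required = ["MySQLdb", "gevent"]
    absent = missing_modules(required)
    if absent:
        print("Missing modules: " + ", ".join(absent))
    else:
        print("All required modules can be imported.")
```

Run it with the python from airflowenv (after source airflowenv/bin/activate) so that it checks the same interpreter airflow uses. It relies only on the standard library importlib module, which is available in Python 2.7.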