presto-0.147+postgresql-9.5.3+msql-5.0.7+hadoop-2.5.2+hive-1
背景每个支持SQL的数据库,都有一个强大的SQL引擎。 而对于SQL引擎,基本都是大同小异的,负责SQL文法解析,语意分析,指定查询树,优化查询树,再到最终的执行,客户端返回结果。 而presto的也跟一般的是一样的。 架构如下:
准备1.postgresql-9.5.3 2.mysql-5.0.7 3.hadoop-2.5.2 4.hive-1.2.1 5.presto-server-0.147 6.presto-cli-0.147-executable.jar
且注意系统要求: Mac OS X or Linux Java 8 Update 60 or higher (8u60+),64-bit Maven 3.3.9+ (for building) Python 2.4+ (for running with the launcher script) 环境搭建_1mysql,postgresql都是在windows这边搭建的,直接就可以使用。 hadoop-2.5.2的搭建手顺之前的博文中已经记载了,此处不再说明。 hive的环境,解压后就可以使用了。 这里主要说一下hive的两种CLI工具: 1.hive shell 2.beeline 现在官网标记beeline是new,hive shell是older。建议使用beeline,结果的现实比较直观易懂。跟mysql的比较像。 beeline的启动,如果hive使用默认的debery数据库的话,请使用下面的方式启动 ./bin/beeline -u jdbc:hive2://另外,derby只能同时一个用户使用,否则会报错如下所示: Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /home/myProject/apache-hive-1.2.1-bin/metastore_db. at org.apache.derby.iapi.error.StandardException.newException(Unknown Source) at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.privGetJBMSLockOnDB(Unknown Source) at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.getJBMSLockOnDB(Unknown Source) at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.boot(Unknown Source) at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source) at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source) at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source) at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source) at org.apache.derby.impl.store.raw.RawStore.boot(Unknown Source) at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source) at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source) at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source) at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source) at org.apache.derby.impl.store.access.RAMAccessManager.boot(Unknown Source) at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source) at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source) at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source) at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source) at org.apache.derby.impl.db.BasicDatabase.bootStore(Unknown Source) at org.apache.derby.impl.db.BasicDatabase.boot(Unknown Source) at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source) at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source) at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source) at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(Unknown Source) at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(Unknown Source) at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(Unknown Source) at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Unknown Source) ... 83 more Error applying authorization policy on hive configuration: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient 处理方法也很简单: rm -rf derby.log metastore_db
环境搭建_2presto
启动
结果MySQL presto:test_hive> select * from mysql.sqoop.t1; id | int_col | char_col ----+---------+---------- 1 | 1 | a 2 | 2 | b 4 | 4 | d 3 | 3 | c 5 | 5 | e (5 rows) Query 20160520_101400_00009_k46dt,FINISHED,1 node http://localhost:8080/query.html?20160520_101400_00009_k46dt Splits: 2 total,0 done (0.00%) CPU Time: 0.0s total,0 rows/s,0B/s,100% active Per Node: 0.0 parallelism,0B/s Parallelism: 0.0 0:29 [0 rows,0B] [0 rows/s,0B/s] postgresql presto:test_hive> select * from postgresql.public.test; id | name ----+------ 1 | lily 2 | Tom 3 | Jim (3 rows) Query 20160520_101503_00010_k46dt,1 node http://localhost:8080/query.html?20160520_101503_00010_k46dt Splits: 2 total,0% active Per Node: 0.0 parallelism,0B/s Parallelism: 0.0 0:02 [0 rows,0B/s] mysql&postgresql presto:test_hive> select id,char_col from mysql.sqoop.t1 union select id,name from postgresql.public.test; id | char_col ----+---------- 1 | lily 2 | Tom 3 | Jim 1 | a 2 | b 4 | d 3 | c 5 | e (8 rows) Query 20160520_101532_00011_k46dt,1 node http://localhost:8080/query.html?20160520_101532_00011_k46dt Splits: 6 total,2 done (33.33%) CPU Time: 0.0s total,107 rows/s,17% active Per Node: 0.0 parallelism,0B/s Parallelism: 0.0 0:28 [3 rows,0B/s] hive
--未完待续--- (编辑:李大同) 【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容! |