
Big Data Series 5: Pig, a Big Data Analysis Platform

Published: 2020-12-14 02:40:02 | Category: Big Data | Source: compiled from the web

wget http://mirror.bit.edu.cn/apache/pig/pig-0.11.1/pig-0.11.1.tar.gz

tar -xzvf pig-0.11.1.tar.gz

sudo vi /etc/profile

Add:

export PIG_HOME=/home/ysc/pig-0.11.1

export PATH=$PATH:$PIG_HOME/bin

source /etc/profile

cp conf/log4j.properties.template conf/log4j.properties

pig --help
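
Before editing /etc/profile, the same two settings can be tried in the current shell only. This is a minimal sketch that assumes the install path used above; it only checks whether the shell can find the pig launcher and does not start Pig.

```shell
# Assumption: Pig was unpacked to /home/ysc/pig-0.11.1, as in the steps above.
export PIG_HOME=/home/ysc/pig-0.11.1
export PATH="$PATH:$PIG_HOME/bin"

# Report whether the shell can now resolve the pig launcher.
if command -v pig >/dev/null 2>&1; then
  echo "pig found at: $(command -v pig)"
else
  echo "pig not found; check that $PIG_HOME/bin/pig exists"
fi
```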

Local Mode:

1. pig -x local

2. java -cp /home/ysc/pig-0.11.1/pig-0.11.1.jar org.apache.pig.Main -x local

MapReduce Mode (default):

1. pig

2. pig -x mapreduce

3. java -cp /home/ysc/pig-0.11.1/pig-0.11.1.jar:/home/ysc/hadoop-1.2.1/conf org.apache.pig.Main

4. java -cp /home/ysc/pig-0.11.1/pig-0.11.1.jar:/home/ysc/hadoop-1.2.1/conf org.apache.pig.Main -x mapreduce

Prepare the data:

hadoop fs -put /etc/passwd passwd

Interactive Mode:

Start the Pig shell, Grunt (local or MapReduce mode):

pig (or, for local mode: pig -x local)

grunt> A = load 'passwd' using PigStorage(':');

grunt> B = foreach A generate $0 as id;

grunt> dump B;
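
For comparison, the projection done by the two Grunt statements (split each line on ':' and keep the first field) can be reproduced with standard shell tools. This is only a sketch for checking expectations: the sample rows are made up, and Pig's dump prints each value as a tuple such as (root) rather than as a bare line.

```shell
# Hypothetical sample rows in passwd format (not real accounts).
sample='root:x:0:0:root:/root:/bin/bash
ysc:x:1000:1000:ysc:/home/ysc:/bin/bash'

# Same projection as "B = foreach A generate $0 as id;": keep field 1 only.
ids=$(printf '%s\n' "$sample" | cut -d: -f1)
printf '%s\n' "$ids"
```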

Batch Mode:

Write a script:

vi id.pig

Enter:

/* id.pig */

-- load the passwd file

A = load 'passwd' using PigStorage(':');

-- extract the user IDs

B = foreach A generate $0 as id;

-- write the results to a file named id.out

store B into 'id.out';

Run the script (local or MapReduce mode):

pig id.pig (or, for local mode: pig -x local id.pig)

View the result:

hadoop fs -cat id.out/part-m-00000
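
The hadoop fs command above reads from HDFS. When the script is run with pig -x local, the id.out directory is normally written to the local filesystem instead. The helper below is a hypothetical sketch (show_pig_output is not part of Pig or Hadoop) that prints the part files in either case.

```shell
# Print every part file of a Pig output directory.
# Assumption: mode "local" means the script ran with `pig -x local`, so the
# output directory is on the local filesystem rather than in HDFS.
show_pig_output() {
  out_dir=$1
  mode=${2:-mapreduce}
  if [ "$mode" = "local" ]; then
    cat "$out_dir"/part-*
  else
    # Quote the glob so that hadoop, not the local shell, expands it.
    hadoop fs -cat "$out_dir/part-*"
  fi
}

# Usage:
#   show_pig_output id.out          # MapReduce mode, reads from HDFS
#   show_pig_output id.out local    # local mode, reads from the local disk
```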

Managing data from Pig with HCatalog:

Start the Metastore:

hcat_server.sh start & (or: hive --service metastore &)

sudo vi /etc/profile

Add:

export PIG_CLASSPATH=$HCAT_HOME/share/hcatalog/hcatalog-*.jar:\
$HIVE_HOME/lib/hive-metastore-*.jar:$HIVE_HOME/lib/libthrift-*.jar:\
$HIVE_HOME/lib/hive-exec-*.jar:$HIVE_HOME/lib/libfb303-*.jar:\
$HIVE_HOME/lib/jdo2-api-*-ec.jar:$HIVE_HOME/lib/slf4j-api-*.jar

export PIG_OPTS=-Dhive.metastore.uris=thrift://host001:9083

source /etc/profile
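
The * in the jar paths above stands in for version numbers. If your setup needs concrete file names instead of patterns, one option is to resolve the patterns first. The join_jars helper below is a hypothetical sketch, not part of Pig or HCatalog, and it assumes HCAT_HOME and HIVE_HOME are already set and that the paths contain no spaces.

```shell
# Join every file matching the given glob patterns into a colon-separated list.
# Patterns that match nothing are skipped.
# Assumption: paths contain no whitespace, since $pattern is expanded unquoted.
join_jars() {
  result=""
  for pattern in "$@"; do
    for jar in $pattern; do
      [ -e "$jar" ] || continue
      result="${result:+$result:}$jar"
    done
  done
  printf '%s\n' "$result"
}

# Build PIG_CLASSPATH from the same patterns the article lists.
PIG_CLASSPATH=$(join_jars \
  "$HCAT_HOME/share/hcatalog/hcatalog-*.jar" \
  "$HIVE_HOME/lib/hive-metastore-*.jar" \
  "$HIVE_HOME/lib/libthrift-*.jar" \
  "$HIVE_HOME/lib/hive-exec-*.jar" \
  "$HIVE_HOME/lib/libfb303-*.jar" \
  "$HIVE_HOME/lib/jdo2-api-*-ec.jar" \
  "$HIVE_HOME/lib/slf4j-api-*.jar")
export PIG_CLASSPATH
```

Printing the variable with echo "$PIG_CLASSPATH" afterwards shows which jars were actually found; an empty result means none of the patterns matched.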

Create the table:

hcat -e "CREATE TABLE students (name STRING, age INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' STORED AS TEXTFILE;"

Prepare the data:

vi students.txt

Enter (name and age separated by a single tab character):

刘德华	51

张学友	52

刘亦菲	41

杨尚川	27

成龙	55

洪金宝	52

林志玲	40

hadoop fs -put students.txt /user/ysc/students.txt
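
Tabs are easy to lose when typing or pasting into an editor. As an alternative to vi, the file can be generated with printf, which writes a real tab between the two columns. This is a sketch using the same seven rows; the check at the end is an added sanity test, not part of the original steps.

```shell
# Assumption: the two columns are separated by a single tab, which is what
# the table definition and Pig's default loader expect.
# printf reuses the format string for each name/age pair.
printf '%s\t%s\n' \
  '刘德华' 51 \
  '张学友' 52 \
  '刘亦菲' 41 \
  '杨尚川' 27 \
  '成龙' 55 \
  '洪金宝' 52 \
  '林志玲' 40 > students.txt

# Sanity check: count rows that do not have exactly two tab-separated fields.
bad_rows=$(awk -F '\t' 'NF != 2' students.txt | wc -l | tr -d ' ')
echo "rows with wrong field count: $bad_rows"
```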

Start Pig:

pig -Dpig.additional.jars=$PIG_CLASSPATH

Store the data:

students = LOAD '/user/ysc/students.txt' AS (name:chararray,age:int);

dump students;

STORE students INTO 'students' USING org.apache.hcatalog.pig.HCatStorer();

Load the data:

A = LOAD 'students' USING org.apache.hcatalog.pig.HCatLoader();

dump A;


(Editor: 李大同)
