Scala: integrating Spark Streaming with Kafka (Spark 2.3, Kafka 0.10)
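The Maven dependency is as follows:
<dependency>
The official example code is as follows:
/*
Running the above code produces errors such as the following:
Exception in thread "main" org.apache.kafka.common.config.ConfigException: Missing required configuration "bootstrap.servers" which has no default value.
Solution:
The error shows that the Kafka parameters were not set. Modify the official code as follows:
package cn.xdf.userprofile.stream
Run it as follows: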
Start Kafka:
bin/kafka-server-start ./etc/kafka/server.properties &
[2018-10-22 11:24:14,748] INFO [GroupCoordinator 0]: Stabilized group group1 generation 1 (__consumer_offsets-40) (kafka.coordinator.group.GroupCoordinator)
[2018-10-22 11:24:14,761] INFO [GroupCoordinator 0]: Assignment received from leader for group group1 for generation 1 (kafka.coordinator.group.GroupCoordinator)
[2018-10-22 11:24:14,779] INFO Updated PartitionLeaderEpoch. New: {epoch:0,offset:0},Current: {epoch:-1,offset-1} for Partition: __consumer_offsets-40. Cache now contains 0 entries. (kafka.server.epoch.LeaderEpochFileCache)
[2018-10-22 11:28:19,010] INFO [GroupCoordinator 0]: Preparing to rebalance group group1 with old generation 1 (__consumer_offsets-40) (kafka.coordinator.group.GroupCoordinator)
[2018-10-22 11:28:19,013] INFO [GroupCoordinator 0]: Group group1 with generation 2 is now empty (__consumer_offsets-40) (kafka.coordinator.group.GroupCoordinator)
[2018-10-22 11:29:29,424] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 11 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2018-10-22 11:39:29,414] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 1 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2018-10-22 11:49:29,414] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 1 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
Run Spark:
/usr/local/spark-2.3.0/bin/spark-submit --class cn.xdf.userprofile.stream.DirectKafka --master yarn --driver-memory 2g --num-executors 1 --executor-memory 2g --executor-cores 1 userprofile2.0.jar localhost:9092 test
2018-10-22 11:28:16 INFO  DAGScheduler:54 - Submitting 1 missing tasks from ResultStage 483 (ShuffledRDD[604] at reduceByKey at DirectKafka.scala:46) (first 15 tasks are for partitions Vector(1))
2018-10-22 11:28:16 INFO  TaskSchedulerImpl:54 - Adding task set 483.0 with 1 tasks
2018-10-22 11:28:16 INFO  TaskSetManager:54 - Starting task 0.0 in stage 483.0 (TID 362, localhost, executor driver, partition 1, PROCESS_LOCAL, 7649 bytes)
2018-10-22 11:28:16 INFO  Executor:54 - Running task 0.0 in stage 483.0 (TID 362)
2018-10-22 11:28:16 INFO  ShuffleBlockFetcherIterator:54 - Getting 0 non-empty blocks out of 1 blocks
2018-10-22 11:28:16 INFO  ShuffleBlockFetcherIterator:54 - Started 0 remote fetches in 0 ms
2018-10-22 11:28:16 INFO  Executor:54 - Finished task 0.0 in stage 483.0 (TID 362). 1091 bytes result sent to driver
2018-10-22 11:28:16 INFO  TaskSetManager:54 - Finished task 0.0 in stage 483.0 (TID 362) in 4 ms on localhost (executor driver) (1/1)
2018-10-22 11:28:16 INFO  TaskSchedulerImpl:54 - Removed TaskSet 483.0, whose tasks have all completed, from pool
2018-10-22 11:28:16 INFO  DAGScheduler:54 - ResultStage 483 (print at DirectKafka.scala:47) finished in 0.008 s
2018-10-22 11:28:16 INFO  DAGScheduler:54 - Job 241 finished: print at DirectKafka.scala:47, took 0.009993 s
-------------------------------------------
Time: 1540178896000 ms
-------------------------------------------
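The log above points at reduceByKey (DirectKafka.scala:46) and print (DirectKafka.scala:47), but the modified source appears here only as its package line. As a rough guide, the following is a minimal sketch of what such a program might look like, not the author's exact code. It assumes the spark-streaming-kafka-0-10 integration for Spark 2.3 is on the classpath. The group id group1 is taken from the Kafka broker log; the 2-second batch interval, the "latest" offset reset, and the disabled auto-commit are illustrative assumptions. The essential change is that kafkaParams sets bootstrap.servers from the first command-line argument.

```scala
package cn.xdf.userprofile.stream

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object DirectKafka {
  def main(args: Array[String]): Unit = {
    if (args.length < 2) {
      System.err.println("Usage: DirectKafka <brokers> <topics>")
      System.exit(1)
    }
    val brokers = args(0) // e.g. localhost:9092
    val topics = args(1)  // comma-separated topic names, e.g. test

    val sparkConf = new SparkConf().setAppName("DirectKafka")
    // Assumption: 2-second batches; the original interval is not shown.
    val ssc = new StreamingContext(sparkConf, Seconds(2))

    val topicsSet = topics.split(",").toSet

    // The new Kafka consumer requires bootstrap.servers and both deserializers.
    // Leaving out bootstrap.servers triggers the ConfigException shown earlier.
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> brokers,
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "group1",                           // matches the broker log
      "auto.offset.reset" -> "latest",                  // assumption
      "enable.auto.commit" -> (false: java.lang.Boolean) // assumption
    )

    val messages = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](topicsSet, kafkaParams)
    )

    // Word count over the record values of each batch.
    val words = messages.map(_.value).flatMap(_.split(" "))
    val wordCounts = words.map(word => (word, 1L)).reduceByKey(_ + _)
    wordCounts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

With the spark-submit command shown earlier, localhost:9092 becomes the broker list and test the topic. Because reduceByKey is applied to each batch separately, the printed counts are per batch rather than cumulative, and a batch with no records prints only the Time header, as in the output above.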
Start the producer (from the kafka_2.11-1.0.0 directory):
bin/kafka-console-producer.sh --topic test --broker-list localhost:9092
> hello you
> hello me
View the result:
(hello,2)
(me,1)
(you,1)
2018-10-22 11:57:08 INFO  JobScheduler:54 - Finished job streaming job 1540180628000 ms.0 from job set of time 1540180628000 ms
2018-10-22 11:57:08 INFO  JobScheduler:54 - Total delay: 0.119 s for time 1540180628000 ms (execution: 0.072 s)
2018-10-22 11:57:08 INFO  ShuffledRDD:54 - Removing RDD 154 from persistence list
2018-10-22 11:57:08 INFO  MapPartitionsRDD:54 - Removing RDD 153 from persistence list
2018-10-22 11:57:08 INFO  BlockManager:54 - Removing RDD 153
2018-10-22 11:57:08 INFO  BlockManager:54 - Removing RDD 154
2018-10-22 11:57:08 INFO  MapPartitionsRDD:54 - Removing RDD 152 from persistence list
2018-10-22 11:57:08 INFO  BlockManager:54 - Removing RDD 152
2018-10-22 11:57:08 INFO  MapPartitionsRDD:54 - Removing RDD 151 from persistence list
2018-10-22 11:57:08 INFO  BlockManager:54 - Removing RDD 151
2018-10-22 11:57:08 INFO  KafkaRDD:54 - Removing RDD 150 from persistence list
2018-10-22 11:57:08 INFO  BlockManager:54 - Removing RDD 150
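The printed counts can be reproduced without Spark or Kafka. The sketch below is a plain-Scala check, assuming both producer lines arrived in the same batch. WordCountCheck and countWords are hypothetical names introduced here, and the logic (split on whitespace, count occurrences per word) is an assumption about what the job does, inferred from its output.

```scala
object WordCountCheck {
  // Split each line on whitespace and count how often each word occurs.
  def countWords(lines: Seq[String]): Map[String, Long] =
    lines
      .flatMap(_.trim.split("\\s+"))
      .filter(_.nonEmpty)
      .groupBy(identity)
      .map { case (word, occurrences) => (word, occurrences.size.toLong) }

  def main(args: Array[String]): Unit = {
    // The two lines typed into the console producer.
    val counts = countWords(Seq("hello you", "hello me"))
    // Sorted by word, this prints:
    // (hello,2)
    // (me,1)
    // (you,1)
    counts.toSeq.sortBy(_._1).foreach(println)
  }
}
```

If the two lines had fallen into different batches, the streaming job would instead print (hello,1) in each batch, since the counts are not carried over between batches.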
(Editor: 李大同)