Tuesday, November 18, 2014

Stream Processing w/ Spark Streaming


Spark Streaming is a component of Spark that provides highly scalable, fault-tolerant stream processing.

val ssc = new StreamingContext(sparkUrl, "Tutorial", Seconds(1), sparkHome, Seq(jarFile))
We create a StreamingContext object by providing the Spark cluster URL, a unique application name, the batch duration we'd like to use for streams, the Spark home directory, and the list of JAR files that are necessary to run the program. "Tutorial" is the unique name given to this application to identify it in Spark's web UI. We elect for a batch duration of 1 second, so the input stream is divided into batches of 1 second of data each.
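Once the context exists, we can define input streams (DStreams) and transformations on them. As an illustrative sketch only (not taken from the original tutorial), assuming text lines arriving on a local socket, a per-batch word count might look like this; the host and port are assumptions:

```scala
// Hypothetical input source: text lines from a local socket.
// "localhost" and 9999 are assumptions for illustration.
val lines = ssc.socketTextStream("localhost", 9999)

// Split each line into words, then count words within each 1-second batch.
val words = lines.flatMap(_.split(" "))
val counts = words.map(word => (word, 1)).reduceByKey(_ + _)

// Print the first few (word, count) pairs of every batch to stdout.
counts.print()
```

Nothing actually runs at this point; like other Spark operations, these transformations are lazy and only execute once the context is started.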

We also need to set an HDFS directory for periodic checkpointing of the intermediate data.
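Checkpointing is enabled on the context itself. A minimal sketch, where the HDFS path is an assumption chosen for illustration (any reliable storage directory works):

```scala
// Hypothetical checkpoint directory; replace with a path
// on your own HDFS cluster (or other reliable storage).
ssc.checkpoint("hdfs://namenode:8020/user/tutorial/checkpoint")
```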

Finally, we need to tell the context to start running the computation we have set up.
ssc.start()
ssc.awaitTermination()

sbt/sbt package run

