Apache Flink is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams. This post walks through a sample application that consumes the output of the vmstat command as a stream, so let's get our hands dirty.
mkdir flink-steaming-example
cd flink-steaming-example/
mkdir -p src/main/scala
cd src/main/scala/
vim FlinkStreamingExample.scala
import org.apache.flink.streaming.api.scala._

object FlinkStreamingExample {
  def main(args: Array[String]): Unit = {
    // Set up the streaming execution environment
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Read lines from the socket on which vmstat output will be published
    val socketVmStatStream = env.socketTextStream("ip-10-0-0-233", 9000)
    // Print each incoming line to the task's stdout (.out log file)
    socketVmStatStream.print
    // Trigger execution of the job
    env.execute()
  }
}
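Printing the raw lines is enough to verify the pipeline, but in practice you would parse each vmstat line into fields before processing it. Below is a hedged sketch of a small parser that could be plugged into the stream with a map or flatMap; the VmStatParser name, the column indices (free memory at column 4, idle CPU at column 15 of default vmstat output), and the usage comment are illustrative assumptions, not part of the original example.

```scala
// Hypothetical helper for turning a raw vmstat line into structured values.
// Column layout assumes default `vmstat` output:
//   r b swpd free buff cache si so bi bo in cs us sy id wa ...
object VmStatParser {
  // Returns Some((freeMemKb, idleCpuPct)) for a data line,
  // None for header lines or anything else that is not purely numeric.
  def parse(line: String): Option[(Long, Long)] = {
    val cols = line.trim.split("\\s+")
    // Data lines are all-numeric; "procs", "memory", "free" headers are not.
    if (cols.length >= 15 && cols.forall(_.forall(_.isDigit)))
      Some((cols(3).toLong, cols(14).toLong)) // free = col 4, id = col 15
    else
      None
  }
}

// Possible usage inside the job (sketch):
//   socketVmStatStream.flatMap(line => VmStatParser.parse(line)).print
```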
cd -
vim build.sbt
name := "flink-streaming-examples"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies ++= Seq(
  "org.apache.flink" %% "flink-scala" % "1.0.0",
  "org.apache.flink" %% "flink-clients" % "1.0.0",
  "org.apache.flink" %% "flink-streaming-scala" % "1.0.0")
Now build the project:
sbt clean package
Stream some test data:
vmstat 1 | nc -l 9000
Now submit the Flink job:
bin/flink run /root/flink-steaming-example/target/scala-2.10/flink-streaming-examples_2.10-1.0.jar
03/20/2016 08:04:12 Job execution switched to status RUNNING.
03/20/2016 08:04:12 Source: Socket Stream -> Sink: Unnamed(1/1) switched to SCHEDULED
03/20/2016 08:04:12 Source: Socket Stream -> Sink: Unnamed(1/1) switched to DEPLOYING
03/20/2016 08:04:12 Source: Socket Stream -> Sink: Unnamed(1/1) switched to RUNNING
Let's look at the output of the job:
tail -f flink-1.0.0/log/flink-root-jobmanager-0-ip-10-0-0-233.out
You will see the vmstat output scroll by in this log file.