Thursday, April 27, 2017

Real-time analytics with Apache Storm

Types of analytics:
  • cube analytics: business intelligence
  • predictive analytics: statistics and machine learning
  • realtime: streaming or interactive
  • batch: bulk processing of large stored data sets
Hadoop: big batch processing
Storm: fast, reactive, real-time processing
Apache Storm Site with Documentation: https://storm.apache.org/
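
To make the batch versus real-time distinction concrete, here is a small shell sketch. It is only an analogy and uses neither Hadoop nor Storm: the function names batch_count and stream_count are made up for illustration. The batch version reads all of its input before reporting anything, while the streaming version emits an updated running count as each record arrives.

```shell
# Batch style: read all input first, then report the totals once at the end.
batch_count() {
  awk '{ count[$0]++ } END { for (w in count) print w, count[w] }' | sort
}

# Streaming style: emit an updated running total for every record as it arrives.
stream_count() {
  awk '{ count[$0]++; print $0, count[$0] }'
}

printf 'storm\nhadoop\nstorm\n' | batch_count
printf 'storm\nhadoop\nstorm\n' | stream_count
```

The first pipeline prints one line per distinct word after the input ends. The second prints one line per input record, so a downstream consumer sees results while data is still arriving.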

Setup

Step 1) (Apple OSX) Install VirtualBox for your operating system: https://www.virtualbox.org/wiki/Downloads
Step 2) (Apple OSX) Install Vagrant for your operating system: https://www.vagrantup.com/
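
Before continuing, it may help to confirm that the tools are on the PATH. The helper below is a minimal sketch; missing_tools is a hypothetical name, and it assumes the VirtualBox installer put its VBoxManage command-line tool on the PATH, which may not hold on every system.

```shell
# Print the name of each given command that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# VBoxManage ships with VirtualBox; vagrant and git are used in the next commands.
missing=$(missing_tools VBoxManage vagrant git)
if [ -n "$missing" ]; then
  printf 'Not found on PATH:\n%s\n' "$missing"
else
  echo "All tools found."
fi
```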
git clone https://github.com/Udacity/ud381
cd ud381
vagrant up   # the first run downloads a roughly 2 GB .vmdk disk image into the VirtualBox folder
vagrant ssh
storm version  # 0.9.2-incubating
cd ..
cd ..
cd vagrant  # /vagrant is a folder shared with the host machine
logout
vagrant ssh
cd /vagrant
cd lesson1/stage1
mvn clean
mvn package
tree
storm jar target/udacity-storm-lesson1_stage1-0.0.1-SNAPSHOT-jar-with-dependencies.jar udacity.storm.ExclamationTopology
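
The jar name above is long and changes with the lesson and version, so one option is to look it up instead of typing it. This is a sketch only: find_topology_jar is a hypothetical helper, and it assumes the Maven build leaves a single *-jar-with-dependencies.jar in target/, as the command above suggests. It prints the storm command rather than running it, so it is safe to try outside the VM.

```shell
# Print the path of the first *-jar-with-dependencies.jar in the given
# directory (default: target). Returns non-zero if none exists.
find_topology_jar() {
  dir=${1:-target}
  for jar in "$dir"/*-jar-with-dependencies.jar; do
    [ -f "$jar" ] && { printf '%s\n' "$jar"; return 0; }
  done
  return 1
}

# Run from lesson1/stage1 after `mvn package`.
if jar=$(find_topology_jar target); then
  echo "storm jar $jar udacity.storm.ExclamationTopology"
else
  echo "No jar found in target/; run mvn package first." >&2
fi
```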
There is a lot of Java implementation. I am not particularly interested in collecting tweets.
