Monday, November 17, 2014

SAMOA - Scalable Advanced Massive Online Analysis | otnira golb!


What is SAMOA? SAMOA is a tool for mining big data streams. It is a distributed streaming machine learning (ML) framework; think of it as Mahout, but for stream mining. SAMOA provides a programming abstraction for distributed streaming ML algorithms (refer to this post for a definition of stream ML), so that new ML algorithms can be developed without dealing with the complexity of the underlying stream processing engines (SPEs, such as Twitter Storm and S4). SAMOA is also extensible: new SPEs can be integrated into the framework. Together, these features let SAMOA users code a distributed streaming ML algorithm once and execute it on multiple SPEs.

Why SAMOA? Big Data is always evolving, and one way to mine it is the streaming ML paradigm. In this paradigm, the ML model used for mining takes in feedback in real time, so the model is updated faster.
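
To make the "code once, execute on multiple SPEs" idea more concrete, here is a minimal sketch in Python. It is not SAMOA's actual API: the names `Processor`, `Engine`, `LocalEngine`, `RoundRobinEngine`, `MajorityClassLearner` and `merged_counts` are all hypothetical, and the two "engines" are in-process stand-ins for real SPEs such as Storm or S4. The learner is a deliberately trivial streaming model (it predicts the most frequent label seen so far) that updates itself on every incoming event.

```python
from abc import ABC, abstractmethod
from collections import Counter
from typing import Callable, Iterable, List, Optional


class Processor(ABC):
    """Engine-agnostic unit of streaming logic (hypothetical interface)."""

    @abstractmethod
    def process(self, event: str) -> None:
        ...


class MajorityClassLearner(Processor):
    """Toy streaming learner: keeps label counts, updated one event at a time."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def predict(self) -> Optional[str]:
        # The current model is usable at any point in the stream.
        return self.counts.most_common(1)[0][0] if self.counts else None

    def process(self, event: str) -> None:
        # Each event is a label; the model is updated as soon as it arrives.
        self.counts[event] += 1


class Engine(ABC):
    """Stand-in for a stream processing engine (hypothetical interface)."""

    @abstractmethod
    def run(self, stream: Iterable[str],
            make_processor: Callable[[], Processor]) -> List[Processor]:
        ...


class LocalEngine(Engine):
    """Runs a single processor instance sequentially."""

    def run(self, stream, make_processor):
        processor = make_processor()
        for event in stream:
            processor.process(event)
        return [processor]


class RoundRobinEngine(Engine):
    """Simulates a distributed engine by spreading events over several workers."""

    def __init__(self, parallelism: int) -> None:
        self.parallelism = parallelism

    def run(self, stream, make_processor):
        workers = [make_processor() for _ in range(self.parallelism)]
        for i, event in enumerate(stream):
            workers[i % self.parallelism].process(event)
        return workers


def merged_counts(workers: List[MajorityClassLearner]) -> Counter:
    """Combine the partial models held by each worker into one."""
    return sum((w.counts for w in workers), Counter())


if __name__ == "__main__":
    stream = ["spam", "ham", "spam", "spam", "ham", "spam"]
    for engine in (LocalEngine(), RoundRobinEngine(parallelism=2)):
        workers = engine.run(stream, MajorityClassLearner)
        print(type(engine).__name__, dict(merged_counts(workers)))
```

The point of the sketch is that `MajorityClassLearner` never refers to an engine: the same class runs unchanged on either one, and supporting another engine means adding another `Engine` subclass rather than touching the algorithm.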

Read full article from SAMOA – Scalable Advanced Massive Online Analysis | otnira golb!
