Monday, November 17, 2014

Run Predictive Machine Learning algorithms on Hadoop without even knowing Mapreduce. | Swarm of XeBees



April 30, 2014 Data scientists are usually comfortable working with tools like R and SAS, so writing algorithms as MapReduce jobs, or converting existing ones, is a bit difficult for them. Libraries such as Mahout are available and provide MapReduce implementations of many algorithms, but you still cannot run your algorithm directly on a Hadoop cluster. First you need to build a data model from your data and choose values for some tuning parameters. Changing these parameters in a Hadoop job and rerunning it again and again is a bit of a pain for a data scientist, though for a Java developer it might be fun. Data scientists can do data modeling, or model training, in SAS or R easily and efficiently. Cascading offers a solution: data scientists design their data model in R, export it to a PMML file (what is this? we will see in a moment), and run a Cascading job that executes the model on a Hadoop cluster. Only minimal code needs to be written.
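To make the idea of an exported model more concrete, here is a minimal sketch in Python of what a PMML file carries and how a scoring engine could use it. It is not Cascading's implementation and not the output of any particular R export. The embedded XML is a hand-written, hypothetical linear regression using the standard PMML `RegressionModel` elements, and the field names `x1`, `x2`, `y` and the helper functions `load_linear_model` and `score` are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PMML document for the model y = 1.5 + 2.0*x1 - 0.5*x2.
# A real file exported from R would carry more metadata, but the same core elements.
PMML_TEXT = """<PMML xmlns="http://www.dmg.org/PMML-4_1" version="4.1">
  <Header/>
  <DataDictionary numberOfFields="3">
    <DataField name="x1" optype="continuous" dataType="double"/>
    <DataField name="x2" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="x1"/>
      <MiningField name="x2"/>
      <MiningField name="y" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="x1" coefficient="2.0"/>
      <NumericPredictor name="x2" coefficient="-0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

# PMML elements live in a versioned XML namespace.
NS = {"pmml": "http://www.dmg.org/PMML-4_1"}


def load_linear_model(pmml_text):
    """Read the intercept and per-field coefficients from a PMML regression table."""
    root = ET.fromstring(pmml_text)
    table = root.find(".//pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(model, record):
    """Apply the linear model to one input record (a dict of field name to value)."""
    intercept, coefficients = model
    return intercept + sum(c * record[name] for name, c in coefficients.items())


model = load_linear_model(PMML_TEXT)
prediction = score(model, {"x1": 3.0, "x2": 4.0})
print(prediction)
```

The point of the sketch is the division of labor: the trained parameters travel in a plain XML file, so the model can be trained in one tool and scored by a different engine without rewriting the algorithm by hand.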

Read full article from Run Predictive Machine Learning algorithms on Hadoop without even knowing Mapreduce. | Swarm of XeBees
