October 26th, 2017

Apache PredictionIO: Easier machine learning with Spark

Data Storage and Management, Enterprise Java, Java App Dev, Libraries and Frameworks, others, Programing, by admin.

The Apache Foundation has added a new machine learning project to its roster, Apache PredictionIO, an open-sourced version of a project originally devised by a subsidiary of Salesforce.

What PredictionIO does for machine learning and Spark

Apache PredictionIO is built atop Spark and Hadoop, and serves Spark-powered predictions from data using customizable templates for common tasks. Apps send data to PredictionIO’s event server to train a model, then query the engine for predictions based on the model.

Spark, MLlib, HBase, Spray, and and Elasticsearch all come bundled with PredictionIO, and Apache offers supported SDKs for working in Java, PHP, Python, and Ruby. Data can be stored in a variety of back ends: JDBC, Elasticsearch, HBase, HDFS, and their local file systems are all supported out of the box. Back ends are pluggable, so a developer can create a custom back-end connector.

How PredictionIO templates make it easier to serve predictions from Spark

PredictionIO’s most notable advantage is its template system for creating machine learning engines. Templates reduce the heavy lifting needed to set up the system to serve specific kinds of predictions. They describe any third-party dependencies that might be needed for the job, such as the Apache Mahout machine-learning app framework.

Back Top

Leave a Reply