MLlib Machine Learning in Apache Spark
This paper presents MLlib, Spark's open-source distributed machine learning library.
Integrating ML with Spark has several benefits:
- Spark is designed for iterative computation, which naturally suits ML algorithms.
- Improvements to Spark's low-level components translate into performance gains in MLlib.
- It simplifies deployment.
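
To make the first point concrete, here is a minimal single-process sketch of why iterative computation matters for ML. It is not Spark code: the "partitions" are plain Python lists standing in for a distributed dataset, and the names `partition_gradient` and `fit` are hypothetical. Each gradient-descent step maps over the same in-memory partitions and reduces the partial gradients, which is the access pattern an engine built for iteration can serve without reloading the data.

```python
from functools import reduce

def partition_gradient(partition, w):
    """Map step: sum of squared-error gradients for y ~ w * x on one partition."""
    return sum(2.0 * (w * x - y) * x for x, y in partition)

def fit(partitions, steps=200, lr=0.01):
    """Gradient descent that revisits the same partitions on every iteration."""
    n = sum(len(p) for p in partitions)
    w = 0.0
    for _ in range(steps):
        # Map: one partial gradient per partition (hypothetical stand-in for workers).
        grads = map(lambda p: partition_gradient(p, w), partitions)
        # Reduce: combine the partial gradients into one total.
        total = reduce(lambda a, b: a + b, grads)
        w -= lr * total / n
    return w

# Toy data following y = 2x, split into two "partitions".
partitions = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]
w = fit(partitions)
```

On this toy data `w` converges toward 2. The point is only the loop shape: many passes over the same data, each a map followed by a reduce.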
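Features:
- Fast, distributed implementations of ML algorithms.
- Algorithmic optimizations that reduce JVM garbage collection overhead and communication costs.
- Pipeline API: it addresses the overhead of pipeline construction, that is, of chaining stages such as pre-processing, feature extraction, and model fitting.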
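
The pipeline idea can be sketched in plain Python. This is a toy under stated assumptions, not the MLlib API: the classes `Tokenizer`, `VocabCounter`, `VocabModel`, `Pipeline`, and `PipelineModel` below are hypothetical stand-ins that only loosely mirror the transformer/estimator split used by Spark's pipeline package. Stages that have a `fit` method learn something from the data and return a fitted stage; all stages expose `transform`.

```python
class Tokenizer:
    """Transformer-like stage: stateless, only adds a 'tokens' field."""
    def transform(self, rows):
        return [{**r, "tokens": r["text"].lower().split()} for r in rows]

class VocabModel:
    """Fitted stage: turns tokens into count features over a fixed vocabulary."""
    def __init__(self, vocab):
        self.vocab = vocab
    def transform(self, rows):
        return [{**r, "features": [r["tokens"].count(t) for t in self.vocab]}
                for r in rows]

class VocabCounter:
    """Estimator-like stage: fit() learns a vocabulary and returns a VocabModel."""
    def fit(self, rows):
        vocab = sorted({t for r in rows for t in r["tokens"]})
        return VocabModel(vocab)

class PipelineModel:
    """Fitted pipeline: applies every fitted stage in order."""
    def __init__(self, stages):
        self.stages = stages
    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows

class Pipeline:
    """Chains stages; fits estimator-like stages on the output of earlier ones."""
    def __init__(self, stages):
        self.stages = stages
    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            if hasattr(stage, "fit"):
                stage = stage.fit(rows)   # learn from data transformed so far
            rows = stage.transform(rows)  # feed the result to the next stage
            fitted.append(stage)
        return PipelineModel(fitted)

train = [{"text": "Spark is fast"}, {"text": "spark scales"}]
model = Pipeline([Tokenizer(), VocabCounter()]).fit(train)
out = model.transform([{"text": "fast spark spark"}])
```

Declaring the stages once and calling `fit` on the whole chain is what removes the per-stage glue code; the same fitted object can then be reused on new data.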