Use case description - Scenario 4 - Big data analytics

Use case description - Problem statement

As a data scientist, I need to perform data analytics operations on a large amount of data.

To meet the use case goals, the following tools from the portal will be leveraged:

Tool	Description	Key capability
Jupyter Notebook + Spark	The Jupyter Notebook is a web application for creating and sharing documents that contain code, visualizations, and text. It can be used for data science, statistical modeling, machine learning, and much more. In this scenario it is the interface used to run Spark jobs.	Trigger Spark execution; perform advanced analytics
MinIO	MinIO offers high-performance, S3-compatible object storage. It is native to Kubernetes and, according to the project, is available on every public cloud, every Kubernetes distribution, the private cloud, and the edge. MinIO is software-defined and is 100% open source under GNU AGPL v3.	Load and store the data
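How the two tools fit together can be illustrated with a minimal sketch: a notebook cell configures Spark to reach MinIO through the Hadoop S3A connector, reads a dataset from a bucket, and writes a result back. The helper names `minio_s3a_conf` and `run_analytics`, the endpoint, the credentials, and the bucket paths are all hypothetical, and the sketch assumes that PySpark and the `hadoop-aws` connector are available in the notebook environment; the portal's actual setup may differ.

```python
from typing import Dict


def minio_s3a_conf(endpoint: str, access_key: str, secret_key: str,
                   use_ssl: bool = False) -> Dict[str, str]:
    """Build Spark settings that point the Hadoop S3A connector at MinIO.

    All argument values are placeholders supplied by the caller.
    """
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # MinIO is typically addressed as endpoint/bucket rather than
        # bucket.endpoint, hence path-style access.
        "spark.hadoop.fs.s3a.path.style.access": "true",
        "spark.hadoop.fs.s3a.connection.ssl.enabled": str(use_ssl).lower(),
        "spark.hadoop.fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
    }


def run_analytics(conf: Dict[str, str], input_path: str, output_path: str) -> None:
    """Read a CSV dataset from MinIO, compute summary statistics, store them."""
    # Imported lazily so the configuration helper works without PySpark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("big-data-analytics")
    for key, value in conf.items():
        builder = builder.config(key, value)
    spark = builder.getOrCreate()

    # Load the data from object storage (assumes a CSV file with a header row).
    df = spark.read.csv(input_path, header=True, inferSchema=True)

    # A simple analytics step: count, mean, stddev, min, max per column.
    summary = df.describe()

    # Store the result back in object storage as Parquet.
    summary.write.mode("overwrite").parquet(output_path)
    spark.stop()
```

In a notebook, one might call `run_analytics(minio_s3a_conf("http://minio.example:9000", "ACCESS_KEY", "SECRET_KEY"), "s3a://input-bucket/data.csv", "s3a://output-bucket/summary")`, replacing every value with the ones provided by the portal.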