Introduction

The Cloud Agnostic Data Science Lab (DSL) service can be considered a set of building blocks that form functional, logical groups of applications and virtual machines. It is based on open-source components. The Cloud Agnostic portal, provided by the EC Data Platform, is a web application accessed via EU Login that allows users to easily deploy and manage containerized data science workloads. These containerized workloads, referred to as deployments, are individual instances of their respective services. These services can be applications such as JupyterLab or RStudio, or database tools such as MongoDB or PostgreSQL.

Inside the Cloud Agnostic portal, users have access to a service catalog from which they can deploy any of the available services. When launching a deployment, users can specify configuration options such as CPU and RAM allocation, or the credentials used to access the deployment. After launch, users can open the deployed web-based application in a web browser through an automatically generated link. Once a deployment is no longer needed, the user can terminate it without losing data, since deployments are attached to persistent storage.
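
The deployment lifecycle described above can be illustrated with a small sketch. This is a toy in-memory model, not the portal's actual interface: the class names, field names, and the idea that storage is keyed by deployment name are all assumptions made for illustration. It only shows the behavior the text describes, namely that terminating a deployment removes the running instance while the attached data remains.

```python
from dataclasses import dataclass, field


@dataclass
class DeploymentRequest:
    """Hypothetical launch options; field names are illustrative only."""
    service: str      # catalog entry, e.g. "RStudio" or "PostgreSQL"
    cpu_cores: float  # CPU allocation chosen at launch
    ram_gb: float     # RAM allocation chosen at launch
    username: str     # credentials chosen by the user
    password: str


@dataclass
class Portal:
    """Toy stand-in for the portal (assumption, not its real interface)."""
    running: dict = field(default_factory=dict)  # deployment name -> request
    storage: dict = field(default_factory=dict)  # deployment name -> persisted data

    def launch(self, name: str, request: DeploymentRequest) -> None:
        self.running[name] = request
        # Assumption: storage is keyed by deployment name and outlives the instance.
        self.storage.setdefault(name, [])

    def terminate(self, name: str) -> None:
        # Only the running instance is removed; the stored data stays.
        del self.running[name]


portal = Portal()
portal.launch("analysis", DeploymentRequest("RStudio", 2, 8, "alice", "example-password"))
portal.storage["analysis"].append("results.csv")
portal.terminate("analysis")
```

After `terminate`, the deployment is no longer listed as running, but `portal.storage["analysis"]` still holds the data written while it was active.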

The underlying architecture has been designed and implemented based on the following core concepts:

  • Scalability: It should be possible to increase the workloads the platform can handle in a planned, longer-term manner when required.
  • Elasticity: The platform should automatically handle growth or decrease in workloads as demand changes.
  • Security: User management should be enabled, providing authentication and authorization controls for users. The platform should ensure data confidentiality and integrity.
  • Availability: The platform should be available 90% of the time over the year.
  • Reliability: The platform should be able to withstand faults and failures within regions or zones. The platform should also support disaster recovery.
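
To make the availability target concrete, the following minimal sketch converts an availability percentage into a yearly downtime budget. The function name and the choice of a 365-day year are assumptions made for this example; the source only states the 90% figure.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def allowed_downtime_hours(availability: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the downtime budget implied by an availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * period_hours


budget = allowed_downtime_hours(0.90)
print(f"{budget:.0f} hours, about {budget / 24:.1f} days")
# prints "876 hours, about 36.5 days"
```

Under these assumptions, a 90% target allows roughly 876 hours of unavailability per year.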