EC Data Platform

The EC Data Platform was created by DIGIT B1 as a service to the European Institutions. Its main goal is to support the European Institutions in their data science projects by provisioning and managing an underlying stable and secure cloud-based infrastructure.

The EC Data Platform offers the tools and infrastructure necessary for sharing, testing, developing, deploying and integrating a wide range of data sources, coupled with the analytical solutions to analyse, process, manipulate and explore your data. It is the architecture blueprint for Data Analytics and Business Intelligence services for the European Commission and is partially accessible to Member States.

It comes in three flavours: Amazon AWS, Microsoft Azure and Open Source.

Within the overarching EC Data Platform, DIGIT offers four main services: Cloud-Accelerated, Cloud-Agnostic, Linked Data Platform and DataWorkBench@Azure.

  • The Cloud-Accelerated offering is a web application which allows users to quickly deploy and manage cloud-related services and features such as AWS EC2, S3, RDS, and IAM. Besides these, additional managed infrastructure and software frameworks are provided to data engineers, data scientists, and data analysts for a variety of use cases.
  • The Cloud-Agnostic offering is a web application which allows users to easily deploy and manage containerized data science workloads. The applications used are open-source solutions such as Jupyter Lab, R-Studio, MongoDB, PostgreSQL, Superset, Spark, Knime, MinIO and Virtuoso.
  • The DataWorkBench@Azure offering is a web application which allows users to move quickly on their data analysis projects by directly provisioning data storage and machine learning applications in an environment that is suitable by default for handling Sensitive Non-Confidential (SNC) data.
  • The Linked Data Platform offering uses Virtuoso to deploy a universal server cluster. Virtuoso helps bring together data from different data sources, with a view to accelerating the production of information.

Each offering is made available to you in the form of a Data Science Lab (DSL), which in turn provisions the required resources and services. The building blocks of the Cloud-Accelerated, Cloud-Agnostic and DataWorkBench@Azure offerings cover the three supporting pillars of the data science discipline:

  • Data Visualization
  • Data Query & Processing
  • Data Storage

The building blocks can be organized into nine categories, encompassing Amazon managed services (blue boxes) and open-source services (yellow boxes):

  • Data Analytics Workbench & Workflow Orchestration (e.g., Amazon EC2, Amazon WorkSpaces or R-Studio, Jupyter, H2o, Knime, Apache Airflow)
  • Machine Learning Workbench (e.g., Amazon SageMaker)
  • Database Solutions (e.g., Amazon RDS, Amazon DynamoDB, or MongoDB, PostgreSQL)
  • Data Lake Solutions (e.g., Amazon S3 or MinIO)
  • File Share Solutions (e.g., Amazon EFS)
  • Block Storage Solutions (e.g., Amazon EBS)
  • Big Data Analytics Solutions (e.g., Amazon Athena or Spark)
  • Search Analytics Solutions (e.g., OpenSearch)
  • Visualization Tools (e.g., Amazon QuickSight or Apache Superset, Metabase)

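Figure: overview of the nine service categories (Amazon managed services in blue boxes, open-source services in yellow boxes).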
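For readers who want to work with the nine categories listed above in their own tooling, they can be captured in a small lookup table. The Python sketch below is only an illustration and is not part of the EC Data Platform or any of its APIs. The names `CATALOGUE` and `find_categories` are hypothetical, and the entries simply repeat the example services named in the list.

```python
# Illustrative only: a hand-written copy of the nine categories and their
# example services. This is NOT an official EC Data Platform catalogue or API.
CATALOGUE: dict[str, tuple[str, ...]] = {
    "Data Analytics Workbench & Workflow Orchestration": (
        "Amazon EC2", "Amazon WorkSpaces", "R-Studio", "Jupyter",
        "H2o", "Knime", "Apache Airflow",
    ),
    "Machine Learning Workbench": ("Amazon SageMaker",),
    "Database Solutions": (
        "Amazon RDS", "Amazon DynamoDB", "MongoDB", "PostgreSQL",
    ),
    "Data Lake Solutions": ("Amazon S3", "MinIO"),
    "File Share Solutions": ("Amazon EFS",),
    "Block Storage Solutions": ("Amazon EBS",),
    "Big Data Analytics Solutions": ("Amazon Athena", "Spark"),
    "Search Analytics Solutions": ("OpenSearch",),
    "Visualization Tools": (
        "Amazon QuickSight", "Apache Superset", "Metabase",
    ),
}


def find_categories(service: str) -> list[str]:
    """Return the categories whose example list names `service`.

    Matching is case-insensitive and exact on the full service name.
    """
    wanted = service.casefold()
    return [
        category
        for category, examples in CATALOGUE.items()
        if wanted in (example.casefold() for example in examples)
    ]


if __name__ == "__main__":
    for name in ("MinIO", "Amazon SageMaker"):
        print(name, "->", find_categories(name))
```

A flat mapping keeps the sketch close to the list itself. Which services are actually available depends on the offering and on the DSL that is provisioned.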

The offered services range from scalable single and clustered virtual machines to databases, open-source frameworks, and commercial tools for statistical analysis.

The offerings are supported by Security and Governance provisions that comply with the Commission’s internal guidelines.