Updated 14/02/2024
Core DTE Modules

Big Data Analytics TOSCA templates


Description

A set of TOSCA templates to deploy Big Data Analytics tools.

TOSCA templates describe, in a cloud-agnostic way, the virtual infrastructure required by the available Big Data Analytics tools.
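
As an illustration of what such a cloud-agnostic description can look like, the following is a minimal sketch of a single-VM template written against the normative types of the TOSCA Simple Profile in YAML 1.0. The node name `analytics_server`, the input names and the default sizes are hypothetical, and the templates in this collection may rely on different or custom node types.

```yaml
# Minimal sketch, not one of the released templates.
tosca_definitions_version: tosca_simple_yaml_1_0

description: Hypothetical single-VM template (illustrative only)

topology_template:
  inputs:
    num_cpus:                  # hypothetical input name
      type: integer
      default: 2
    mem_size:                  # hypothetical input name
      type: scalar-unit.size
      default: 4 GB

  node_templates:
    analytics_server:          # hypothetical node name
      type: tosca.nodes.Compute
      capabilities:
        host:
          properties:
            num_cpus: { get_input: num_cpus }
            mem_size: { get_input: mem_size }
        os:
          properties:
            type: linux

  outputs:
    server_ip:
      value: { get_attribute: [ analytics_server, public_address ] }
```

Nothing in the sketch names a cloud provider: the orchestrator that processes the template is responsible for mapping the requested CPU, memory and operating system onto a concrete VM.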

 

Target Audience

  • TOSCA template developers

License

Apache 2.0

Release Notes

In the previous release, the following templates were created:

  • Kubeflow: Template to deploy the Kubeflow machine learning (ML) workflow platform on top of Kubernetes.
  • Airflow: Template to deploy the Apache Airflow workflow system on top of Kubernetes.
  • CernVMFS: Install CernVM-FS on a VM and mount a user-specified list of CernVM-FS repositories.
  • Kafka: Deploy the Apache Kafka distributed event streaming platform on top of a Kubernetes cluster.
  • MLFlow: Deploy the MLflow platform to manage the ML lifecycle on a single VM, optionally storing the artefacts in an external S3 (or MinIO) storage system.

In this release, the following templates have been created:

  • yProv: Deploy the yProv provenance service on top of a Kubernetes cluster using the yProv Helm chart.
  • openEO: Deploy openEO on top of a Kubernetes cluster using the openEO Argo Helm chart.
  • STAC: Deploy a STAC catalog with a PostgreSQL backend.
  • Horovod: Deploy a Horovod cluster following its installation documentation. The template launches one front-end (FE) node, a set of worker nodes (WNs) with GPUs and a second set of WNs without GPUs. On the GPU nodes it installs the NVIDIA drivers and the NCCL 2 library. It creates a “horovod” user with passwordless SSH access to all the nodes, and it installs NFS to share the /home directory from the FE with all the WNs.
  • EOEPCA ADES: Install ADES on top of a Kubernetes cluster, following the referenced documentation. It deploys the Processing profile, which deploys MinIO and the ZOO-Project DRU.
  • Ophidia: Install a Jupyter-Ophidia-based environment on top of a Kubernetes cluster.

WP5 and WP6 members tested the templates before the release, and plans for testing them with the DT use cases have been established.

Future Plans

Some of the templates (STAC, openEO and Ophidia) are at an early stage and still need to be properly tested by users experienced with these tools, in order to validate the functionality of the deployed infrastructure. The other templates are more mature but may still need some additions.