Installation on Kubernetes

XDM can be installed and executed on Kubernetes. XDM provides a Helm chart to easily configure and install XDM on Kubernetes. The following section describes how to get started with the XDM Helm chart and which options can be configured.

Prerequisites

The installation of XDM on Kubernetes or OpenShift requires the following prerequisites:

A Kubernetes cluster.
A local installation of Helm version 3.
Helm must be configured to access the Kubernetes cluster.
A valid license key with the option "Cloud Installation" enabled. See Licensing for more information on how to obtain a license key.

Add Helm repository

Before you can use helm install, you need to add the XDM chart repository to your current Helm installation. The following command adds the XDM repository:

helm repo add ubs https://www.ubs-hainer.com/downloads/XDM3/helm

Update the Helm chart

Each version of XDM will provide a separate version of the Helm chart. To update the Helm chart to the latest version, enter the following command:

helm repo update

This will download the latest version of the ubs chart repository.

Install chart

The following command installs the XDM Helm chart to the configured Kubernetes cluster:

helm install [NAME] ubs/xdm -f [VALUES] --version [RELEASE] --namespace [NAMESPACE] --create-namespace

NAME: Specify the name of the installation. This name can be used in further helm commands to execute commands on the specific installation.
VALUES: Specify the path to the values file for the Helm chart. For more information on the values.yaml file, see Chart configuration.
RELEASE: Specify the XDM release that should be installed. If the --version parameter is omitted the latest stable XDM version will be installed. A valid release version looks like: 3.22.31.
NAMESPACE: Specify the Kubernetes namespace of the XDM installation. The Helm chart will store all deployments, pods, services, etc. in this namespace. If the command line argument --create-namespace is omitted the namespace must exist.
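
When the installation is scripted, the command line above can be assembled from its parts. The following is a minimal sketch, not part of the XDM tooling; the function name helm_install_args and the example release name, values file, and namespace are hypothetical. It only uses the flags shown in the command above.

```python
import shlex
from typing import List, Optional


def helm_install_args(name: str, values_file: str, namespace: str,
                      version: Optional[str] = None,
                      create_namespace: bool = True) -> List[str]:
    """Assemble the argument list for the helm install call shown above."""
    args = ["helm", "install", name, "ubs/xdm", "-f", values_file]
    if version is not None:
        # Without --version, the latest stable XDM version is installed.
        args += ["--version", version]
    args += ["--namespace", namespace]
    if create_namespace:
        # Without this flag the namespace must already exist.
        args.append("--create-namespace")
    return args


if __name__ == "__main__":
    # Hypothetical installation name, values file, and namespace.
    print(shlex.join(helm_install_args("xdm-test", "values.yaml", "xdm",
                                       version="3.22.31")))
```

Returning a list instead of a string keeps each argument separate, so the result can be passed to subprocess.run without shell quoting issues.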

Use the following command to get a list of available XDM versions:

helm search repo xdm -l

OpenShift specific customization

OpenShift does not allow Docker containers to run as the root user. Therefore, you have to customize your images to grant access to the internal directories and to use ports above 1024, which can be bound by a non-root user.

How to build a Docker image using a Dockerfile

Create a project directory

Make a new folder for your project and navigate to it:

mkdir my-docker-project
cd my-docker-project

Create the Dockerfile

Use one of the Dockerfiles from the dockerfiles directory of the installation package xdm3-installation.zip. For example, this is Dockerfile-ui:

# Base image
FROM docker.ubs-hainer.com/xdm3-ui:latest

ENV http_port 8080
ENV https_port 8443

ARG UID=1000
ARG GID=1000

RUN addgroup -g ${GID} xdm &&\
    adduser -S -s /bin/bash -G xdm -h /home/xdm -u ${UID} xdm &&\
    chown ${UID}:${GID} /etc/nginx/conf.d/default.tmpl &&\
    chown ${UID}:${GID} /etc/nginx/nginx.conf &&\
    chmod 766 /etc/nginx/nginx.conf &&\
    chown -R ${UID}:${GID} /usr/share/nginx/html/ &&\
    mkdir -p /var/tmp/nginx/ &&\
    chown -R ${UID}:${GID} /var/tmp/nginx/ &&\
    mkdir -p /var/log/nginx/ &&\
    chown -R ${UID}:${GID} /var/log/nginx/ &&\
    chown -R ${UID}:${GID} /run/nginx/ &&\
    chown -R ${UID}:${GID} /var/lib/nginx &&\
    chown ${UID}:${GID} /xdm/certificates/xdm.key &&\
    chown ${UID}:${GID} /xdm/certificates/xdm.key &&\
    chown -R ${UID}:${GID} /etc/nginx &&\
    chmod -R 755 /var/lib/nginx/ &&\
    sed -i '1d' /etc/nginx/nginx.conf

# Run image as xdm user
USER xdm

You should customize these parameters:

Parameter	Description
FROM	Specifies the name of the XDM image that is used as the base. The value latest uses the latest available image.
http_port	Specifies the port that is used for HTTP requests. This port is bound to XDM’s user interface. The operating system in the container allows only ports above 1024 to be bound by a non-root user.
https_port	Specifies the port that is used for HTTPS requests. This port is bound to XDM’s user interface. The operating system in the container allows only ports above 1024 to be bound by a non-root user.
UID	Specifies the ID of the user that should be used in the container. All required permissions are granted to this user.
GID	Specifies the ID of the group that should be used in the container. All required permissions are granted to this group.

Build the image

In your project directory, run:

docker build -t my-image:latest .

This command builds the image with the tag my-image:latest.

Summary:

A Dockerfile describes how your Docker image should be built. Use docker build to create the image that you then use for the further installation.

Chart configuration

The XDM chart has a couple of settings that can be customized with a values file. Some of these settings, like a license key, must be specified, whereas some others are optional.

To configure the Helm chart for your local XDM installation, create an empty .yaml file. This YAML file must contain all the settings that need to be customized.

Optional rootless configuration

For security reasons, it is recommended to run the XDM containers with non-root users. For more information about the non-root user configuration for XDM, see Rootless Container Configuration.

General settings

 repository: <repository-url>
 imagePullSecrets: <pull-secrets>
 defaultServiceAccount: <service-account>
 xdm:
    version: <release>
    timezone: <timezone>
    environment:
        id: <environment-id>
        name: <environment-name>
        color: <environment-color>
        contextPath: <environment-context-path>
    certificates:
        - <certificate-secret-name-1>
        - <certificate-secret-name-2>
        - ...

repository: Use this option if you want Kubernetes to pull the images from a different repository.
pull-secrets: A Kubernetes secret that specifies the credentials used to pull the images from the docker repository docker.ubs-hainer.com. The credentials are provided by UBS Hainer during the installation of XDM.
defaultServiceAccount: The name of the service account that is used by the XDM pods. This option allows the use of a default service account for all pods. There is no default for the service account. If the parameter is omitted, the default behavior of your cloud technology will apply.
release: The release version of XDM.
timezone: This property specifies the timezone of the XDM docker containers. The default is Europe/Berlin. If the XDM installation should use a different time zone specify one of the following values:

Table 1. Time zone names (not exhaustive)
Name	Description
UTC	Coordinated Universal Time
US/Pacific	United States Pacific Time (UTC-08:00)
US/Mountain	United States Mountain Time (UTC-07:00)
US/Central	United States Central Time (UTC-06:00)
US/Eastern	United States Eastern Time (UTC-05:00)
Europe/London	Western European Time (UTC+00:00)
Europe/Berlin	Central European Time (UTC+01:00)
Europe/Vilnius	Eastern European Time (UTC+02:00)
Asia/Tel_Aviv	Israel Standard Time (UTC+02:00)
Asia/Tokyo	Japan Standard Time (UTC+09:00)

environment-id: Sets the ID of the XDM installation. When using CasC, the environment-id differentiates between XDM installations, for example between a production and a test environment.
environment-name: Sets the environment name of the XDM installation. Each XDM installation can have a specific environment name that is displayed in the UI. This makes it easier to differentiate between XDM installations, for example between a production and a test environment.
environment-color: Sets the primary color of the XDM UI. It is recommended to use a different color for each XDM installation. This setting applies to both the classic XDM UI and the Prime UI.

For the classic XDM UI, the following predefined colors are available:

violet
red
blue
gray
green

For the Prime UI, the following predefined colors are available:

red
emerald
green
lime
orange
amber
yellow
teal
cyan
sky
blue
indigo
violet
purple
fuchsia
pink
rose
ubs
noir

Alternatively, you can use a valid CSS color definition. In the Prime UI, RGB and hex code color definitions are supported for custom colors. See Color palette and Color syntax for more information. Please consider that the text for the menu entries is always white, so the color should be chosen accordingly. For color names from CSS that are equal to the predefined colors, the predefined colors will be used.

In the classic XDM UI, the secondary column color (used for the background of the side menu footer) can be set using CSS color definitions with the variable secondary_environment_color. The default secondary color is black. In the Prime UI, secondary_environment_color is not supported. The side menu footer background cannot be configured separately.

environment-context-path: Depending on the environment in which XDM is running, it might be desirable to configure a context path under which the XDM user interface can be accessed.

The context path must start with a forward slash but must not end with a forward slash. It is also possible to configure multiple path elements.

If XDM is installed on the machine testdataserver, uses SSL encryption with the default port, and the context path is set to /xdm, then the user interface can be accessed under the following URL:

https://testdataserver/xdm/

Accessing the user interface will only work if a forward slash is present at the end of the URL.
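
The rules for the context path can be checked before the value is written to the values file. The following is a minimal sketch under the rules stated above; the function names validate_context_path and ui_url are hypothetical and not part of XDM, and the sketch assumes SSL on the default port as in the example.

```python
def validate_context_path(path: str) -> str:
    """Return the path unchanged if it satisfies the rules stated above."""
    if not path.startswith("/"):
        raise ValueError("context path must start with a forward slash")
    if path.endswith("/"):
        raise ValueError("context path must not end with a forward slash")
    return path


def ui_url(host: str, context_path: str) -> str:
    """Build the UI URL for an SSL installation on the default port."""
    # The trailing slash is required for the UI to be reachable.
    return f"https://{host}{validate_context_path(context_path)}/"


if __name__ == "__main__":
    print(ui_url("testdataserver", "/xdm"))
```

A path with multiple elements, such as /tools/xdm, passes the same checks.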

certificate-secret-name: Name of a secret which contains certificates to import. Each certificate is contained inside the secret as a separate key.

Task execution

execution:
  cleanupCron: <cleanup-cron>
  concurrent: <concurrent-executions>
  retentionPeriod: <retention-period>

cleanup-cron: The executionRetentionPeriod setting defines how long an execution is kept before it is automatically deleted. This setting can be configured globally, at template level, or for specific tasks or workflows.

The cron expression controls the execution of the process that cleans up the expired executions. It specifies what time the process that deletes old executions runs. All executions older than the specified executionRetentionPeriod will be deleted by this process.

By default, the cleanup process runs every day at midnight. The default cron expression is 0 0 0 * * ?.
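
To make the default expression easier to read, the fields can be labeled one by one. The sketch below assumes that the expression uses a six-field, Quartz-style layout (second, minute, hour, day of month, month, day of week); the source does not state the field order, so treat the field names as an assumption. The function name label_cron_fields is hypothetical.

```python
from typing import Dict

# Assumed field order (Quartz-style); not confirmed by the XDM documentation.
FIELD_NAMES = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")


def label_cron_fields(expression: str) -> Dict[str, str]:
    """Split a six-field cron expression and label each field."""
    parts = expression.split()
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(parts)}")
    return dict(zip(FIELD_NAMES, parts))


if __name__ == "__main__":
    for name, value in label_cron_fields("0 0 0 * * ?").items():
        print(f"{name}: {value}")
```

Under this assumption, the default reads as second 0, minute 0, hour 0 on every day, which matches the stated behavior of running daily at midnight.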

concurrent-executions: XDM allows parallel execution of tasks, hence several processes run on the dataflow server at the same time. Since the computing capacity of the dataflow server is not unlimited, the maximum number of parallel running tasks should be restricted.

Specify the parameter to limit the number of concurrent running tasks. It controls how many tasks can run simultaneously. As soon as more task executions are triggered than the maximum allowed, the surplus tasks are collected in a queue and processed sequentially as capacity allows.

In a Helm-based installation, set this parameter with execution.concurrent in the values file. In a Docker Compose installation, it must be set in the environment section of the core-server in the docker-compose.yml file.

The default value for the parameter is 5. A value of 0 indicates that no task executions are queued. In this case each task execution is executed immediately.

Due to technical reasons, the maximum number of concurrent tasks in the dataflow server is 20. Therefore, the maximum value for this parameter is 20. A higher value has no effect on the number of parallel executed tasks and can lead to dataflow aborts.
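
The queueing behavior described above can be illustrated with a small model. This is only a sketch of the documented rules (default 5, 0 disables queueing, upper bound 20), not XDM's actual scheduler; the function name admit and the task names are hypothetical.

```python
from typing import List, Sequence, Tuple

MAX_CONCURRENT = 20  # upper bound stated in the text above


def admit(triggered: Sequence[str], concurrent: int = 5) -> Tuple[List[str], List[str]]:
    """Split triggered executions into (running, queued) for a given limit."""
    if not 0 <= concurrent <= MAX_CONCURRENT:
        raise ValueError(f"concurrent must be between 0 and {MAX_CONCURRENT}")
    if concurrent == 0:
        # 0 means nothing is queued: every execution starts immediately.
        return list(triggered), []
    return list(triggered[:concurrent]), list(triggered[concurrent:])


if __name__ == "__main__":
    running, queued = admit(["t1", "t2", "t3"], concurrent=2)
    print("running:", running)
    print("queued:", queued)
```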

retention-period: Specifies the period after which task and workflow executions are deleted. If an execution is older than the specified retention period, the XDM service deletes the execution, including its working directories and log files. The value must start with P for a date period or with PT for a time period. It is based on the ISO-8601 duration format.

Examples:

P2D: Period for 2 days.
PT15M: Period for 15 minutes.
PT1H10M: Period for 1 hour and 10 minutes.

XDM name: executionRetentionPeriod
Type: String
Default value: -1 (no executions are deleted)
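
A retention period can be checked before it is placed in the values file. The sketch below parses a simplified subset of the ISO-8601 duration format (days, hours, minutes, seconds only) and is not the parser XDM uses; the function name parse_retention_period is hypothetical. The special default value -1 is not a duration and is rejected by this sketch.

```python
import re
from datetime import timedelta

# Simplified subset of ISO-8601 durations: P[nD][T[nH][nM][nS]]
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_retention_period(value: str) -> timedelta:
    """Convert a duration such as P2D or PT1H10M into a timedelta."""
    match = _DURATION.match(value)
    # Reject strings without any component, such as "P", "PT", or "P2DT".
    if match is None or value == "P" or value.endswith("T"):
        raise ValueError(f"not a supported duration: {value!r}")
    parts = {key: int(number) for key, number in match.groupdict().items()
             if number is not None}
    return timedelta(**parts)


if __name__ == "__main__":
    for example in ("P2D", "PT15M", "PT1H10M"):
        print(example, "->", parse_retention_period(example))
```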

JDBC driver

XDM uses the JDBC API to access the source and target databases of your tasks. You must provide a suitable JDBC driver (version 4.0 or higher) for all database systems that you plan to use as the source or target for XDM tasks.

You can execute wget commands to download the required JDBC drivers. The Helm Chart will make sure that the JDBC drivers are mounted to their respective pods. The driver files must be stored in the directory /xdm/config/jdbc-drivers. You can use the wget option -P to specify the target directory, in which the files will be stored.

 xdm:
  jdbcDriversCommand: |-
      wget -P /xdm/config/jdbc-drivers/ -v --no-check-certificate https://jdbc.postgresql.org/download/postgresql-42.3.3.jar

License

XDM needs a valid license key, which is checked at each startup. The content of the license key must be specified as follows:

 xdm:
    license: |
       LICENSE = 26,A10DD76D280184CC0FE65B06170AB838
       LIC0001 = COMPANY:UBS_HAINER_GMBH
       LIC0002 = PRODUCT:XDM3
       LIC0003 = CPUID:9999
       ....
    licenseSecret: <secret-name>

licenseSecret: As an alternative to storing the license as plaintext, a Kubernetes secret can be used. The name of the secret can be chosen freely, whereas the key name in which the license data is stored has to be 'license.txt'. The content of the secret should be base64 encoded to keep the line breaks in the license key.
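
One way to prepare such a secret is to generate its manifest from the license text. The following is a minimal sketch; the function name license_secret_manifest, the secret name xdm-license, and the placeholder license text are hypothetical. It relies on the standard Kubernetes Secret layout, in which values under data are base64 encoded, and it uses the key license.txt required above.

```python
import base64
import json


def license_secret_manifest(secret_name: str, license_text: str) -> dict:
    """Build a Secret manifest that stores the license under the key license.txt."""
    # Base64 encoding preserves the line breaks of the license key.
    encoded = base64.b64encode(license_text.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": secret_name},
        "type": "Opaque",
        "data": {"license.txt": encoded},
    }


if __name__ == "__main__":
    # Placeholder text; use the content of your real license key instead.
    placeholder = "LICENSE = <key>\nLIC0001 = COMPANY:<company>\n"
    print(json.dumps(license_secret_manifest("xdm-license", placeholder), indent=2))
```

The printed JSON can be saved to a file and applied to the namespace of the XDM installation; the secret name is then used as the value of licenseSecret.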

Configuration as Code

XDM can be configured to use a Git repository as source. The repository is used to populate the objects of the XDM installation. XDM continuously monitors the Git repository and automatically applies any changes on the Git files to the XDM installation. In this case XDM objects can not only be created via manual clicks in the UI, but also via the code in Git.

Git configurations are written in YAML files. These can be written manually or generated by an export via the UI. The structure of the file tree, within the Git repository, is up to the user. XDM will parse all YAML files recursively over all directories. The user can create a single YAML file for each object, or combine objects within one file. The dependencies between the YAML files will be resolved by XDM before the files are applied to the XDM installation.

CasC enables the operators of an XDM installation to automatically set up a defined set of objects. Manual synchronization of several XDM installations is not necessary. Test and production instances are kept in a consistent state through the Git configuration.

XDM will only track changes to YAML files. If other files are committed to Git, XDM will not process them. If scripts, e.g. for workflows, hooks, environments, etc. are stored in separate files and are not part of the YAML file, you need to make an additional change to the lastChangedDate field in the corresponding YAML file. Otherwise, XDM will not take this change into account.

The following snippet shows how to enable and configure configuration as code:

 casc:
   enabled: true
   url: <url>
   user: <user>
   password: <password>
   secret: <secret-name>
   cron: <cron>
   path: <path>
   branch: <branch>
   owner: <owner>

url: Specifies the URL of the Git repository. This repository must be accessible via HTTP or HTTPS.
user: The user ID used to clone the Git repository. This setting is not required, if the git clone command does not require authentication.
password: The password of the Git user. This setting is not required, if no user is required, or the user does not require a password for authentication.
secret: If Git authentication is required, a Kubernetes secret can be used as an alternative to storing the credentials as plaintext. The name of the secret can be chosen freely. The secret must be of type kubernetes.io/basic-auth, or it must contain the keys 'username' and 'password'. If the user does not require a password for authentication, the 'password' key can be empty but must exist.
cron: A cron expression controls the interval at which the Git repository should be checked for changes. More details of the cron syntax can be found under Scheduling.
path: Specifies the directory path in the Git repository that should be monitored. This setting is optional. If the setting is not specified, XDM will monitor and apply all changes made in the Git repository. If a path is specified, XDM will only monitor and apply changes made in the specified directory.
branch: Specifies the branch that should be monitored. XDM will only apply changes that are committed to that specific branch. By default, this is set to master.
owner: Specifies an XDM user that will be the owner of all created objects in the XDM installation. These include all objects that are created by the Configuration as Code process. The specified username must exist in the used authentication provider. For example, if the XDM installation uses LDAP as authentication provider, the user must exist in the LDAP system.
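
For the secret setting, a manifest of type kubernetes.io/basic-auth can be generated as follows. This is a minimal sketch; the function name basic_auth_secret_manifest, the secret name, and the user name are hypothetical. The password key is always written, even when it is empty, because the key must exist.

```python
import base64
import json


def _b64(value: str) -> str:
    """Base64-encode a string for the data section of a Secret."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def basic_auth_secret_manifest(secret_name: str, username: str,
                               password: str = "") -> dict:
    """Build a basic-auth Secret with the keys username and password."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": secret_name},
        "type": "kubernetes.io/basic-auth",
        # The password key is present even if the value is empty.
        "data": {"username": _b64(username), "password": _b64(password)},
    }


if __name__ == "__main__":
    # Hypothetical secret name and Git user without a password.
    print(json.dumps(basic_auth_secret_manifest("xdm-git", "git-user"), indent=2))
```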

This feature cannot be used with an OpenID-based authentication provider such as Keycloak. These systems do not allow retrieval of the group membership for a specific user. The roles are required to check whether the respective user is allowed to create or overwrite objects in the XDM installation.

The specified user needs list permission and create permission for all types of objects.

For more information on how to create a list access permission, see the section list access permissions.

If the created objects should be shared with other users, appropriate permission must be granted to the set of users or role. The permission(s) can be specified in the YAML file of the respective object.

Database

XDM uses a PostgreSQL database to store information about its configuration. This includes the defined tasks, the history of task executions, your environment configuration, your user configuration, the cached object-containers, etc.

 database:
   extern: false
   url: <url>
   user: <user>
   password: <password>
   secret: <secret-name>
   schema: <schema>
   upgrade: <upgrade>
   sample: <sample>

extern

Controls if an internal Pod with a PostgreSQL database is started, or if an external database is used. If an external database is used, XDM will not start its own PostgreSQL database.

url

Specifies the JDBC URL of the PostgreSQL database that you want to use as administration database.

This option is only required if an external database is used. If the internal database is used, the URL is set to the correct value.

user

Specifies the username that is used to access the database.

password

Specifies the password that is used to access the database.

secret

As an alternative to storing the database credentials as plaintext, a Kubernetes secret can be used. The name of the secret can be chosen freely. The secret must be of type kubernetes.io/basic-auth, or it must contain the keys 'username' and 'password'.

schema

Specifies the schema name. This is set to public by default.

hostname

Specifies the host name of the PostgreSQL database that you want to use as administration database. This option is only required if an external database is used.

This is only required if the Grafana service is used.

port

Specifies the port of the PostgreSQL database that you want to use as administration database. This option is only required if an external database is used.

This is only required if the Grafana service is used.

name

Specifies the database name of the PostgreSQL database that you want to use as administration database. This option is only required if an external database is used.

This is only required if the Grafana service is used.

upgrade

XDM requires tables and sequences in the specified database. This option controls whether the database objects are created automatically by XDM or whether the user needs to create them manually. If the value is set to automatic, XDM creates these objects automatically. In that case, the specified user ID must have the permissions to create tables and sequences inside the database.

sample

Controls if an additional PostgreSQL database is started that contains sample data. This sample database is used in the XDM tutorials. By default, the sample database is not started.

Security

 security:
   requiredUserRole: <required-user-role>
   requiredUserRoleMessage: <required-user-role-message>
   adminRole: <admin-role>
   adminDefaultPermissions: <admin-default-permissions>
   purchaserRole: <purchaser-role>
   purchaserDefaultPermissions: <purchaser-default-permissions>
   systemObjectsRole: <system-objects-role>
   passwordSeed: <password-seed>
   secret: <secret-name>
   jwt:
     expireTime: <jwt-expire-time>
     validTime: <jwt-valid-time>
     secret: <jwt-secret>

required-user-role: Specifies a role to which the usage of XDM will be limited. This property enables the administrator to define a list of roles, which will restrict the access to XDM. In order to use XDM, an authorized user has to be a member of one of these roles. By default, all authorized users are able to use XDM.
required-user-role-message: This setting controls the message that is displayed if a user who is not a member of the roles specified by the property xdm.required-user-role tries to log in. By default, the following message will be displayed: Missing one of the following roles: <roles set in xdm.required-user-role>
admin-role: Sets the names of the administrator roles. The default name of the administrator role is ADMIN. To specify multiple administrator roles, use a comma-separated list. The names can be chosen to conform to your naming conventions. Every user who is in one or more of the roles in this list will receive administrative privileges in XDM.
admin-default-permissions: Users that have administrative privileges in XDM can read and create all XDM objects and can grant privileges to other non-administrative users. The privileges of the administrative users can be customized with this property. One or more entries of the following options can be specified:

Privilege	Description
`READ`	Allows administrators to see objects in lists, to see details about objects, and to request a data shop order.
`WRITE`	Allows administrators to modify attributes of an object.
`DELETE`	Allows administrators to delete an object.
`CREATE`	Allows administrators to create new objects in a list.
`EXECUTE`	Allows administrators to execute or schedule a task or workflow template, and to place a data shop order.
`ADMINISTRATION`	Allows administrators to grant permissions for an object to other users.
`SOURCE USAGE`	Allows an environment or a connection to be used as the source of a task.
`TARGET USAGE`	Allows an environment or a connection to be used as the target of a task.
`BROWSE`	For connections only. Allows administrators to see the contents of tables in the schema browser, and in the output of XDM tasks that provide a data preview.
`APPLY SQL`	For connections only. Allows administrators to execute SQL statements for tables in the schema browser, and in the output of XDM tasks that provide a data preview.
`DIAGNOSE`	For task templates, tasks, workflow templates and workflows only. Allows users to see diagnostic data like stage outputs or batch reports.
`SCRIPT`	For credentials only. Allows the usage of this credential in a task stage hook.
`MODIFY DATA`	For data reservation only. Allows modification of a data reservation.

purchaser-role: Specifies the role whose members will be treated as data shop users. These users don’t need full access to all functions of XDM and will receive a customized UI with which they can more easily order test data and see the results of their orders. To specify multiple purchaser roles, use a comma-separated list.
purchaser-default-permissions: The permissions of the purchaser users can be adjusted with this property. The default setting of the property is READ and DELETE. This property is applied when a user from the purchaser role places a data shop order. The permissions set in the property are applied to the resulting execution. One or more of the following options can be specified, separated by commas:

Privilege	Description
`READ`	Allows purchasers to see objects in lists, to see details about objects, and to request a data shop order.
`WRITE`	Allows purchasers to modify attributes of an object.
`DELETE`	Allows purchasers to delete an object.
`ADMINISTRATION`	Allows purchasers to grant permissions for an object to other users.
`BROWSE`	For connections only. Allows purchasers to see the contents of tables in the schema browser, and in the output of XDM tasks that provide a data preview.
`DIAGNOSE`	For task templates, tasks, workflow templates and workflows only. Allows purchasers to see diagnostic data like stage outputs or batch reports.

system-objects-role: Specifies a role whose members are able to see and use the pre-defined matchers, comparable fields, and modification methods that XDM ships with. By default, access to these pre-defined entities is restricted to the system, and only users with administrative permissions are able to see, use, and change them.

Users that are a member of the specified role will be able to see and use the pre-defined matchers, comparable fields, and modification methods, however they will not be able to change them. These objects can only be changed by users with administrative permissions.

password-seed: Specifies a seed that is used to encrypt passwords.

XDM stores users and passwords. These credentials are used during a task execution to authorize against source and target database systems.

By default, the passwords are not encrypted, but rather stored as plain text in the XDM admin database.

Once a seed is specified, XDM encrypts all existing passwords stored in the database after the next restart of the XDM core service. All newly created passwords are encrypted automatically.

If the seed is changed, all previously stored passwords become invalid.

secret: As an alternative to storing the password-seed as plaintext, a Kubernetes secret can be used. The name of the secret can be chosen freely, but the secret must contain the key 'passwordSeed'.
jwt-expire-time: Specifies how many minutes the JWT token is valid. This token is generated by XDM after a user has logged in. By default, the expiration time is 120 minutes.
jwt-valid-time: Specifies the expiration time of the execution token in minutes. This token is used to authenticate executions between core and dataflow. The default expiration time of this JWT is 1440 minutes.
jwt-secret: Specifies the secret with which the JWT token is generated and encrypted.

Security context

A securityContext in Kubernetes defines security and access control settings for a pod or container. These settings can be used to control how the container runs, such as whether it runs as a root user, what user and group memberships it has, and what file system permissions it has.

The following represents a list of the most significant fields that may be used in a securityContext:

runAsUser: Specify the user ID under which the container is running.
runAsGroup: Specify the group ID under which the container is executed.
runAsNonRoot: Forces the container not to be executed as the root user.
fsGroup: Specify the group ID that is applied to the files in the volumes mounted into the pod.

<service>:
    securityContext:
        runAsUser: 1002
        runAsGroup: 1000
        runAsNonRoot: true
        fsGroup: 2000

service

The name of the service. Possible services are:

core
dataflow
file_sink
elasticsearch
generator_source
grafana
graph_store
loki
postgres
prometheus
sample
tempo
ui
webservice_apply_sink
webservice_extract_source
ai_assistance
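
When the same securityContext should apply to several services, the per-service entries can be generated instead of being typed by hand. The following is a minimal sketch; the function name security_context_values is hypothetical, and the IDs are the example values from above. The check for user ID 0 reflects the Kubernetes rule that runAsNonRoot: true cannot be combined with the root user.

```python
import json
from typing import Dict, Iterable

# Service names as listed above.
SERVICES = (
    "core", "dataflow", "file_sink", "elasticsearch", "generator_source",
    "grafana", "graph_store", "loki", "postgres", "prometheus", "sample",
    "tempo", "ui", "webservice_apply_sink", "webservice_extract_source",
    "ai_assistance",
)


def security_context_values(services: Iterable[str], run_as_user: int,
                            run_as_group: int, fs_group: int) -> Dict[str, dict]:
    """Build one securityContext entry per selected service."""
    selected = list(services)
    unknown = [name for name in selected if name not in SERVICES]
    if unknown:
        raise ValueError(f"unknown services: {unknown}")
    if run_as_user == 0:
        raise ValueError("runAsNonRoot: true conflicts with runAsUser 0 (root)")
    context = {
        "runAsUser": run_as_user,
        "runAsGroup": run_as_group,
        "runAsNonRoot": True,
        "fsGroup": fs_group,
    }
    # Each service gets its own copy so later edits do not affect the others.
    return {name: {"securityContext": dict(context)} for name in selected}


if __name__ == "__main__":
    print(json.dumps(security_context_values(["core", "ui"], 1002, 1000, 2000),
                     indent=2))
```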

Service account

In a Kubernetes environment, the service account section refers to the configuration of service accounts for a range of services within a Kubernetes cluster. A service account in Kubernetes is a specific type of account that pods use to facilitate communication with the Kubernetes API server.

In addition to the global defaultServiceAccount setting which applies to all containers / pods, the serviceAccount can be overridden for individual containers.

To specify a serviceAccount per container, use the following syntax:

<service>:
    serviceAccount: "xdm-<service>-account"

service

The name of the service for which the service account is configured. Possible services are:

core
dataflow
file_sink
elasticsearch
generator_source
grafana
graph_store
loki
postgres
prometheus
sample
tempo
ui
webservice_apply_sink
webservice_extract_source
ai_assistance

User management

Each user who intends to work with XDM needs a username and a password. Authentication can be performed by an LDAP server, by an OpenID provider, or by the internal user management of XDM.

User name

LDAP and OpenID authenticators provide additional user information upon successful login. This information can be used to adapt the username to a more meaningful value. By default, the 'givenName' property is used.

The following line will change the displayed username to the user’s family name:

 userManagement:
   fullNameProperty: family_name

Local

XDM offers a built-in user management system that allows you to maintain the usernames, passwords, and roles in a plain text file. Each line represents a separate user and must have the following format:

<user_name>;{<hash_method>}<password>;<roles>;<full_name>;<email>

user_name: Specifies the name of the user.
hash_method: Specifies the hash method for the password. This must be one of the following values:

argon2 - Argon2 password hash
bcrypt - BCrypt password hash
ldap - LDAP SHA password hash
SHA-256 - SHA-256 password hash

password

Specifies the hash sum of the password of the user. The password must be hashed with the previously specified hash method.

roles

Specifies a list of roles for that user. These roles can be used later while granting permissions.

full_name

The full name of the user. This property is used to identify the user in the graphical user interface. The full name is displayed in the user settings and is used to synchronize the display name of a permission recipient.

This field is optional, but required if the e-mail is to be specified.

email

The e-mail address of the user. The e-mail address can be accessed in the various JavaScript / Groovy scripts.

This field is optional, but required if the full name of the user is to be specified.

The following snippet shows how to specify the users in the Helm chart:

 userManagement:
   local: |
     # User name ; Password ; Roles
     admin;{noop}default;ADMIN
     user;{noop}default;USER
     tech;{noop}default;ADMIN
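
The snippet above uses the {noop} prefix, which is not part of the list of hash methods. It presumably marks the password as stored without hashing, so it is probably only suitable for test installations (this is an assumption based on the prefix name). A sketch of an entry that uses a hashed password and the optional full name and e-mail fields could look as follows. The user, the name, and the address are made up, and <bcrypt-hash> stands for a hash that you generate yourself.

```yaml
userManagement:
  local: |
    # User name ; Password ; Roles ; Full name ; E-mail
    # <bcrypt-hash> is a placeholder for the BCrypt hash of the password
    jdoe;{bcrypt}<bcrypt-hash>;USER;Jane Doe;jane.doe@example.com
```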

LDAP

XDM can use an LDAP server to authenticate users. The LDAP server controls which users exist and how they have to authenticate. The LDAP server also controls what roles a user is a member of.

 userManagement:
   ldap:
     enabled: true
     url: <ldap-server-url>
     searchFilter: <search-filter>
     searchBase: <search-base>
     group:
        searchBase: <group-search-base>
        searchFilter: <group-search-filter>
     manager:
        user: <manager-user>
        password: <manager-password>
        secret: <secret-name>

ldap-server-url: The URL of the LDAP server in the form ldap://<hostname>:<port>/<base_db>. Set <hostname> and <port> to the host name and port of your LDAP server. Set <base_db> to the distinguished name of the entry that is the starting point of the search. You can use the ldaps protocol instead of ldap. Example: ldap://ldap.mycompany.com:389/dc=mycompany,dc=com
search-filter: The search filter for the users. Example: uid={0}
search-base: The LDAP directory from which each search starts. If left empty, searches can take longer. Typically, cn=Users is a good value for search-base.
group-search-base: Defines the part of the directory tree, under which group searches should be performed.
group-search-filter: The filter that is used to search for group membership. The default is uniqueMember={0}.
manager-user: The username for the LDAP server, if it requires a login.
manager-password: The password for the LDAP server, if it requires a login.
manager-secret: If the LDAP server requires a login, a Kubernetes secret can be used as an alternative to storing the login credentials as plaintext. The name of the secret can be chosen freely but must contain the keys 'username' and 'password'.
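
Put together, a filled-in configuration might look like the following sketch. The URL, search filter, and search base are taken from the examples above. The group search base and the secret name are hypothetical.

```yaml
userManagement:
  ldap:
    enabled: true
    url: "ldap://ldap.mycompany.com:389/dc=mycompany,dc=com"
    searchFilter: "uid={0}"
    searchBase: "cn=Users"
    group:
      searchBase: "ou=Groups"             # hypothetical group subtree
      searchFilter: "uniqueMember={0}"    # the documented default
    manager:
      secret: xdm-ldap-manager            # hypothetical secret name
```

The referenced secret must contain the keys username and password. A plain Kubernetes Secret manifest that satisfies this could be written as:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: xdm-ldap-manager
type: Opaque
stringData:
  username: "<manager-user>"
  password: "<manager-password>"
```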

OAuth2

XDM supports user authentication with an external OpenID Connect provider. XDM forwards login requests to the configured OpenID Connect provider, and the user logs in at that system. After a successful login, the OpenID system redirects back to XDM. The options described in this section configure this behavior.

As a prerequisite for using an OpenID provider with XDM, the client that XDM uses for authentication must be defined and configured in the provider.

Parallel to the OpenID Connect authentication, the internal user management or LDAP authentication can also be used. When only OpenID Connect authentication should be used, it is possible to deactivate the internal user management by setting the internal user management property file.user to an empty value.

userManagement:
  oauth2:
    registration:
      <name>:
        client-id: <client-id>
        client-secret: <client-secret>
        client-name: <client-name>
        client-authentication-method: <client-authentication-method>
        authorization-grant-type: <authorization-grant-type>
        redirect-uri: <redirect-uri>
        scope: <scope>
    provider:
      <name>:
        authorization-uri: <authorization-uri>
        token-uri: <token-uri>
        jwk-set-uri: <jwk-set-uri>
        user-info-uri: <user-info-uri>
        user-info-authentication-method: <user-info-authentication-method>
        userNameAttribute: <userNameAttribute>
        issuer-uri: <issuer-uri>

name

Specifies the name of the provider, e.g. keycloak, okta, google, etc.

client-id

The ID that uniquely identifies the client.

client-secret

Client-specific secret. If not specified, a public OpenID Connect provider is assumed.

client-name

A descriptive name used for the client. The name is displayed on the login page.

client-authentication-method

The authentication method used when authenticating the client with the authorization server. Valid values are:

client_secret_basic
client_secret_jwt
client_secret_post
none
private_key_jwt

authorization-grant-type

Authorization grant type specifies the method that the client uses to obtain an access token. Valid values are:

authorization_code
client_credentials
jwt_bearer
password
refresh_token

redirect-uri

The client’s registered redirect URI that the authorization server redirects the end-user’s user-agent to, after the end-user has authenticated and authorized access to the client. Typically, this is set to <base-url>/api/login/oauth2/code/<provider-name> where <base-url> is the base URL of your XDM installation.

scope

Sets the scope used for the client. Must be openid for the OpenID Connect protocol.

authorization-uri

The URI for the authorization endpoint.

token-uri

The URI for the token endpoint.

jwk-set-uri

The URI for the JSON Web Key (JWK) Set endpoint.

user-info-uri

The URI for the user info endpoint which provides additional details about the user.

user-info-authentication-method

The authentication method used when sending bearer access tokens in resource requests to resource servers. Valid values are:

FORM
HEADER
QUERY

userNameAttribute

The name of the attribute returned by the ID-token or by the UserInfo response that references the name or identifier of the end-user. Common values are: name, preferred_username, given_name or family_name.

issuer-uri

The URI used to initially configure a ClientRegistration using discovery of an OpenID Connect provider’s configuration endpoint.
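
To show how the options fit together, the following sketch configures a provider named keycloak. All host names, the realm name, and the client ID are hypothetical. The endpoint paths follow the layout that recent Keycloak versions use and are an assumption for this example. Other providers publish different URIs.

```yaml
userManagement:
  oauth2:
    registration:
      keycloak:
        client-id: xdm                            # hypothetical client ID
        client-secret: "<client-secret>"
        client-name: "Company login"
        client-authentication-method: client_secret_basic
        authorization-grant-type: authorization_code
        redirect-uri: "https://xdm.example.com/api/login/oauth2/code/keycloak"
        scope: openid
    provider:
      keycloak:
        # Assumed Keycloak endpoints for a realm called "xdm"
        issuer-uri: "https://sso.example.com/realms/xdm"
        authorization-uri: "https://sso.example.com/realms/xdm/protocol/openid-connect/auth"
        token-uri: "https://sso.example.com/realms/xdm/protocol/openid-connect/token"
        jwk-set-uri: "https://sso.example.com/realms/xdm/protocol/openid-connect/certs"
        user-info-uri: "https://sso.example.com/realms/xdm/protocol/openid-connect/userinfo"
        userNameAttribute: preferred_username
```

The name under registration and the name under provider are the same here, and the last path segment of redirect-uri repeats that name.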

Persistence

persistence:
  <volume>:
    storageClass: <storage-class>
    annotations: <annotations>
    labels: <labels>
    accessMode: <access-mode>
    size: <size>
    existingClaim: <existing-claim>

volume

The name of the volume. Possible volumes are:

data: Stores the XDM task working directories and task execution outputs like logs and reports.
postgres: Stores the data of the integrated PostgreSQL database. This volume is not necessary if an external PostgreSQL database is used.
sample: The data of the sample database. This volume does not exist if the sample database is disabled.
grafana: Stores the data of the Grafana service. This volume is only applicable if the Grafana service is enabled.
elasticsearch: Stores data of the Elasticsearch service. Elasticsearch is used to search through the XDM configuration objects. The volume is only required, if the global search is activated.

storage-class

The name of the Kubernetes storage class that is used for the persistent volume claim. The storage class setting will be set to an empty value if - is specified.

annotations

Annotations that are specified in the .metadata.annotations section of the persistent volume claim deployment.

labels

Labels that are specified in the .metadata.labels section of the persistent volume claim deployment.

accessMode

The access mode of the volume claim. For most pods, a value of ReadWriteOnce is sufficient. However, the dataflow and core pods need to share their directories. In this case a value of ReadWriteMany is needed.

We highly recommend using a cloud storage solution that supports the ReadWriteMany access mode. Without ReadWriteMany, the core and dataflow pods must run on the same node, which removes many of the advantages of Kubernetes (such as flexible scheduling and better resilience). If you still want to use this approach as a temporary workaround, you can configure Kubernetes podAffinity to ensure both pods are scheduled on the same node.

size: The size of the persistent volume.
existingClaim: Specify an existing volume claim. If an existing volume claim is used, XDM will not create a separate volume claim.
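
A short sketch of a persistence block under these rules: the data volume, which is shared by the core and dataflow pods, requests ReadWriteMany, the PostgreSQL volume uses ReadWriteOnce, and the Grafana volume reuses an existing claim. The storage class name, the sizes, and the claim name are assumptions for illustration.

```yaml
persistence:
  data:
    storageClass: "nfs-client"       # hypothetical class that supports ReadWriteMany
    accessMode: ReadWriteMany
    size: 50Gi
  postgres:
    accessMode: ReadWriteOnce
    size: 10Gi
  grafana:
    existingClaim: "xdm-grafana-pvc" # hypothetical, must already exist
```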

Ingress

The Helm Chart includes an ingress deployment to access the XDM UI.

 ingress:
    enabled: true
    domain: <domain>
    annotations: <annotations>
    tls: <tls>
    ingressClassName: <ingressClassName>

domain

Specifies the domain under which the XDM web UI is accessed. The Helm chart creates the ingress deployment for this domain. To enable TLS/SSL protection, specify the tls section with the TLS configuration and the secret name.

annotations

Annotations that are set on the ingress deployment. The following example uses the annotation block to put a basic authentication dialog in front of the XDM installation:

ingress:
  annotations:
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
    nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required - foo'

tls

Specifies the tls block of the ingress deployment. This section is necessary if access to XDM should be secured with https. In this section you need to set up the TLS configuration and the TLS/SSL certificate.

ingress:
  tls:
    - hosts:
        - xdm.example.com
      secretName: xdm-tls

ingressClassName

Specifies the class name that refers to the respective ingress controller. This field is available for Kubernetes 1.18 or higher. In the following example we specify that the nginx controller should be used.

ingress:
  ingressClassName: nginx

For Kubernetes versions before 1.18, use the annotations section instead.

ingress:
  annotations:
    kubernetes.io/ingress.class: nginx
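
The individual ingress options can be combined in one block. The following sketch assumes an nginx ingress controller on Kubernetes 1.18 or higher and a TLS secret named xdm-tls that already exists in the namespace. The domain is a placeholder.

```yaml
ingress:
  enabled: true
  domain: xdm.example.com        # placeholder domain
  ingressClassName: nginx
  tls:
    - hosts:
        - xdm.example.com        # should match the domain above
      secretName: xdm-tls        # existing secret of type kubernetes.io/tls
```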

Prime UI

The Prime UI can be deployed in parallel with the current web UI and is accessed under the /prime/ path. To enable it, activate the Prime UI pod. The following example shows the required settings in the values.yaml file:

prime-ui:
  enabled: true

userManagement:
  oauth2:
    registration:
      <name>:
        redirect-uri: <redirect-uri>/api/login/oauth2/code/<provider>

If you use OpenID authentication, only one redirect URI can be stored. Therefore, logins initiated from Prime UI will redirect to the current web UI. After login, you can switch back to /prime/ in the browser.

Ensure the redirect-uri includes the full path, including everything from /api/… onward. Add /api/login/oauth2/code/<provider> and replace <provider> with the actual provider name, so the final redirect URL is complete.

AI Assistance

The AI Assistance service uses an LLM to help users write scripts in XDM, such as modification methods.

ai_assistance:
    enabled: true

enabled: Controls whether a separate pod is started for the AI Assistance service.

To use the AI assistance feature of XDM, the LLM must be configured in the values.yaml file. More information on how to configure the LLM can be found in the LLM configuration section.

Grafana

Grafana is a cross-platform open source application for graphical representation of data from various data sources such as PostgreSQL or Prometheus.

The Grafana service uses tables of XDM’s admin database as a data source.

grafana:
    enabled: true
    jsonData:
      [...]

enabled: Controls whether a separate pod is started for Grafana. This is not necessary if no graphical display of statistical data is required.
jsonData: Optional parameter to define custom jsonData for the built-in data source of the administration database. More details on how to configure a PostgreSQL data source can be found in the Grafana documentation.

If jsonData is specified, the database entry must be included within the jsonData block. If the database entry is missing, the Grafana Query Builder will not function properly. In particular, it will not display any tables or columns in the corresponding drop-down list.
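
A minimal sketch of a custom jsonData block that keeps the required database entry. The database name is a placeholder because it depends on your installation. The sslmode key is a standard option of the Grafana PostgreSQL data source and is shown only as an example of an additional setting.

```yaml
grafana:
  enabled: true
  jsonData:
    database: "<admin-database-name>"   # required, otherwise the Query Builder shows no tables
    sslmode: "disable"                  # example of an additional data source option
```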

Elasticsearch

Elasticsearch is a search engine that is used to search through XDM configuration objects. XDM uses this service to index all configuration objects. If the service is enabled, a search field is displayed in the XDM UI. You can instantly search through XDM configuration objects and task execution logs.

elasticsearch:
    enabled: true
    password: xdm

enabled: Controls whether a separate pod is started for the Elasticsearch service.
password: Specifies the password for the built-in Elasticsearch user.

Prometheus

Prometheus is a data/metric collector service. It stores metric values over time that can later be visualized by the reporting service of XDM3. Prometheus collects metrics from the Java virtual machine (JVM) and from the Spring services that run the XDM3 core and dataflow server.

prometheus:
    enabled: true

enabled: Controls whether a separate pod is started for Prometheus.

Task execution platforms

XDM allows the definition of different execution platforms. A separate Kubernetes pod is started for each task execution. The system settings of the pod are based on the platform settings. XDM defines a default platform, whose settings can be customized as shown below. To add a new platform, add a new sub-entry to the deployer.platform setting.

The platform name must not start with a number or contain special characters.

deployer:
  platform:
    default:
      limits:
        cpu: 500m
        memory: 1024Mi
      requests:
        cpu: 500m
        memory: 1024Mi

The following example adds a new platform named example. Each task execution that runs on that platform can use up to 4 GiB of memory.

deployer:
  platform:
    example:
      limits:
        memory: 4096Mi
      requests:
        memory: 4096Mi
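
Several platforms can be defined side by side. The sketch below keeps the default platform and adds two more, named small and large. These names and all resource figures are made up for illustration. In Kubernetes, requests is the amount reserved for the pod when it is scheduled, and limits is the maximum it may consume.

```yaml
deployer:
  platform:
    default:
      limits:
        cpu: 500m
        memory: 1024Mi
      requests:
        cpu: 500m
        memory: 1024Mi
    small:                 # hypothetical platform for lightweight tasks
      limits:
        cpu: 250m
        memory: 512Mi
      requests:
        cpu: 250m
        memory: 512Mi
    large:                 # hypothetical platform for memory-intensive tasks
      limits:
        cpu: "2"
        memory: 8192Mi
      requests:
        cpu: "1"
        memory: 4096Mi
```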

Storage

Storage settings specify where to store iceboxes and task directories. XDM supports local storage and cloud storage. By default, XDM stores the files on the local file system. XDM also supports several cloud storage systems:

Amazon AWS S3 compatible storage
Microsoft Azure blob store
Google Cloud buckets

To enable a cloud storage you must specify one of the options described in the following sections.

Amazon AWS S3 storage

To store the data in Amazon AWS S3 compatible storage, the following settings are available. To enable this type of storage you must specify a value of s3 for the type attribute.

storage:
  type: "s3"
  s3:
    bucket: "xdm"
    region: "eu-central-1"
    pathStyle: "false"
    endpoint: "s3.eu-central-1.amazonaws.com"
    credentials:
      accessKey: "AKIAIOSFODNN7EXAMPLE"
      secretKey: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

bucket: An S3 bucket is an object container. Objects can’t exist without a bucket. For details about buckets refer to Amazon’s documentation: Bucket overview and naming rules for buckets.
region: Region specifies the geographic location where your data will be stored. See regions and endpoints in the AWS General Reference.
pathStyle: 'pathStyle' specifies whether to use the path-style or virtual-hosted-style access for the S3 API. The default value is 'false', which means that the virtual-hosted-style access will be used. See Accessing Amazon S3 using path-style or virtual-hosted-style URLs.
endpoint: Together with the previous two properties, this property specifies the S3 API endpoint for object manipulation/access. See regions and endpoints in the AWS General Reference.
accessKey: An accessKey is like a username and is used for authentication of requests. See Managing access keys for IAM users.
secretKey: secretKey is like a password and is used together with accessKey to authenticate requests. See Managing access keys for IAM users.
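
Because the settings target S3 compatible storage, they can also point to a store that is not operated by Amazon. The following sketch assumes a self-hosted S3 compatible service that is reachable under a hypothetical host name and that expects path-style access. The endpoint, the region value, and the credentials are placeholders.

```yaml
storage:
  type: "s3"
  s3:
    bucket: "xdm"
    region: "us-east-1"                      # many S3 compatible stores accept an arbitrary region
    pathStyle: "true"                        # path-style instead of virtual-hosted-style access
    endpoint: "s3.storage.example.com:9000"  # hypothetical endpoint
    credentials:
      accessKey: "<access-key>"
      secretKey: "<secret-key>"
```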

Microsoft Azure blob store

To store the data in a Microsoft Azure blob store, the following settings are available. To enable this type of storage you must specify a value of azure for the type attribute.

Permissions

XDM requires several permissions to access the Azure blob service. The required privileges and resources are listed below. They must be specified in the Shared Access Signature (SAS) token creation dialog:

Type
Allowed services	Blob
Allowed resource types	Container, Object
Allowed permission	Read, Write, Delete, Add, Create

XDM will store the data in blob objects stored in a container. It needs access to the container and object resource types.

storage:
  type: "azure"
  azure:
    containerName: "xdm"
    endpoint: "https://xdmtest1.blob.core.windows/..."
    tokenSecret: "my-secret"

containerName: The name of the Azure container where the blobs will be stored. XDM will create the container if it does not already exist.
endpoint: The SAS URL to access the blob service.
tokenSecret: As an alternative to a token, a Kubernetes secret can be used. The name of the secret can be chosen freely. The secret must contain a key named token that contains the SAS token.
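
The secret referenced by tokenSecret could be defined with a plain Kubernetes Secret manifest such as the following sketch. The name my-secret matches the example above, and <sas-token> stands for the token generated in the Azure portal.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-secret
type: Opaque
stringData:
  token: "<sas-token>"   # the key must be named token
```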

Google Cloud Bucket

To store the data in a Google Cloud Platform compatible storage, the following settings are available. To enable this type of storage you must specify a value of gcp for the type attribute.

storage:
  type: "gcp"
  gcp:
    bucketName: "sample-bucket"
    projectId: ""
    keySecret: "my-secret"

bucketName: The name of the bucket that stores the data. The bucket will be created if it does not exist, otherwise the existing bucket is used to store files in it.
projectId: The ID of the respective Google Cloud project.
keySecret: The name of a secret that contains the access key of the service account in JSON format. The content must be stored in the secret under the key data.json.
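
A sketch of a matching secret, assuming that data.json is the key name inside the secret. The value is a placeholder for the JSON key file that is downloaded for the service account.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-secret
type: Opaque
stringData:
  # Assumption: the chart reads the service account key from this key name
  data.json: '<service-account-key-json>'
```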

LLM configuration

XDM can use a Large Language Model (LLM) for different purposes. The LLM configuration is used to specify the LLM provider that should be used.

llm:
  spring.ai.openai.base-url: <base-url>
  spring.ai.openai.api-key: <api-key-secret>
  spring.ai.openai.organization-id: <organization-id>
  spring.ai.openai.project-id: <project-id>
  spring.ai.openai.chat.options.model: <model>

base-url: Optional override for the spring.ai.openai.base-url to provide a chat-specific base URL.
api-key-secret: Specifies the API key secret that is used to authenticate against the LLM provider. The secret must provide a key named data, that contains the API key.
organization-id: Optionally, you can specify which organization to use for an API request.
project-id: Optionally, you can specify which project to use for an API request.
model: Optionally, you can specify which model to use for an API request. By default, gpt-4o-mini is used.
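
A minimal sketch of an LLM configuration that only sets the API key secret and the model, leaving the optional entries out. The secret name xdm-llm-api-key is hypothetical.

```yaml
llm:
  spring.ai.openai.api-key: xdm-llm-api-key         # name of the secret, not the key itself
  spring.ai.openai.chat.options.model: gpt-4o-mini  # the documented default model
```

The referenced secret must provide the API key under a key named data, for example:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: xdm-llm-api-key
type: Opaque
stringData:
  data: "<api-key>"
```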