Installation on Kubernetes
XDM can be installed and executed on Kubernetes. XDM provides a Helm chart to easily configure and install XDM on Kubernetes. The following section describes how to get started with the XDM Helm chart and which options can be configured.
Prerequisites
Installing XDM on Kubernetes or OpenShift requires the following prerequisites:
-
A Kubernetes cluster.
-
A local installation of Helm version 3.
-
Helm must be configured to access the Kubernetes cluster.
-
A valid license key with the option "Cloud Installation" enabled. See Licensing for more information on how to obtain a license key.
Add Helm repository
Before you can use helm install, you need to add the XDM chart repository to your current Helm installation.
The following command adds the XDM repository:
helm repo add ubs https://www.ubs-hainer.com/downloads/XDM3/helm
Update the Helm chart
Each version of XDM will provide a separate version of the Helm chart. To update the Helm chart to the latest version, enter the following command:
helm repo update
This will download the latest version of the ubs chart repository.
Install chart
The following command installs the XDM Helm chart to the configured Kubernetes cluster:
helm install [NAME] ubs/xdm -f [values] --version [RELEASE] --namespace [NAMESPACE] --create-namespace
- NAME
-
Specify the name of the installation. This name can be used in further helm commands to execute commands on the specific installation.
- VALUES
-
Specify the path to the values file for the Helm chart. For more information on the values.yaml file, see Chart configuration.
- RELEASE
-
Specify the XDM release that should be installed. If the --version parameter is omitted, the latest stable XDM version will be installed. A valid release version looks like: 3.22.31.
- NAMESPACE
-
Specify the Kubernetes namespace of the XDM installation. The Helm chart will store all deployments, pods, services, etc. in this namespace. If the command line argument --create-namespace is omitted, the namespace must already exist.
Use the following command to get a list of available XDM versions:
helm search repo xdm -l
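As an illustration, the repository and installation steps described above can be wrapped in a small shell function. This is a sketch rather than an official script: the function name install_xdm, its argument order, and the example values (xdm-test, values.yaml, the namespace xdm) are assumptions made for this example.

```shell
#!/bin/sh
# Sketch: add the XDM chart repository, refresh it, and install a pinned release.
# install_xdm is a hypothetical helper; the helm commands are the ones shown above.
install_xdm() {
  name=$1        # name of the installation ([NAME])
  values=$2      # path to the values file ([values])
  release=$3     # XDM release ([RELEASE]), for example 3.22.31
  namespace=$4   # Kubernetes namespace ([NAMESPACE])

  helm repo add ubs https://www.ubs-hainer.com/downloads/XDM3/helm &&
  helm repo update &&
  helm install "$name" ubs/xdm -f "$values" --version "$release" \
    --namespace "$namespace" --create-namespace
}

# Example call (all values are placeholders):
#   install_xdm xdm-test values.yaml 3.22.31 xdm
```

Pinning the release with --version keeps repeated installations reproducible; without it, Helm picks the latest stable chart version.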
OpenShift specific customization
In OpenShift, Docker containers are not allowed to run as the root user. Therefore, you have to customize your images to grant access to the internal directories and to use ports > 1024, which can be bound by a non-root user.
How to build a Docker image using a Dockerfile
Create a project directory
Make a new folder for your project and navigate to it:
mkdir my-docker-project
cd my-docker-project
Create the Dockerfile
Use one of the Dockerfiles that can be found in the installation package xdm3-installation.zip in the directory dockerfiles. For example, this is the Dockerfile-ui:
# Base image
FROM docker.ubs-hainer.com/xdm3-ui:latest
ENV http_port 8080
ENV https_port 8443
ARG UID=1000
ARG GID=1000
RUN addgroup -g ${GID} xdm &&\
adduser -S -s /bin/bash -G xdm -h /home/xdm -u ${UID} xdm &&\
chown ${UID}:${GID} /etc/nginx/conf.d/default.tmpl &&\
chown ${UID}:${GID} /etc/nginx/nginx.conf &&\
chmod 766 /etc/nginx/nginx.conf &&\
chown -R ${UID}:${GID} /usr/share/nginx/html/ &&\
mkdir -p /var/tmp/nginx/ &&\
chown -R ${UID}:${GID} /var/tmp/nginx/ &&\
mkdir -p /var/log/nginx/ &&\
chown -R ${UID}:${GID} /var/log/nginx/ &&\
chown -R ${UID}:${GID} /run/nginx/ &&\
chown -R ${UID}:${GID} /var/lib/nginx &&\
chown ${UID}:${GID} /xdm/certificates/xdm.key &&\
chown ${UID}:${GID} /xdm/certificates/xdm.key &&\
chown -R ${UID}:${GID} /etc/nginx &&\
chmod -R 755 /var/lib/nginx/ &&\
sed -i '1d' /etc/nginx/nginx.conf
# Run image as xdm user
USER xdm
You should customize these parameters:
| Parameter | Description |
|---|---|
| FROM | Specifies the name of the XDM image that is used as the base. The value latest uses the latest available image. |
| http_port | Specifies the port that is used for HTTP requests. XDM’s user interface is bound to this port. The operating system in the container only allows a non-root user to bind ports > 1024. |
| https_port | Specifies the port that is used for HTTPS requests. XDM’s user interface is bound to this port. The operating system in the container only allows a non-root user to bind ports > 1024. |
| UID | Specifies the ID of the user that should be used in the container. All required permissions are granted to this user. |
| GID | Specifies the ID of the group that should be used in the container. All required permissions are granted to this group. |
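The build step itself is not shown in this section. A minimal sketch of it follows, assuming the file is named Dockerfile-ui and lies in the current directory; the function name build_xdm_ui_image and the image tag in the example call are hypothetical.

```shell
#!/bin/sh
# Sketch: build a customized UI image for a non-root user and group ID.
build_xdm_ui_image() {
  tag=$1   # target image name, e.g. registry.example.com/xdm3-ui:custom (hypothetical)
  uid=$2   # value for the UID build argument
  gid=$3   # value for the GID build argument

  docker build -f Dockerfile-ui \
    --build-arg UID="$uid" --build-arg GID="$gid" \
    -t "$tag" .
}

# Example call:
#   build_xdm_ui_image registry.example.com/xdm3-ui:custom 1001 1001
```

UID and GID are declared with ARG and can therefore be passed on the command line. The ports are declared with ENV, so changing them means editing http_port and https_port in the Dockerfile.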
Chart configuration
The XDM chart has a couple of settings that can be customized with a values file. Some of these settings, like a license key, must be specified, whereas some others are optional.
To configure the Helm chart for your local XDM installation, create an empty .yaml file.
This YAML file must contain all the settings that need to be customized.
General settings
repository: <repository-url>
imagePullSecrets: <pull-secrets>
defaultServiceAccount: <service-account>
xdm:
  version: <release>
  timezone: <timezone>
  environment:
    id: <environment-id>
    name: <environment-name>
    color: <environment-color>
    contextPath: <environment-context-path>
  certificates:
    - <certificate-secret-name-1>
    - <certificate-secret-name-2>
    - ...
- repository
-
Use this option if you want Kubernetes to pull the images from a different repository.
- pull-secrets
-
A Kubernetes secret that specifies the credentials used to pull the images from the Docker repository docker.ubs-hainer.com. The credentials are provided by UBS Hainer during the installation of XDM.
- defaultServiceAccount
-
The name of the service account that is used by the XDM pods. This option allows the use of a default service account for all pods. There is no default for the service account. If the parameter is omitted, the default behavior of your cloud technology will apply.
- release
-
The release version of XDM.
- timezone
-
This property specifies the timezone of the XDM docker containers. The default is Europe/Berlin. If the XDM installation should use a different time zone, specify one of the following values:
| Name | Description |
|---|---|
| UTC | Coordinated Universal Time |
| US/Pacific | United States Pacific Time (UTC-08:00) |
| US/Mountain | United States Mountain Time (UTC-07:00) |
| US/Central | United States Central Time (UTC-06:00) |
| US/Eastern | United States Eastern Time (UTC-05:00) |
| Europe/London | Western European Time (UTC+00:00) |
| Europe/Berlin | Central European Time (UTC+01:00) |
| Europe/Vilnius | Eastern European Time (UTC+02:00) |
| Asia/Tel_Aviv | Israel Standard Time (UTC+02:00) |
| Asia/Tokyo | Japan Standard Time (UTC+09:00) |
- environment-id
-
Set the id for the XDM installation. When using CasC, the environment-id differentiates between different XDM installations, for example for a production and test environment.
- environment-name
-
Sets the environment name for the XDM installation. Each XDM installation can have a specific environment name that is displayed in the UI. This makes it easier to differentiate between different XDM installations, for example for a production and test environment.
- environment-color
-
Set the primary color of the XDM UI. It is recommended to use a different coloring for the different XDM installations. This setting applies to both the classic XDM UI and the Prime UI.
For the classic XDM UI, the following predefined colors are available:
-
violet
-
red
-
blue
-
gray
-
green
For the Prime UI, the following predefined colors are available:
-
red
-
emerald
-
green
-
lime
-
orange
-
amber
-
yellow
-
teal
-
cyan
-
sky
-
blue
-
indigo
-
violet
-
purple
-
fuchsia
-
pink
-
rose
-
ubs
-
noir
Alternatively, you can use a valid CSS color definition. In the Prime UI, RGB and hex code color definitions are supported for custom colors. See Color palette and Color syntax for more information. Please consider that the text for the menu entries is always white, so the color should be chosen accordingly. For color names from CSS that are equal to the predefined colors, the predefined colors will be used.
In the classic XDM UI, the secondary column color (used for the background of the side menu footer) can be set using CSS color definitions with the variable secondary_environment_color. The default secondary color is black.
In the Prime UI, secondary_environment_color is not supported. The side menu footer background cannot be configured separately.
- environment-context-path
-
Depending on the environment in which XDM is running, it might be desirable to configure a context path under which the XDM user interface can be accessed.
The context path must start with a forward slash but must not end with a forward slash. It is also possible to configure multiple path elements.
If XDM is installed on the machine testdataserver, uses SSL encryption with the default port, and the context path is set to /xdm, then the user interface can be accessed under the following URL:
https://testdataserver/xdm/
| Accessing the user interface will only work if a forward slash is present at the end of the URL. |
- certificate-secret-name
-
Name of a secret which contains certificates to import. Each certificate is contained inside the secret as a separate key.
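The context path rules described above (leading forward slash, no trailing forward slash, several path elements allowed) can be checked before a values file is deployed. The following helper is only a sketch: the function name is hypothetical, and XDM may apply further rules that are not described here.

```shell
#!/bin/sh
# Sketch: check a context path against the two rules stated in this section.
is_valid_context_path() {
  case $1 in
    */) return 1 ;;   # must not end with a forward slash (this also rejects "/")
    /*) return 0 ;;   # must start with a forward slash
    *)  return 1 ;;   # everything else, including the empty string
  esac
}

# Example:
#   is_valid_context_path /tools/xdm && echo "context path accepted"
```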
Task execution
execution:
  cleanupCron: <cleanup-cron>
  concurrent: <concurrent-executions>
  retentionPeriod: <retention-period>
- cleanup-cron
-
The executionRetentionPeriod setting defines a period of time that controls how long an execution is kept before it is automatically deleted. This setting can be configured globally, at template level, or for specific tasks/workflows.
The cron expression controls the execution of the process that cleans up the expired executions.
It specifies what time the process that deletes old executions runs.
All executions older than the specified executionRetentionPeriod will be deleted by this process.
By default, the cleanup process runs every day at midnight. The default cron expression is 0 0 0 * * ?.
- concurrent-executions
-
XDM allows parallel execution of tasks, hence several processes run on the dataflow server at the same time. Since the computing capacity of the dataflow server is not unlimited, the maximum number of parallel running tasks should be restricted.
Specify the parameter to limit the number of concurrent running tasks. It controls how many tasks can run simultaneously. As soon as more task executions are triggered than the maximum allowed, the surplus tasks are collected in a queue and processed sequentially as capacity allows.
This parameter must be specified in the docker-compose.yml file. It must be set in the environment section of the core-server.
The default value for the parameter is 5. A value of 0 indicates that no task executions are queued. In this case, each task execution is executed immediately.
| Due to technical reasons, the maximum number of concurrent tasks in the dataflow server is 20. Therefore, the maximum value for this parameter is 20. A higher value has no effect on the number of parallel executed tasks and can lead to dataflow aborts. |
- retention-period
-
Specify that task and workflow executions should be deleted after the specified period of time. If an execution is older than the specified retention period, the XDM service will delete the task execution including the working directories and log files of that execution. The value must start with P for a date period or with PT for a time period. It is based on the ISO-8601 duration format.
Examples:
-
P2D: Period of 2 days.
-
PT15M: Period of 15 minutes.
-
PT1H10M: Period of 1 hour and 10 minutes.
XDM name: executionRetentionPeriod
Type: String
Default value: -1 (no executions are deleted)
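A retention period can be sanity-checked before it is written to the values file. The helper below is a sketch with a hypothetical name. It accepts a simplified subset of the ISO-8601 duration format (whole numbers only, no sign) plus the documented default -1; the parser used by XDM may accept or reject other forms.

```shell
#!/bin/sh
# Sketch: accept values such as P2D, PT15M, PT1H10M, or the default -1.
is_valid_retention_period() {
  case $1 in
    -1)   return 0 ;;   # documented default: executions are never deleted
    P|*T) return 1 ;;   # a bare "P", or a "T" that is not followed by a time component
  esac
  # Optional date components, then an optional time part introduced by "T".
  printf '%s\n' "$1" |
    grep -Eq '^P([0-9]+Y)?([0-9]+M)?([0-9]+W)?([0-9]+D)?(T([0-9]+H)?([0-9]+M)?([0-9]+S)?)?$'
}

# Example:
#   is_valid_retention_period PT1H10M && echo "retention period accepted"
```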
JDBC driver
XDM uses the JDBC API to access the source and target databases of your tasks. You must provide a suitable JDBC driver (version 4.0 or higher) for all database systems that you plan to use as the source or target for XDM tasks.
You can execute wget commands to download the required JDBC drivers.
The Helm Chart will make sure that the JDBC drivers are mounted to their respective pods.
The driver files must be stored in the directory /xdm/config/jdbc-drivers.
You can use the wget option -P to specify the target directory, in which the files will be stored.
xdm:
  jdbcDriversCommand: |-
    wget -P /xdm/config/jdbc-drivers/ -v --no-check-certificate https://jdbc.postgresql.org/download/postgresql-42.3.3.jar
License
XDM needs a valid license key, which is checked at each startup. The content of the license key must be specified as follows:
xdm:
  license: |
    LICENSE = 26,A10DD76D280184CC0FE65B06170AB838
    LIC0001 = COMPANY:UBS_HAINER_GMBH
    LIC0002 = PRODUCT:XDM3
    LIC0003 = CPUID:9999
    ....
  licenseSecret: <secret-name>
- licenseSecret
-
As an alternative to storing the license as plaintext, a Kubernetes secret can be used. The name of the secret can be chosen freely, whereas the key name in which the license data is stored has to be 'license.txt'. The content of the secret should be base64 encoded to keep the line breaks in the license key.
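One possible way to prepare such a secret is sketched below. The function prints a Secret manifest whose license.txt key holds the base64-encoded content of a license file, so the line breaks survive unchanged. The function name, the secret name xdm-license, and the file name license.txt in the example call are assumptions made for this example.

```shell
#!/bin/sh
# Sketch: print a Secret manifest that stores a license file under the key license.txt.
make_license_secret() {
  secret_name=$1    # freely chosen secret name
  license_file=$2   # path to the license key file

  # Encode the whole file; remove the line wrapping that base64 adds to its output.
  encoded=$(base64 < "$license_file" | tr -d '\n')

  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: ${secret_name}
type: Opaque
data:
  license.txt: ${encoded}
EOF
}

# Example call (namespace is a placeholder):
#   make_license_secret xdm-license ./license.txt | kubectl apply -n <namespace> -f -
```

The secret name passed to the function is the value that would then be set as licenseSecret.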
Configuration as Code
XDM can be configured to use a Git repository as source. The repository is used to populate the objects of the XDM installation. XDM continuously monitors the Git repository and automatically applies any changes on the Git files to the XDM installation. In this case XDM objects can not only be created via manual clicks in the UI, but also via the code in Git.
Git configurations are written in YAML files. These can be written manually or generated by an export via the UI. The structure of
the file tree, within the Git repository, is up to the user. XDM will parse all YAML files recursively over all directories. The user
can create a single YAML file for each object, or combine objects within one file. The dependencies between the YAML files will be
resolved by XDM before the files are applied to the XDM installation.
CasC enables the operators of an XDM installation to automatically set up a defined set of objects. Manual synchronization of several XDM installations is not necessary. Test and production instances are kept in a consistent state through the Git configuration.
| XDM will only track changes to YAML files. If other files are committed to Git, XDM will not process them. If scripts, e.g. for workflows, hooks, environments, etc. are stored in separate files and are not part of the YAML file, you need to make an additional change to the lastChangedDate field in the corresponding YAML file. Otherwise, XDM will not take this change into account. |
The following snippet shows how to enable and configure configuration as code:
casc:
  enabled: true
  url: <url>
  user: <user>
  password: <password>
  secret: <secret-name>
  cron: <cron>
  path: <path>
  branch: <branch>
  owner: <owner>
- url
-
Specifies the URL of the Git repository. This repository must be accessible via HTTP or HTTPS.
- user
-
The user ID used to clone the Git repository. This setting is not required if the git clone command does not require authentication.
- password
-
The password of the Git user. This setting is not required if no user is required or if the user does not need a password for authentication.
- secret
-
If Git authentication is required, a Kubernetes secret can be used as an alternative to storing the credentials as plaintext. The name of the secret can be chosen freely. The secret must be of type kubernetes.io/basic-auth, or it must contain the keys 'username' and 'password'. If the user does not require a password for authentication, the 'password' key can be empty but must exist.
- cron
-
A cron expression controls the interval at which the Git repository should be checked for changes. More details of the cron syntax can be found under Scheduling.
- path
-
Specifies the directory path in the Git repository that should be monitored. This setting is optional. If the setting is not specified, XDM will monitor and apply all changes made in the Git repository. If a path is specified, XDM will only monitor and apply changes made in the specified directory.
- branch
-
Specifies the branch that should be monitored. XDM will only apply changes that are committed to that specific branch. By default, this is set to master.
- owner
-
Specifies an XDM user that will be the owner of all created objects in the XDM installation. These include all objects that are created by the Configuration as Code process. The specified username must exist in the used authentication provider. For example, if the XDM installation uses LDAP as authentication provider, the user must exist in the LDAP system.
| This feature cannot be used with an Open ID based authentication provider like KeyCloak. These systems do not allow retrieval of the group membership for a specific user. The roles are required to check whether the respective user is allowed to create or overwrite objects in the XDM installation. |
The specified user needs list access permissions and the permission to create all types of objects.
For more information on how to create a list access permission, see the section list access permissions.
| If the created objects should be shared with other users, appropriate permission must be granted to the set of users or role. The permission(s) can be specified in the YAML file of the respective object. |
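For the secret option, a secret of type kubernetes.io/basic-auth with the keys username and password is needed. The following sketch prints such a manifest; the function name and the example values (xdm-git-credentials, git-user) are hypothetical.

```shell
#!/bin/sh
# Sketch: print a basic-auth Secret manifest for the Git credentials.
make_basic_auth_secret() {
  secret_name=$1
  # Values in the "data" section of a Secret must be base64 encoded.
  username_b64=$(printf '%s' "$2" | base64 | tr -d '\n')
  password_b64=$(printf '%s' "$3" | base64 | tr -d '\n')

  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: ${secret_name}
type: kubernetes.io/basic-auth
data:
  username: ${username_b64}
  password: ${password_b64}
EOF
}

# Example call (namespace is a placeholder):
#   make_basic_auth_secret xdm-git-credentials git-user 'my-password' | kubectl apply -n <namespace> -f -
```

The same kind of secret also fits other settings in this chart that expect the keys 'username' and 'password'.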
Database
XDM uses a PostgreSQL database to store information about its configuration. This includes the defined tasks, the history of task executions, your environment configuration, your user configuration, the cached object-containers, etc.
database:
  extern: false
  url: <url>
  user: <user>
  password: <password>
  secret: <secret-name>
  schema: <schema>
  upgrade: <upgrade>
  sample: <sample>
- extern
-
Controls if an internal Pod with a PostgreSQL database is started, or if an external database is used. If an external database is used, XDM will not start its own PostgreSQL database.
- url
-
Specifies the JDBC URL of the PostgreSQL database that you want to use as administration database.
This option is only required if an external database is used. If the internal database is used, the URL is set to the correct value.
- user
-
Specifies the username that is used to access the database.
- password
-
Specifies the password that is used to access the database.
- secret
-
As an alternative to storing the database credentials as plaintext, a Kubernetes secret can be used. The name of the secret can be chosen freely. The secret must be of type kubernetes.io/basic-auth, or it must contain the keys 'username' and 'password'.
- schema
-
Specifies the schema name. This is set to public by default.
- hostname
-
Specifies the host name of the PostgreSQL database that you want to use as administration database. This option is only required if an external database is used.
This is only required if the Grafana service is used.
- port
-
Specifies the port of the PostgreSQL database that you want to use as administration database. This option is only required if an external database is used.
This is only required if the Grafana service is used.
- name
-
Specifies the database name of the PostgreSQL database that you want to use as administration database. This option is only required if an external database is used.
This is only required if the Grafana service is used.
- upgrade
-
XDM requires tables and sequences in the specified database. This option controls if the database objects are created automatically by XDM, or if the user needs to create them manually. If the value is set to automatic, XDM will create these objects automatically. The specified user ID must have the permissions to create tables and sequences inside the database.
- sample
-
Controls if an additional PostgreSQL database is started that contains sample data. This sample database is used in the XDM tutorials. By default, the sample database is not started.
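As a sketch of how the settings for an external database fit together, the following function prints a database fragment for a values file. It uses the general PostgreSQL JDBC URL form jdbc:postgresql://host:port/database. The function name and the example host, database, and secret names are hypothetical.

```shell
#!/bin/sh
# Sketch: print a "database" values fragment for an external PostgreSQL server.
external_database_values() {
  host=$1     # host name of the PostgreSQL server
  port=$2     # port of the PostgreSQL server
  name=$3     # database name
  secret=$4   # name of the secret that holds the keys 'username' and 'password'

  cat <<EOF
database:
  extern: true
  url: jdbc:postgresql://${host}:${port}/${name}
  secret: ${secret}
EOF
}

# Example call:
#   external_database_values db.example.com 5432 xdm xdm-db-credentials >> values.yaml
```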
Security
security:
  requiredUserRole: <required-user-role>
  requiredUserRoleMessage: <required-user-role-message>
  adminRole: <admin-role>
  adminDefaultPermissions: <admin-default-permissions>
  purchaserRole: <purchaser-role>
  purchaserDefaultPermissions: <purchaser-default-permissions>
  systemObjectsRole: <system-objects-role>
  passwordSeed: <password-seed>
  secret: <secret-name>
  jwt:
    expireTime: <jwt-expire-time>
    validTime: <jwt-valid-time>
    secret: <jwt-secret>
- required-user-role
-
Specifies a role to which the usage of XDM will be limited. This property enables the administrator to define a list of roles, which will restrict the access to XDM. In order to use XDM, an authorized user has to be a member of one of these roles. By default, all authorized users are able to use XDM.
- required-user-role-message
-
This setting controls the message that is displayed if a user who is not a member of any of the roles specified by the property xdm.required-user-role tries to log in. By default, the following message will be displayed: Missing one of the following roles: <roles set in xdm.required-user-role>
- admin-role
-
Sets the name of the administrator role. The default name of the administrator role is ADMIN. To specify multiple administrator roles, use a comma-separated list. This can be set to conform to your naming conventions. Every user who is in one or more of the roles in this list will receive administrative privileges in XDM.
- admin-default-permissions
-
Users that have administrative privileges in XDM can read and create all XDM objects and can grant privileges to other non-administrative users. The privileges of the administrative users can be customized with this property. One or more entries of the following options can be specified:
| Privilege | Description |
|---|---|
| | Allows administrators to see objects in lists, to see details about objects, and to request a data shop order. |
| | Allows administrators to modify attributes of an object. |
| | Allows administrators to delete an object. |
| | Allows administrators to create new objects in a list. |
| | Allows administrators to execute or schedule a task or workflow template, and to place a data shop order. |
| | Allows administrators to grant permissions for an object to other users. |
| | Allows an environment or a connection to be used as the source of a task. |
| | Allows an environment or a connection to be used as the target of a task. |
| | For connections only. Allows administrators to see the contents of tables in the schema browser, and in the output of XDM tasks that provide a data preview. |
| | For connections only. Allows administrators to execute SQL statements for tables in the schema browser, and in the output of XDM tasks that provide a data preview. |
| | For task templates, tasks, workflow templates and workflows only. Allows users to see diagnostic data like stage outputs or batch reports. |
| | For credentials only. Allows the usage of this credential in a task stage hook. |
| | For data reservation only. Allows modification of a data reservation. |
- purchaser-role
-
Specifies the role whose members will be treated as data shop users. These users don’t need full access to all functions of XDM and will receive a customized UI with which they can more easily order test data and see the results of their orders. To specify multiple purchaser roles, use a comma-separated list.
- purchaser-default-permissions
-
The permissions of the purchaser users can be adjusted with this property. The default setting of the property is READ and DELETE. This property is applied when a user from the purchaser role requests a data shop. The permissions set in the property are applied to the resulting execution. One or more entries of the following options can be separated by comma:
| Privilege | Description |
|---|---|
| | Allows purchasers to see objects in lists, to see details about objects, and to request a data shop order. |
| | Allows purchasers to modify attributes of an object. |
| | Allows purchasers to delete an object. |
| | Allows purchasers to grant permissions for an object to other users. |
| | For connections only. Allows purchasers to see the contents of tables in the schema browser, and in the output of XDM tasks that provide a data preview. |
| | For task templates, tasks, workflow templates and workflows only. Allows purchasers to see diagnostic data like stage outputs or batch reports. |
- system-objects-role
-
Specifies a role, that will be able to see and use the pre-defined matchers, comparable fields, and modification methods, that XDM ships with. By default, access to these pre-defined entities is restricted to the system, and only users with administrative permissions are able to see, use and change them.
Users that are a member of the specified role will be able to see and use the pre-defined matchers, comparable fields, and modification methods, however they will not be able to change them. These objects can only be changed by users with administrative permissions.
- password-seed
-
Specifies a seed that is used to encrypt passwords.
XDM stores users and passwords. These credentials are used during a task execution to authorize against source and target database systems.
By default, the passwords are not encrypted, but rather stored as plain text in the XDM admin database.
Once the seed has been specified, XDM will encrypt all existing passwords stored in the database after a restart of the XDM core service. All newly created passwords will be encrypted automatically.
| If the seed is changed, all previously stored passwords become invalid. |
- secret
-
As an alternative to storing the password-seed as plaintext a Kubernetes secret can be used. The name of the secret can be chosen freely but must contain the key 'passwordSeed'.
- jwt-expire-time
-
Specifies how many minutes the JWT token is valid. This token is generated by XDM after a user has logged in. By default, the expiration time is 120 minutes.
- jwt-valid-time
-
Specifies the expiration time of the execution token. This is used to authenticate executions between core and dataflow. The default expiration time of this JWT is 1440 minutes.
- jwt-secret
-
Specifies the secret, with which the JWT token is generated and encrypted.
Security context
A securityContext in Kubernetes defines security and access control settings for a pod or container.
These settings can be used to control how the container runs, such as whether it runs as a root user, what user and group memberships it has, and what file system permissions it has.
The following represents a list of the most significant fields that may be used in a securityContext:
- runAsUser
-
Specify the user ID under which the container is running.
- runAsGroup
-
Specify the group ID under which the container is executed.
- runAsNonRoot
-
Forces the container not to be executed as the root user.
- fsGroup
-
Specify the group ID used for all files in the container’s file system.
<service>:
  securityContext:
    runAsUser: 1002
    runAsGroup: 1000
    runAsNonRoot: true
    fsGroup: 2000
- service
-
The name of the service. Possible services are:
-
core
-
dataflow
-
file_sink
-
elasticsearch
-
generator_source
-
grafana
-
graph_store
-
loki
-
modification_processor
-
postgres
-
prometheus
-
sample
-
tempo
-
ui
-
webservice_apply_sink
-
webservice_extract_source
-
ai_assistance
Service account
In a Kubernetes environment, the service account section refers to the configuration of service accounts for a range of services within a Kubernetes cluster. A service account in Kubernetes is a specific type of account that pods use to facilitate communication with the Kubernetes API server.
In addition to the global defaultServiceAccount setting which applies to all containers / pods, the serviceAccount can be overridden for individual containers.
To specify a serviceAccount per container, use the following syntax:
<service>:
  serviceAccount: "xdm-<service>-account"
- service
-
The name of the service for which the service account is configured. Possible services are:
-
core
-
dataflow
-
file_sink
-
elasticsearch
-
generator_source
-
grafana
-
graph_store
-
loki
-
modification_processor
-
postgres
-
prometheus
-
sample
-
tempo
-
ui
-
webservice_apply_sink
-
webservice_extract_source
-
ai_assistance
User management
Each user who intends to work with XDM needs to have a username and a password. Authentication can be performed by an LDAP server, by an OpenID provider, or by the internal user management of XDM.
User name
LDAP and OpenID authenticators provide additional user information upon successful login. This information can be used to adapt the username to a more meaningful value. By default, the 'givenName' property is used.
The following line will change the displayed username to the user’s family name:
userManagement:
  fullNameProperty: family_name
Local
XDM offers a built-in user management system that allows you to maintain the usernames, passwords, and roles in a plain text file. Each line represents a separate user and must have the following format:
<user_name>;{<hash_method>}<password>;<roles>;<full_name>;<email>
- user_name
-
Specifies the name of the user.
- hash_method
-
Specifies the hash method for the password. This must be one of the following values:
-
argon2 - Argon2 password hash
-
bcrypt - BCrypt password hash
-
ldap - LDAP SHA password hash
-
MD4 - MD4 password hash
-
MD5 - MD5 password hash
-
noop - Plain text
-
SHA-1 - SHA-1 password hash
-
SHA-256 - SHA-256 password hash
- password
-
Specifies the hash sum of the password of the user. The password must be hashed with the previously specified hash method.
- roles
-
Specifies a list of roles for that user. These roles can be used later while granting permissions.
- full_name
-
The full name of the user. This property is used to identify the user in the graphical user interface. The full name is displayed in the user settings and is used to synchronize the display name of a permission recipient.
This field is optional, but required if the e-mail is to be specified.
- email
-
The e-mail address of the user. The e-mail address can be accessed in JavaScript and Groovy scripts.
This field is optional, but required if the full name of the user is to be specified.
The following snippet shows how to specify the users in the Helm chart:
userManagement:
  local: |
    # User name ; Password ; Roles
    admin;{noop}default;ADMIN
    user;{noop}default;USER
    tech;{noop}default;ADMIN
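Because full_name and email each require the other, a user line has either three or five fields. A small check along these lines can catch malformed entries before deployment. It is a sketch only: the function name is hypothetical, it expects the user lines in a separate file, and it does not verify the hash itself.

```shell
#!/bin/sh
# Sketch: check that every user line has 3 or 5 semicolon-separated fields
# and that the password field has the form {hash_method}value.
check_user_lines() {
  awk -F';' '
    /^[ \t]*#/ || /^[ \t]*$/ { next }   # skip comments and blank lines
    NF != 3 && NF != 5 {
      printf "line %d: expected 3 or 5 fields, found %d\n", NR, NF
      bad = 1
      next
    }
    $2 !~ /^[{][^}]+[}]./ {
      printf "line %d: password must have the form {hash_method}value\n", NR
      bad = 1
    }
    END { exit bad + 0 }
  ' "$1"
}

# Example:
#   check_user_lines users.txt && echo "user file looks well-formed"
```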
LDAP
XDM can use an LDAP server to authenticate users. The LDAP server controls which users exist and how they have to authenticate. The LDAP server also controls what roles a user is a member of.
userManagement:
  ldap:
    enabled: true
    url: <ldap-server-url>
    searchFilter: <search-filter>
    searchBase: <search-base>
    group:
      searchBase: <group-search-base>
      searchFilter: <group-search-filter>
    manager:
      user: <manager-user>
      password: <manager-password>
      secret: <secret-name>
- ldap-server-url
-
Set <hostname> and <port> to the host name and port of your LDAP server. Set <base_db> to the distinguished name of the entry that is the starting point of the search. You can use the ldaps protocol instead of ldap. Example: ldap://ldap.mycompany.com:389/dc=mycompany,dc=com
- search-filter
-
The search filter for the users. Example: uid={0}
- search-base
-
The LDAP directory from which each search will start. If left empty, searches can take longer. Typically, cn=Users is a good value for search_base.
- group-search-base
-
Defines the part of the directory tree, under which group searches should be performed.
- group-search-filter
-
The filter that is used to search for group membership. The default is uniqueMember={0}.
- manager-user
-
The username for the LDAP server, if it requires a login.
- manager-password
-
The password for the LDAP server, if it requires a login.
- manager-secret
-
If the LDAP server requires a login, a Kubernetes secret can be used as an alternative to storing the login credentials as plaintext. The name of the secret can be chosen freely but must contain the keys 'username' and 'password'.
OAuth2
XDM supports user authentication with an external OpenID Connect provider. XDM forwards login requests to the configured OpenID Connect provider, and the user logs in at that system. After a successful login, the OpenID Connect provider redirects back to XDM. The options described in this section configure this integration.
As a prerequisite, a client for XDM must be defined and configured in the OpenID Connect provider. This client is used for the authentication within XDM.
In parallel with OpenID Connect authentication, the internal user management or LDAP authentication can also be used. If only OpenID Connect authentication should be used, you can deactivate the internal user management by setting its property file.user to an empty value.
userManagement:
  oauth2:
    registration:
      <name>:
        client-id: <client-id>
        client-secret: <client-secret>
        client-name: <client-name>
        client-authentication-method: <client-authentication-method>
        authorization-grant-type: <authorization-grant-type>
        redirect-uri: <redirect-uri>
        scope: <scope>
    provider:
      <name>:
        authorization-uri: <authorization-uri>
        token-uri: <token-uri>
        jwk-set-uri: <jwk-set-uri>
        user-info-uri: <user-info-uri>
        user-info-authentication-method: <user-info-authentication-method>
        userNameAttribute: <userNameAttribute>
        issuer-uri: <issuer-uri>
- name
-
Specifies the name of the provider, e.g. keycloak, okta, google, etc.
- client-id
-
The ID that uniquely identifies the client.
- client-secret
-
The client-specific secret. If no secret is specified, the provider is assumed to be a public OpenID Connect provider.
- client-name
-
A descriptive name used for the client. The name is displayed on the login page.
- client-authentication-method
-
The authentication method used when authenticating the client with the authorization server. Valid values are:
-
client_secret_basic
-
client_secret_jwt
-
client_secret_post
-
none
-
private_key_jwt
-
- authorization-grant-type
-
Authorization grant type specifies the method that the client uses to obtain an access token. Valid values are:
-
authorization_code
-
client_credentials
-
jwt_bearer
-
password
-
refresh_token
-
- redirect-uri
-
The client’s registered redirect URI that the authorization server redirects the end-user’s user-agent to, after the end-user has authenticated and authorized access to the client. Typically, this is set to <base-url>/api/login/oauth2/code/<provider-name> where <base-url> is the base URL of your XDM installation.
- scope
-
Sets the scope used for the client. Must be openid for the OpenID Connect protocol.
- authorization-uri
-
The URI for the authorization endpoint.
- token-uri
-
The URI for the token endpoint.
- jwk-set-uri
-
The URI for the JSON Web Key (JWK) Set endpoint.
- user-info-uri
-
The URI for the user info endpoint which provides additional details about the user.
- user-info-authentication-method
-
The authentication method used when sending bearer access tokens in resource requests to resource servers. Valid values are:
-
FORM
-
HEADER
-
QUERY
-
- userNameAttribute
-
The name of the attribute returned by the ID-token or by the UserInfo response that references the name or identifier of the end-user. Common values are: name, preferred_username, given_name or family_name.
- issuer-uri
-
The URI used to initially configure a ClientRegistration using discovery of an OpenID Connect provider’s configuration endpoint.
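To illustrate how the options fit together, here is a sketch of a filled-in configuration for a Keycloak provider. The host names keycloak.example.com and xdm.example.com, the realm xdm, the client ID and the client secret are hypothetical. The endpoint paths follow the layout used by recent Keycloak versions and should be checked against your provider. The nesting of the keys is assumed from the option list above.

```yaml
userManagement:
  oauth2:
    registration:
      keycloak:                          # provider name, reused in redirect-uri
        client-id: xdm                   # hypothetical client defined in Keycloak
        client-secret: "change-me"       # placeholder
        client-name: "Company login"     # shown on the login page
        client-authentication-method: client_secret_basic
        authorization-grant-type: authorization_code
        redirect-uri: https://xdm.example.com/api/login/oauth2/code/keycloak
        scope: openid
    provider:
      keycloak:
        authorization-uri: https://keycloak.example.com/realms/xdm/protocol/openid-connect/auth
        token-uri: https://keycloak.example.com/realms/xdm/protocol/openid-connect/token
        jwk-set-uri: https://keycloak.example.com/realms/xdm/protocol/openid-connect/certs
        user-info-uri: https://keycloak.example.com/realms/xdm/protocol/openid-connect/userinfo
        user-info-authentication-method: HEADER
        userNameAttribute: preferred_username
        issuer-uri: https://keycloak.example.com/realms/xdm
```

The same name (keycloak here) is used under registration and provider so that the client registration is matched to its provider settings.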
Persistence
persistence:
  <volume>:
    storageClass: <storage-class>
    annotations: <annotations>
    labels: <labels>
    accessMode: <access-mode>
    size: <size>
    existingClaim: <existing-claim>
- volume
-
The name of the volume. Possible volumes are:
-
data: Stores the XDM task working directories and task execution outputs like logs and reports.
-
postgres: Stores the data of the integrated PostgreSQL database. This volume is not necessary if an external PostgreSQL database is used.
-
sample: The data of the sample database. This volume does not exist if the sample database is disabled.
-
grafana: Stores the data of the Grafana service. This volume is only applicable if the Grafana service is enabled.
-
elasticsearch: Stores the data of the Elasticsearch service. Elasticsearch is used to search through the XDM configuration objects. The volume is only required if the global search is activated.
-
- storage-class
-
The name of the Kubernetes storage class that is used for the persistent volume claim. The storage class setting will be set to an empty value if - is specified.
- annotations
-
Annotations that are specified in the .metadata.annotations section of the persistent volume claim deployment.
- labels
-
Labels that are specified in the .metadata.labels section of the persistent volume claim deployment.
- accessMode
-
The access mode of the volume claim. For most pods, a value of ReadWriteOnce is sufficient. However, the dataflow and core pods need to share their folders. In this case a value of ReadWriteMany is recommended. If the cloud does not support this access mode you can use ReadWriteOnce, but you need to ensure that the dataflow and core pod are running on the same node. One way to enforce this is to use the Kubernetes podAffinity setting.
- size
-
The size of the persistent volume.
- existingClaim
-
Specify an existing volume claim. If an existing volume claim is used, XDM will not create a separate volume claim.
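The options above could be combined as in the following sketch. It assumes that the data volume is the one shared between the core and dataflow pods, and that a storage class supporting ReadWriteMany exists in the cluster. The storage class name nfs-client and the sizes are hypothetical.

```yaml
persistence:
  data:
    storageClass: nfs-client     # hypothetical class that supports ReadWriteMany
    accessMode: ReadWriteMany    # shared between several pods
    size: 20Gi                   # arbitrary example size
  postgres:
    accessMode: ReadWriteOnce    # used by a single pod only
    size: 10Gi                   # arbitrary example size
```

Volumes that are not listed keep the defaults of the chart.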
Ingress
The Helm Chart includes an ingress deployment to access the XDM UI.
ingress:
  enabled: true
  domain: <domain>
  annotations: <annotations>
  tls: <tls>
  ingressClassName: <ingressClassName>
- domain
-
Specifies the domain under which the XDM web UI is accessed. The Helm Chart creates the ingress deployment for this domain. To enable TLS/SSL protection, use the tls section to specify the configuration and secret name.
- annotations
-
Annotations that are set on the ingress deployment. The following example illustrates how to use the annotation block. In this example, a basic authentication dialog is used for the XDM installation:
ingress:
  annotations:
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
    nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required - foo'
- tls
-
Specifies the tls block of the ingress deployment. This section is necessary if access to XDM should be secured with https. In this section you need to set up the TLS configuration and the TLS/SSL certificate.
ingress:
  tls:
    - hosts:
        - xdm.example.com
      secretName: xdm-tls
- ingressClassName
-
Specifies the class name that refers to the respective ingress controller. This field is available for Kubernetes 1.18 or higher. In the following example we specify that the nginx controller should be used.
ingress:
  ingressClassName: nginx
For Kubernetes versions before 1.18, use the annotations section instead:
ingress:
  annotations:
    kubernetes.io/ingress.class: nginx
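Putting the individual options together, a complete ingress block might look like the following sketch. The domain xdm.example.com and the secret name xdm-tls are placeholders, and the TLS secret is assumed to exist already in the namespace of the XDM installation.

```yaml
ingress:
  enabled: true
  domain: xdm.example.com        # placeholder domain
  ingressClassName: nginx        # requires Kubernetes 1.18 or higher
  tls:
    - hosts:
        - xdm.example.com        # should match the domain above
      secretName: xdm-tls        # existing TLS secret (assumption)
```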
Prime UI
The Prime UI can be deployed in parallel with the current web UI and is accessed under the /prime/ path.
To enable it, activate the Prime UI pod.
The following example shows the required settings in the values.yaml file:
prime-ui:
  enabled: true
userManagement:
  oauth2:
    registration:
      <name>:
        redirect-uri: <redirect-uri>/api/login/oauth2/code/<provider>
If you use OpenID authentication, only one redirect URI can be stored. Therefore, logins initiated from Prime UI will redirect to the current web UI. After login, you can switch back to /prime/ in the browser.
Ensure the redirect-uri includes the full path, including everything from /api/… onward. Add /api/login/oauth2/code/<provider> and replace <provider> with the actual provider name, so the final redirect URL is complete.
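As an illustration of a complete redirect URI, the following sketch assumes a provider registered under the name keycloak and an XDM installation reachable at https://xdm.example.com. Both values are hypothetical, and the userManagement block is assumed to sit at the top level of the values.yaml file.

```yaml
prime-ui:
  enabled: true
userManagement:
  oauth2:
    registration:
      keycloak:
        # base URL of the installation plus the full /api/... path
        redirect-uri: https://xdm.example.com/api/login/oauth2/code/keycloak
```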
AI Assistance
The AI Assistance service uses an LLM to help users write scripts in XDM, such as modification methods.
ai_assistance:
  enabled: true
- enabled
-
Controls whether a separate pod is started for the AI Assistance service.
To use the AI assistance feature of XDM, the LLM must be configured in the values.yaml file.
More information on how to configure the AI assistance service can be found here.
Kafka Source and Kafka Sink
Kafka is an open source event streaming platform. Kafka is a distributed system that consists of at least one server. A server that is part of the storage layer is called a broker. The main clients, or client APIs, are divided into producers, consumers and admins. XDM’s Kafka Source uses the consumer API to read events from specific topics, and the Kafka Sink can write events to a Kafka topic.
Kafka Source and Kafka Sink can be individually configured, but they use the same configuration options.
kafka_source:
  enabled: true
  certificateBinding:
    enabled: true
kafka_sink:
  enabled: ...
- enabled
-
Controls if the service is started.
- certificateBinding
-
Value map with the following entries:
- enabled
-
When enabled, the certificates specified in microservices.certificates in the values.yaml are mounted and imported into the Kafka Services.
When using certificates you need to specify them as secrets as shown below. This configuration works the same way as in General settings.
microservices:
  certificates:
    - <certificate-secret-name-1>
    - <certificate-secret-name-2>
    - ...
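Since both services share the same options, a configuration that enables Kafka Source and Kafka Sink with certificate binding could look like the following sketch. The secret name kafka-broker-ca is hypothetical, and the secret is assumed to have been created as described in General settings.

```yaml
kafka_source:
  enabled: true
  certificateBinding:
    enabled: true          # import the certificates listed below
kafka_sink:
  enabled: true
  certificateBinding:
    enabled: true
microservices:
  certificates:
    - kafka-broker-ca      # hypothetical secret containing the broker certificate
```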
Grafana
Grafana is a cross-platform open source application for graphical representation of data from various data sources such as PostgreSQL or Prometheus.
The Grafana service uses tables of XDM’s admin database as a data source.
grafana:
  enabled: true
  jsonData:
    [...]
- enabled
-
Controls whether a separate pod is started for Grafana. This is not necessary if no graphical display of statistical data is required.
- jsonData
-
Optional parameter to define custom jsonData for the built-in data source of the administration database. More details on how to configure a PostgreSQL data source can be found here.
If jsonData is specified, the database entry must be included within the jsonData block. If the database entry is missing, the Grafana Query Builder will not function properly. In particular, it will not display any tables or columns in the corresponding drop-down list.
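A custom jsonData block that keeps the Query Builder working could be sketched as follows. The database name xdm is a hypothetical placeholder for the name of your administration database. sslmode is one of the standard jsonData fields of Grafana's PostgreSQL data source and is shown here only as an example of an additional setting.

```yaml
grafana:
  enabled: true
  jsonData:
    database: xdm        # required entry; placeholder for the admin database name
    sslmode: disable     # example of a further PostgreSQL data source setting
```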
Elasticsearch
Elasticsearch is a search engine that is used to search through XDM configuration objects. XDM uses this service to index all configuration objects. If the service is enabled, a search field is displayed in the XDM UI. You can instantly search through XDM configuration objects and task execution logs.
elasticsearch:
  enabled: true
  password: xdm
- enabled
-
Controls whether a separate pod is started for the Elasticsearch service.
- password
-
Specifies the password for the built-in Elasticsearch user.
Prometheus
Prometheus is a data/metric collector service. It stores metric values over time that can later be visualized by the reporting service of XDM3. Prometheus collects metrics from the Java virtual machine (JVM) and from the Spring services that are running the XDM3 core and dataflow server.
prometheus:
  enabled: true
- enabled
-
Controls whether a separate pod is started for Prometheus.
Task execution platforms
XDM allows the definition of different execution platforms.
A separate Kubernetes Pod is started for each task execution.
The system settings of the Pod are based on the platform settings.
XDM defines a default platform.
The settings of the default platform can be customized as seen below.
To add a new platform, add a new sub-entry to the deployer.platform setting.
The platform name must not start with a number or contain special characters.
deployer:
  platform:
    default:
      limits:
        cpu: 500m
        memory: 1024Mi
      requests:
        cpu: 500m
        memory: 1024Mi
The following example adds a new platform named example. Each task execution that runs on that platform can use up to 4 GiB of main memory.
deployer:
  platform:
    example:
      limits:
        memory: 4096Mi
      requests:
        memory: 4096Mi
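Both patterns can be combined in one file. The following sketch keeps the default platform and adds a second platform for resource-intensive tasks. The platform name large and all resource values are hypothetical. In Kubernetes, requests are what the scheduler reserves for the Pod, while limits are the upper bound the Pod may consume.

```yaml
deployer:
  platform:
    default:
      limits:
        cpu: 500m
        memory: 1024Mi
      requests:
        cpu: 500m
        memory: 1024Mi
    large:                 # hypothetical name: no leading digit, no special characters
      limits:
        cpu: "2"           # at most two CPU cores
        memory: 8192Mi
      requests:
        cpu: "1"           # one core reserved at scheduling time
        memory: 4096Mi
```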
Storage
Storage settings specify where to store iceboxes and task directories. XDM can store these files in local storage or in cloud storage. By default, XDM stores the files on the local file system. The following cloud storage systems are also supported:
-
Amazon AWS S3 compatible storage
-
Microsoft Azure blob store
-
Google Cloud buckets
To enable a cloud storage you must specify one of the options described in the following sections.
Amazon AWS S3 storage
To store the data in Amazon AWS S3 compatible storage, the following settings are available. To enable this type of storage you must specify a value of s3 for the type attribute.
storage:
  type: "s3"
  s3:
    bucket: "xdm"
    region: "eu-central-1"
    endpoint: "s3.eu-central-1.amazonaws.com"
    credentials:
      accessKey: "AKIAIOSFODNN7EXAMPLE"
      secretKey: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
- bucket
-
An S3 bucket is an object container. Objects can’t exist without a bucket. For details about buckets refer to Amazon’s documentation: Bucket overview and naming rules for buckets.
- region
-
Region specifies the geographic location where your data will be stored. See regions and endpoints in the AWS General Reference.
- endpoint
-
Together with the previous two properties, this property specifies the S3 API endpoint for object manipulation/access. See regions and endpoints in the AWS General Reference.
- accessKey
-
An accessKey is like a username and is used for authentication of requests. See Managing access keys for IAM users.
- secretKey
-
A secretKey is like a password and is used together with accessKey to authenticate requests. See Managing access keys for IAM users.
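Because the settings target S3 compatible storage, they may also point to a self-hosted service. The following sketch assumes a hypothetical S3 compatible server at storage.example.com:9000. The endpoint format mirrors the AWS example (host without scheme), the credentials are placeholders, and the nesting of the credentials block under s3 is an assumption. Whether a particular S3 compatible product works with XDM is not confirmed here.

```yaml
storage:
  type: "s3"
  s3:
    bucket: "xdm"
    region: "us-east-1"                      # many S3 compatible servers accept a default region
    endpoint: "storage.example.com:9000"     # hypothetical self-hosted endpoint
    credentials:
      accessKey: "<access-key>"              # placeholder
      secretKey: "<secret-key>"              # placeholder
```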
Microsoft Azure blob store
To store the data in a Microsoft Azure blob store, the following settings are available. To enable this type of storage you must specify a value of azure for the type attribute.
Permissions
XDM requires several permissions to access the Azure blob service. The required privileges and resources are listed below. They must be specified in the Shared Access Signature (SAS) token creation dialog:
| Type | Value |
|---|---|
| Allowed services | Blob |
| Allowed resource types | Container, Object |
| Allowed permissions | Read, Write, Delete, Add, Create |
XDM will store the data in blob objects stored in a container. It needs access to the container and object resource types.
storage:
  type: "azure"
  azure:
    containerName: "xdm"
    endpoint: "https://xdmtest1.blob.core.windows.net/..."
    tokenSecret: "my-secret"
- containerName
-
The name of the Azure container where the blobs will be stored. XDM will create the container if it does not already exist.
- endpoint
-
The SAS URL to access the blob service.
- tokenSecret
-
As an alternative to a token a Kubernetes secret can be used. The name of the secret can be chosen freely. The secret must contain a key named token that contains the SAS token.
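A secret for the tokenSecret option could be defined as in the following sketch. The name my-secret matches the example configuration above. The token value is a placeholder for the SAS token generated in the Azure portal.

```yaml
# Secret referenced by storage.azure.tokenSecret
apiVersion: v1
kind: Secret
metadata:
  name: my-secret          # freely chosen name
type: Opaque
stringData:
  token: "<sas-token>"     # placeholder; the key must be named token
```

The secret must be created in the namespace of the XDM installation so that the pods can read it.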
Google Cloud Bucket
To store the data in a Google Cloud Platform compatible storage, the following settings are available. To enable this type of storage you must specify a value of gcp for the type attribute.
storage:
  type: "gcp"
  gcp:
    bucketName: "sample-bucket"
    projectId: ""
    keySecret: "my-secret"
- bucketName
-
The name of the bucket that stores the data. The bucket will be created if it does not exist, otherwise the existing bucket is used to store files in it.
- projectId
-
The ID of the respective Google Cloud project.
- keySecret
-
The name of a secret that contains the access key of the service account in JSON format. The key file content must be stored in the secret under the key data.json.
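The following sketch shows what such a secret might look like. The name my-secret matches the example configuration above. The value is a placeholder that stands for the complete JSON key file downloaded for the service account.

```yaml
# Secret referenced by storage.gcp.keySecret
apiVersion: v1
kind: Secret
metadata:
  name: my-secret
type: Opaque
stringData:
  # the key must be named data.json
  data.json: |
    <contents of the service account key file in JSON format>
```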
LLM configuration
XDM can use a Large Language Model (LLM) for different purposes. The LLM configuration is used to specify the LLM provider that should be used.
llm:
  spring.ai.openai.base-url: <base-url>
  spring.ai.openai.api-key: <api-key-secret>
  spring.ai.openai.organization-id: <organization-id>
  spring.ai.openai.project-id: <project-id>
  spring.ai.openai.chat.options.model: <model>
- base-url
-
Optional override for the spring.ai.openai.base-url to provide a chat-specific base URL.
- api-key-secret
-
Specifies the name of the secret that is used to authenticate against the LLM provider. The secret must provide a key named data that contains the API key.
- organization-id
-
Optionally, you can specify which organization to use for an API request.
- project-id
-
Optionally, you can specify which project to use for an API request.
- model
-
Optionally, you can specify which model to use for an API request. By default, gpt-4o-mini is used.
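As a sketch of the api-key-secret option, the following manifest defines a secret with the required data key. The secret name xdm-llm-api-key and the key value are hypothetical placeholders.

```yaml
# Secret referenced by the spring.ai.openai.api-key setting
apiVersion: v1
kind: Secret
metadata:
  name: xdm-llm-api-key    # freely chosen name (assumption)
type: Opaque
stringData:
  data: "<api-key>"        # placeholder; the key must be named data
```

In the llm block, spring.ai.openai.api-key would then be set to xdm-llm-api-key, so that the API key itself never appears in the values.yaml file.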