Using Kubernetes Secrets in Data Collector Pipelines

Posted in Data Integration, May 19, 2020

In addition to StreamSets Data Collector’s support for a variety of Credential Stores, you can use Kubernetes Secrets to securely manage credentials and environment-specific properties in Data Collector pipelines.

This post shows an example of using a Kubernetes Secret to manage MySQL connection properties for a JDBC Producer.

Create a Pipeline that Connects to MySQL

Consider a simple pipeline that writes generated data to MySQL using a JDBC Producer.

The JDBC Producer needs values for its MySQL URL, username and password properties. Start by storing those properties in a Kubernetes Secret.

Store MySQL Connection Properties in a Kubernetes Secret

Create a Kubernetes Secret named “mysql-secret” with the connection properties needed for your environment, like this:

$ kubectl create secret generic mysql-secret \
 --from-literal=url='jdbc:mysql://10.10.185.18:3306/db1' \
 --from-literal=username='dev-user' \
 --from-literal=password='ksKDSNs9kjs!a'

The Secret must be created in the same namespace as the StreamSets Data Collector (SDC) instances that will mount it.
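Because the namespace has to match, it can help to name it explicitly rather than rely on the current kubectl context. The following is a minimal sketch, assuming SDC runs in a namespace called ns1 (the one used in the Deployment spec in this post). The helper name check_mysql_secret is made up for illustration.

```shell
#!/bin/sh
# Assumption: SDC is deployed in the "ns1" namespace; change this to match yours.
NAMESPACE="ns1"

# Hypothetical helper: confirm the Secret exists in the SDC namespace, then
# list its keys. "describe" shows key names and sizes, not the values.
check_mysql_secret() {
  kubectl get secret mysql-secret --namespace "$NAMESPACE" &&
    kubectl describe secret mysql-secret --namespace "$NAMESPACE"
}

# Usage, against a cluster your kubectl context can reach:
#   check_mysql_secret
```

If the first command reports that the Secret is not found, it was most likely created in a different namespace.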

Create an SDC Deployment that mounts the Secret

StreamSets Control Hub can deploy StreamSets Data Collector (SDC) instances on Kubernetes using a Provisioning Agent. Here is an example Deployment spec for a Control Hub-managed SDC Deployment that includes a Volume and Volume Mount for the Secret:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sdc-jdbc
  namespace: ns1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sdc-jdbc
  template:
    metadata:
      labels:
        app: sdc-jdbc
    spec:
      containers:
      - name: datacollector
        image: <your-repo>/<your-sdc-image>:<version>
        imagePullPolicy: Always
        ports:
        - containerPort: 18630
        env:
        - name: HOST
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: PORT0
          value: "18630"
        volumeMounts:
        - name: mysql-secret
          mountPath: /resources/mysql
      volumes:
        - name: mysql-secret
          secret:
            secretName: mysql-secret
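
A Secret mounted as a volume appears inside the container as one file per key, with the decoded value as the file content. The sketch below only simulates that layout locally, using a temporary directory in place of /resources/mysql and the example values from this post. It does not talk to Kubernetes, and the variable names are illustrative.

```shell
#!/bin/sh
# Local simulation only: a temp directory stands in for /resources/mysql.
MOUNT_DIR="$(mktemp -d)/mysql"
mkdir -p "$MOUNT_DIR"

# One file per Secret key; printf '%s' writes the value without a trailing newline.
printf '%s' 'jdbc:mysql://10.10.185.18:3306/db1' > "$MOUNT_DIR/url"
printf '%s' 'dev-user' > "$MOUNT_DIR/username"
printf '%s' 'ksKDSNs9kjs!a' > "$MOUNT_DIR/password"

# The directory now holds three entries: password, url, username.
ls "$MOUNT_DIR"

# Reading a key is just reading a file.
URL_VALUE="$(cat "$MOUNT_DIR/url")"
echo "$URL_VALUE"
```

This is why the pipeline can treat each connection property as a resource file rather than a value baked into its configuration.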

Start the Deployment

Start the Deployment and wait for the new SDC(s) to register with Control Hub.
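
While waiting, you may want to confirm from the Kubernetes side that the pods are ready and the Secret is mounted. This is a rough sketch, assuming the Deployment keeps the name sdc-jdbc and the namespace ns1 from the spec above; if your Provisioning Agent creates it under a different name, adjust accordingly. The helper name verify_secret_mount is hypothetical.

```shell
#!/bin/sh
# Assumptions: names taken from the example Deployment spec in this post.
NAMESPACE="ns1"
DEPLOYMENT="sdc-jdbc"

# Hypothetical helper: block until the rollout completes, then list the
# files that the Secret volume exposes inside one of the pods.
verify_secret_mount() {
  kubectl rollout status "deployment/$DEPLOYMENT" --namespace "$NAMESPACE" &&
    kubectl exec "deployment/$DEPLOYMENT" --namespace "$NAMESPACE" -- ls /resources/mysql
}

# Usage, against a cluster your kubectl context can reach:
#   verify_secret_mount
```

The listing should show one file per Secret key: password, url and username.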

Use the Secret in a Pipeline

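The Secret’s keys, “url”, “username” and “password”, are mounted as files under /resources/mysql within each SDC container. Property values can be read from those files with the runtime:loadResource() function. Assuming /resources is the SDC resources directory, as the mount path implies, the MySQL URL would be retrieved with an expression of the form ${runtime:loadResource('mysql/url', false)}, where the second argument indicates whether the file is required to have restricted (owner-only) permissions.

Similarly, the MySQL credentials can be retrieved by pointing the same function at mysql/username and mysql/password.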

 

Create and Run a Job

Once the pipeline is configured, create and run a Job on the deployed SDC and confirm that it successfully writes data to MySQL.

 

Use Environment-specific Secrets

The same pipeline can be deployed in other namespaces or clusters, each with its own values stored in a Secret named mysql-secret. Because the pipeline reads its connection properties from the mounted Secret, no pipeline configuration changes are needed: it connects to the MySQL instance for whichever environment it is running in.
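
One way to set this up is a small helper that creates the same-named Secret in whichever namespace an environment uses. This is only a sketch: the function name create_mysql_secret and the placeholder values in the usage comment are hypothetical, and real passwords are better supplied from a secure source than typed on the command line.

```shell
#!/bin/sh
# Hypothetical helper: create mysql-secret in a given namespace.
# Arguments: namespace, JDBC URL, username, password.
create_mysql_secret() {
  kubectl create secret generic mysql-secret \
    --namespace "$1" \
    --from-literal=url="$2" \
    --from-literal=username="$3" \
    --from-literal=password="$4"
}

# Usage (placeholders, not real environments):
#   create_mysql_secret qa 'jdbc:mysql://<qa-host>:3306/db1' '<qa-user>' '<qa-password>'
```

Keeping the Secret name and keys identical across environments is what lets the pipeline stay unchanged.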

 

Try it out!

StreamSets Control Hub makes it easy to deploy SDC in Kubernetes. As this post shows, you can also easily take advantage of Kubernetes Secrets within your pipelines!

Learn how to create and manage SDC deployments on Azure and Google Cloud Platform.
