In addition to StreamSets Data Collector’s support for a variety of Credential Stores, you can use Kubernetes Secrets to securely manage credentials and environment-specific properties within Data Collector pipelines.
This post shows an example of using a Kubernetes Secret to manage MySQL connection properties for a JDBC Producer.
Create a Pipeline that Connects to MySQL
Consider a simple pipeline that writes generated data to MySQL using a JDBC Producer.
The JDBC Producer needs values for its MySQL URL, username and password properties. Start by storing those properties in a Kubernetes Secret.
Store MySQL Connection Properties in a Kubernetes Secret
Create a Kubernetes Secret named “mysql-secret” with the connection properties needed for your environment, like this:
$ kubectl create secret generic mysql-secret \
The Secret’s namespace must match the namespace SDCs are deployed in.
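The command above is cut off in this copy, so its original arguments are not reproduced here. As one possible way to end up with an equivalent Secret, the sketch below writes a manifest containing the three keys the pipeline reads later (url, username, password) and leaves applying it as a final step. The URL, user name, password and the file name mysql-secret.yaml are placeholders invented for illustration; ns1 is the namespace used in the Deployment spec in the next section.

```shell
# Placeholder values; substitute the settings for your own MySQL instance.
MYSQL_URL='jdbc:mysql://mysql.example.com:3306/mydb'
MYSQL_USER='sdc_user'
MYSQL_PASSWORD='change-me'

# stringData takes plain strings; Kubernetes base64-encodes them on creation.
# Values containing double quotes or backslashes would need YAML escaping.
cat > mysql-secret.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret
  namespace: ns1
type: Opaque
stringData:
  url: "${MYSQL_URL}"
  username: "${MYSQL_USER}"
  password: "${MYSQL_PASSWORD}"
EOF

# Then, against your cluster:
#   kubectl apply -f mysql-secret.yaml
```

If you would rather not write the values to a file, kubectl create secret generic also accepts repeated --from-literal=key=value flags.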
Create an SDC Deployment that Mounts the Secret
StreamSets Control Hub can deploy StreamSets Data Collector (SDC) instances on Kubernetes using a Provisioning Agent. Here is an example Deployment spec for a Control Hub-managed SDC Deployment that includes a Volume and Volume Mount for the Secret:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sdc-jdbc
  namespace: ns1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sdc-jdbc
  template:
    metadata:
      labels:
        app: sdc-jdbc
    spec:
      containers:
      - name: datacollector
        image: <your-repo>/<your-sdc-image>:<version>
        imagePullPolicy: Always
        ports:
        - containerPort: 18630
        env:
        - name: HOST
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: PORT0
          value: "18630"
        volumeMounts:
        - name: mysql-secret
          mountPath: /resources/mysql
      volumes:
      - name: mysql-secret
        secret:
          secretName: mysql-secret
Start the Deployment
Start the Deployment and wait for the new SDC(s) to register with Control Hub.
Use the Secret in a Pipeline
The Secret, with keys “url”, “username” and “password”, is mounted at the path /resources/mysql within each SDC container. Property values, such as the MySQL URL, can be retrieved from the Secret using the runtime:loadResource() function.
The MySQL credentials can be retrieved in the same way.
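Kubernetes exposes each key of a mounted Secret as a file named after that key, so the values end up in /resources/mysql/url, /resources/mysql/username and /resources/mysql/password. Assuming /resources is the Data Collector resources directory in your image, the JDBC Producer properties would hold expressions along the lines of ${runtime:loadResource('mysql/url', false)}, with mysql/username and mysql/password for the credentials. The second argument indicates whether the file is expected to have restricted (owner-only) permissions, so check which value suits your mount. The sketch below is not SDC code: it recreates the mounted layout in a temporary directory with made-up values and reads the files back, to show what such a lookup resolves to. The load_resource helper is hypothetical.

```shell
# Stand-in for the container's /resources directory (temporary, for illustration).
RESOURCES_DIR="$(mktemp -d)"
mkdir -p "$RESOURCES_DIR/mysql"

# A Secret volume exposes one file per key; these values are placeholders.
printf '%s' 'jdbc:mysql://mysql.example.com:3306/mydb' > "$RESOURCES_DIR/mysql/url"
printf '%s' 'sdc_user' > "$RESOURCES_DIR/mysql/username"
printf '%s' 'change-me' > "$RESOURCES_DIR/mysql/password"

# Hypothetical helper: read a file given its path relative to the resources
# directory, roughly what a lookup of 'mysql/url' has to do.
load_resource() {
  cat "$RESOURCES_DIR/$1"
}

load_resource mysql/url; echo
load_resource mysql/username; echo
```

To check a running SDC pod instead, something like kubectl exec <pod-name> -n ns1 -- ls /resources/mysql should list the three key names, where <pod-name> is whatever name your Deployment generated.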
Create and Run a Job
Once the pipeline is configured, create and run a Job on the deployed SDC and confirm that it successfully writes data to MySQL.
Use Environment-specific Secrets
The same pipeline can be deployed in other namespaces or clusters, each with different values stored in its own mysql-secret. With that in place, no changes to the pipeline configuration are needed: the pipeline will connect and write to the MySQL instance specific to the environment it is running in.
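One way to picture this: every environment gets a Secret with the same name and the same keys, and only the values differ. The sketch below generates one manifest per namespace. The namespaces dev and prod, the host names and the credentials are all invented for illustration, and write_secret is a hypothetical helper, not part of kubectl or Control Hub.

```shell
# Hypothetical helper: write a mysql-secret manifest for one namespace.
# Arguments: namespace, JDBC URL, username, password.
write_secret() {
  cat > "mysql-secret-$1.yaml" <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret
  namespace: $1
type: Opaque
stringData:
  url: "$2"
  username: "$3"
  password: "$4"
EOF
}

# Same Secret name and keys everywhere; only the values change (all placeholders).
write_secret dev  'jdbc:mysql://mysql.dev.example.com:3306/mydb'  'sdc_dev'  'dev-password'
write_secret prod 'jdbc:mysql://mysql.prod.example.com:3306/mydb' 'sdc_prod' 'prod-password'

# Apply each file to the cluster that hosts the namespace, for example:
#   kubectl apply -f mysql-secret-dev.yaml
```

Because the Secret name and mount path stay the same, the pipeline's lookups of mysql/url, mysql/username and mysql/password need no per-environment edits.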
Try it out!
StreamSets Control Hub makes it easy to deploy SDC in Kubernetes. As this post shows, you can also easily take advantage of Kubernetes Secrets within your pipelines!