This solution describes how to convert Avro files to Parquet, a columnar storage format.
This solution describes how to configure a Drift Synchronization Solution for Hive pipeline to automatically refresh the Impala metadata cache each time changes occur in the Hive metastore.
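Conceptually, refreshing the Impala metadata cache means issuing `INVALIDATE METADATA` (for a new or redefined table) or `REFRESH` (for new data files or partitions in a known table) after each metastore change. The sketch below illustrates that decision with a stand-in cursor so it can run without a cluster; the function name, table name, and cursor class are illustrative assumptions, not part of the solution itself.

```python
def refresh_impala_metadata(cursor, table, table_is_new=False):
    """Issue the appropriate Impala statement for a Hive metastore change."""
    if table_is_new:
        # INVALIDATE METADATA reloads the table definition from the metastore.
        cursor.execute(f"INVALIDATE METADATA {table}")
    else:
        # REFRESH picks up new data files or partitions for a known table.
        cursor.execute(f"REFRESH {table}")

# Stand-in cursor that records statements instead of contacting Impala.
class RecordingCursor:
    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)

cursor = RecordingCursor()
refresh_impala_metadata(cursor, "sales.orders", table_is_new=True)
refresh_impala_metadata(cursor, "sales.orders")
print(cursor.statements)
# → ['INVALIDATE METADATA sales.orders', 'REFRESH sales.orders']
```

In a real deployment the cursor would come from an Impala connection, and the pipeline would trigger the statement in response to metastore-change events rather than calling it directly.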
This solution describes how to design a pipeline that writes output files to a destination, moves them to a different location, and then changes their permissions.
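The post-processing steps above can be sketched in miniature: move a finished output file to an archive directory, then tighten its permissions. The directory layout and the `0o640` mode are illustrative assumptions; temporary directories keep the sketch self-contained.

```python
import os
import shutil
import stat
import tempfile

def archive_file(src_path, archive_dir, mode=0o640):
    """Move a finished output file, then change its permissions."""
    os.makedirs(archive_dir, exist_ok=True)
    dest_path = shutil.move(src_path, archive_dir)  # move to the new location
    os.chmod(dest_path, mode)                       # then change permissions
    return dest_path

# Demonstrate with temporary directories so the sketch runs anywhere.
work = tempfile.mkdtemp()
out_file = os.path.join(work, "sdc-output.txt")
with open(out_file, "w") as f:
    f.write("records\n")

archived = archive_file(out_file, os.path.join(work, "archive"))
print(oct(stat.S_IMODE(os.stat(archived).st_mode)))  # → 0o640 on POSIX
```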
This solution describes how to design a pipeline that stops automatically after it finishes processing all available data.
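The stop-when-done behavior boils down to a batch loop that ends when the origin reports no more data. This sketch simulates that pattern; all names are illustrative, and in the actual solution the "no-more-data" condition is an event that triggers a pipeline-stopping executor rather than an inline loop.

```python
def run_until_drained(read_batch, process):
    """Run batches until read_batch() returns an empty batch, then stop."""
    batches = 0
    while True:
        batch = read_batch()
        if not batch:          # "no-more-data" condition ends processing
            break
        process(batch)
        batches += 1
    return batches

# Simulated origin that yields two batches, then nothing.
data = [[1, 2, 3], [4, 5]]

def read_batch():
    return data.pop(0) if data else []

seen = []
print(run_until_drained(read_batch, seen.extend))  # → 2
print(seen)                                        # → [1, 2, 3, 4, 5]
```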
This solution describes how to offload data from relational database tables to Hadoop.
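At its core, offloading means reading rows from a relational table and landing them as files in Hadoop-friendly formats. The sketch below uses an in-memory SQLite table and newline-delimited JSON output as stand-ins; the table name, schema, and file layout are assumptions for illustration, not the solution's actual configuration.

```python
import json
import os
import sqlite3
import tempfile

# Stand-in relational source: an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Ada"), (2, "Grace")])

# Write each row as one JSON line, the kind of file a pipeline lands in HDFS.
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "customers-part-0000.json")
with open(out_path, "w") as f:
    for row in conn.execute("SELECT id, name FROM customers"):
        f.write(json.dumps({"id": row[0], "name": row[1]}) + "\n")

print(open(out_path).read())
```

A production pipeline would add incremental offsets so repeated runs pick up only new rows, which is the main thing this toy version leaves out.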
This solution describes how to design a pipeline to send email notifications at different moments during pipeline processing.
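The notification idea can be sketched as composing one message per processing moment. Addresses, subjects, and the event names here are illustrative assumptions; actual delivery would hand each message to an SMTP client, which is omitted so the sketch stays self-contained.

```python
from email.message import EmailMessage

def build_notification(event, pipeline):
    """Compose a notification email for one pipeline event."""
    msg = EmailMessage()
    msg["From"] = "sdc-alerts@example.com"   # illustrative addresses
    msg["To"] = "ops@example.com"
    msg["Subject"] = f"Pipeline {pipeline}: {event}"
    msg.set_content(f"The pipeline '{pipeline}' reported event: {event}.")
    return msg

# One email per processing moment.
messages = [build_notification(e, "orders-ingest")
            for e in ("started", "error", "finished")]
for m in messages:
    print(m["Subject"])
```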
This solution describes how to design a pipeline that preserves an audit trail of pipeline and stage events.
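An audit trail of this kind amounts to serializing each pipeline or stage event as an append-only log record. The field names, event names, and JSON-lines layout below are illustrative assumptions; an in-memory buffer stands in for the file or destination the trail would actually be written to.

```python
import io
import json
import time

def record_event(log, source, event, **details):
    """Append one audit entry as a JSON line."""
    entry = {"ts": time.time(), "source": source, "event": event, **details}
    log.write(json.dumps(entry) + "\n")

# In-memory stand-in for the audit destination.
audit_log = io.StringIO()
record_event(audit_log, "pipeline", "start", pipeline="orders-ingest")
record_event(audit_log, "stage", "file-closed", stage="HDFS_01")
record_event(audit_log, "pipeline", "stop", pipeline="orders-ingest")

entries = [json.loads(line) for line in audit_log.getvalue().splitlines()]
print([e["event"] for e in entries])  # → ['start', 'file-closed', 'stop']
```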
You can use several solutions to load data into a Delta Lake table on Databricks.
The Drift Synchronization Solution for Hive detects drift in incoming data and updates corresponding Hive tables.
The Drift Synchronization Solution for PostgreSQL detects drift in incoming data and automatically creates or alters corresponding PostgreSQL tables as needed before the data is written.
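Both drift solutions share the same core idea: compare each incoming record's fields against the known table schema and emit the DDL needed before the record can be written. This is a minimal sketch of that comparison; the type mapping, names, and return shape are illustrative assumptions, not either solution's actual implementation.

```python
# Illustrative mapping from Python types to SQL column types.
TYPE_MAP = {int: "BIGINT", float: "DOUBLE PRECISION", str: "VARCHAR"}

def drift_ddl(table, known_columns, record):
    """Return the DDL statements needed before `record` can be written,
    plus the updated column list."""
    if known_columns is None:  # table does not exist yet: create it
        cols = ", ".join(f"{k} {TYPE_MAP[type(v)]}" for k, v in record.items())
        return [f"CREATE TABLE {table} ({cols})"], list(record)
    # Table exists: add a column for each field not seen before.
    new_cols = [k for k in record if k not in known_columns]
    stmts = [f"ALTER TABLE {table} ADD COLUMN {c} {TYPE_MAP[type(record[c])]}"
             for c in new_cols]
    return stmts, known_columns + new_cols

stmts, cols = drift_ddl("orders", None, {"id": 1, "amount": 9.5})
print(stmts)  # → ['CREATE TABLE orders (id BIGINT, amount DOUBLE PRECISION)']
stmts, cols = drift_ddl("orders", cols, {"id": 2, "amount": 3.0, "note": "rush"})
print(stmts)  # → ['ALTER TABLE orders ADD COLUMN note VARCHAR']
```

The real solutions also handle type widening, partitions, and case rules; this sketch covers only the create-or-alter decision that defines drift handling.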