A few months ago I started looking at Delphix for using it for data masking between our production and Development/QA environments.
The more I learned about Delphix, the more I realized the kind of impact it can have for Development/QA in addition to data masking.
These are some of the common tasks we do in Development/QA that I felt Delphix might be able to help with:
- Faster Data Refreshes – Maintain versioned compressed copies of prod database that can be refreshed to Development/QA in minutes. For some applications we get production refreshes weekly, sometimes even daily. This usually involves sending a request to DBAs, highlighting priority and waiting for them to refresh production data to the Development/QA/UAT environment. This can take anywhere between half a day to three days.
- Multiple Business Date Data – For some applications multiple processes may need different business date to test completely and depending on the design of the application this is not easily doable without a lot of automated/manual setup by Development/QA folks to set up the environment. With Delphix’s ability to backup/restore copies in minutes in a self serve manner – this testing of different dates becomes much simpler and less error prone.
- Enhanced Debugging – Make a copy of the current state of your database in Development/QA on demand – In some applications when an error happens having a copy of the state of the database is helpful for developers for debugging. For such applications we usually halt testing and wait till the issue is debugged.
- Efficient Test Data Management – Many of our application testing is pre and post comparison. When you don’t have access to multiple environments – this means doing table backups. Some times QA and UAT are done in a shared environment. Again this requires doing table backups – error prone.
With the help of our infrastructure teams we setup Delphix in QA for one of our applications. We setup Delphix to solve a very specific problem and had some interesting revelations. Sharing our experiences below.
What is Delphix ?
Delphix is a product that runs as a virtual machine, connects to a data source (Production/Development/QA),compresses it, and maintains an updated, versioned copy of the source.
A simple video on what Delphix is: https://www.delphix.com/products/data-virtualization-engine/how-data-virtualization-works
As you can see in Diagram 1 – Every Delphix setup needs a source (sometimes called Dsource ) – This is the database that will be copied and the copies of the deltas maintained as versions in a repository. Your source can be a production or a development database.
The Delphix copy of your source takes a similar amount of time as a normal full database backup. This is an important point to consider for applications that runs almost 24/7 in production. Delphix copy is supposed to be non-intrusive ( minimum impact to the performance of your application when a Delphix copy is being made as the backup reads data from disk directly not from the memory. There will be little I/O contention for loaded system). For larger database, the database backup will take a long time. It is strongly recommended that you create a copy of the database over the weekend or on demand.
If Delphix does not support the backup mechanism that your company is using, Delphix has to run native copy only backup. So each delta copy of the database will be a full copy.
For Example: For us, the backup of 1.4 terabytes database took 7-9 hours. An important factor to consider when setting up Delphix in prod. Because of this it is important to understand how often you need a copy made – daily, weekly monthly etc. or on demand.
Target Server DB
An application can not directly use a Delphix copy – the compressed copy needs to be restored to a virtual or physical database on a server. You can use a VDB for your target DB.
Please test your application with a Delphix copy/snapshot process to ensure it works without issues – that it will not cause data corruption. We did a pre and a post comparison. Our pre was a full regression run using our original DB and our post is a regression run using a Delphix copy (This is the copy that was restored to our target VDB).
Our regression results were identical between the two sets of runs. Which gave me confidence in the product.
We did notice a difference in performance. Our application batch run with the original DB used to take 15 minutes. Using the Delphix copy restored target database, the application batch now took 36 minutes. Keep in mind there are differences between the original DB hardware specifications and the virtual target server specifications. Differences in the number of CPU, memory etc. For this specific application the twenty one minute increase in regression was acceptable. We do have other areas where such an increase in application run time will not be acceptable.
For performance critical applications I recommend working with the your infrastructure teams to obtain a target server with a similar configuration as your source server to minimize performance impact.
An interesting read I came across when reading about Delphix performance:
Jet Stream is a framework provided by Delphix for self-service data management. Once a source and a target is setup in Jetstream and the correct permissions setup, using a web interface we were able to make backups and do restores with few clicks. The web interface for Jetstream is very user friendly. Our data refreshes took less than 15 minutes – the original size for this database is around 1.4 terabyte. For someone whose life revolves around various data refreshes for different applications this was amazing to see.
Next steps for us
- We are looking to set up Delphix for an application that uses Sybase databases.