Data Virtualization for Databases Use Cases, part 1

followup at: part 2

It’s time that businesses took a good, hard look at the way they manage their cloned database environments. For years the words “we need a fresh copy of production in QA” were the universal sign of a horrible day. The amount of storage, server resources, and time required to provision a new database environment was often unacceptable, or at the least inconvenient.

Thanks to virtual machines, server resources have become far easier to provision. Within minutes a fully available virtual server can be ready to go, and disk resources can often be provisioned along with it. But a clone of the database still has to be made. The clone operation requires a significant amount of storage for the new database, for the intermediary backup taken to create the clone (in some cases), and for the archive logs needed to bring the cloned database up to the proper point in time. Even worse, the clone takes time.

Developers, project managers, even business analysts discuss the requirements for new deployments. Meetings take place with DBA teams and other operational resources; decisions are made about how to proceed with QA and DEV resources. And no matter what brand of fancy snapshotting or cloning technology your storage solution might offer, no matter what kind of virtual machine environment you have, time has to be taken and resources expended to make the clone happen.

This situation comes up at many organizations, sometimes as often as once per week. Disk space and time are wasted, making it more difficult to resolve critical issues or bring new development to market.


Enter Database Virtualization


Database virtualization completely changes the way database clones are created and revolutionizes the way a business perceives new environment provisioning. It does this not by virtualizing servers in the way popular options like VMware ESX and Oracle VM do, but by virtualizing the very data that you are accessing.

An Oracle environment is made up of two parts: the Instance and the Database. The instance is formed of the running processes that make up Oracle’s software stack and segments of data in RAM. The database is nothing more than files on disk. Datafiles, redo logs, and control files are just blocks of data written to disk which serve no real purpose without an Oracle instance to access them.

In a traditional environment, cloning a database requires a new Oracle instance and a complete copy of the source database. This must occur for every clone your application needs. In a typical application stack, you will have databases for production, development, and quality assurance (QA). In some environments you might add user acceptance testing (UAT), regression testing, and reporting as well. Each of these environments will need its own instance and copy of the database.

Cloning with a Virtualization Appliance

The virtualized database, on the other hand, requires only one actual copy of the source database. That copy needs to be taken only once, to the virtualization server, via RMAN APIs.

After the database has been cloned a single time into the virtualization appliance, it will be kept up to date with either redo/archive logs or level 1 incremental backups. Redo can be pulled from the source database near real-time to keep the virtualization appliance as close as possible to the source. If a more staggered clone is required, archive logs or level 1 backups can provide incremental updates on whatever schedule you need. We will get back to the change data later.

With database virtualization, that single copy of the source database can be used to provide data for multiple cloned environments.

The disk savings in this configuration alone are tremendous because a new copy of the production database is not required for every single system you create. In the case of a 10TB database with Development, QA, and UAT clones, you would save roughly 50% of the total disk space: 40TB originally (production 10TB + QA 10TB + Dev 10TB + UAT 10TB) down to just 20TB (production 10TB + one clone 10TB). Beyond this, database virtualization shrinks the requirement even further by compressing the data with ZFS or DxFS compression.

With compression, the 10TB database with three clones would need somewhere around 13.3TB overall (production 10TB + one clone 3.3TB), roughly a third of the original 40TB. Each instance you provision can attach to the compressed virtualized data via NFS and perform normal read/write operations, just as it would against a full database copy, without the extra overhead.
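The arithmetic behind these figures can be checked with a short sketch. The function names are made up for illustration, and the 3:1 compression ratio is an assumption inferred from the 3.3TB figure above, not a published number. The sketch also ignores the space taken by blocks that clones change and by retained change data.

```python
def traditional_storage_tb(source_tb: float, clones: int) -> float:
    """Production plus one full physical copy per clone."""
    return source_tb * (1 + clones)


def virtualized_storage_tb(source_tb: float, compression_ratio: float = 1.0) -> float:
    """Production plus a single shared copy held on the appliance.

    compression_ratio is original size divided by compressed size,
    so 3.0 means the shared copy shrinks to one third (assumed value).
    """
    return source_tb + source_tb / compression_ratio


source_tb = 10.0
clones = 3  # Development, QA, UAT

full = traditional_storage_tb(source_tb, clones)
shared = virtualized_storage_tb(source_tb)
compressed = virtualized_storage_tb(source_tb, compression_ratio=3.0)

for label, total in [
    ("traditional", full),
    ("shared copy", shared),
    ("shared + compressed", compressed),
]:
    saved = 1 - total / full
    print(f"{label:>20}: {total:5.1f} TB ({saved:.0%} saved)")
```

Adding a fourth or fifth clone changes only the traditional total in this model, which is why the savings grow with every environment you provision.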

Even more impressive benefits

While the disk savings alone are tremendous, the real benefits come in the form of time. Since the source data is already copied and all that is required is an instance to mount the virtual database files, you can create a full-sized read/write clone in about 5 minutes.

Your target can be any Unix or Linux system, physical or virtual, that is linked to the virtualization appliance and has Oracle installed. With a few mouse clicks on the administration UI a clone can be completely provisioned from start to finish for near-instant access.

As noted in the previous section, the virtualization appliance also takes change data in the form of redo/archive logs or level 1 backups from the production environment. Unlike a standby database, where this data is consumed to keep a single clone up to date, the change data in the data virtualization environment is kept for a user-specified retention period so clones can be provisioned from any point in time. Each clone can come from a different time, so you can have multiple clones of the same source database from different time windows with no additional overhead.
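To make the point-in-time idea concrete, here is a minimal sketch of how a provisioning request could be resolved against retained change data. It is an illustration only: `plan_provision`, the snapshot list, and the daily log ranges are all hypothetical and do not describe how any particular appliance works internally.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def plan_provision(snapshots, logs, target, now, retention):
    """Pick a base image and the change data needed to reach `target`.

    snapshots: ascending list of datetimes at which a consistent image exists
    logs:      list of (start, end) datetime ranges of retained change data
    """
    if target > now or target < now - retention:
        raise ValueError("target is outside the retention window")
    idx = bisect_right(snapshots, target)
    if idx == 0:
        raise ValueError("no snapshot at or before target")
    base = snapshots[idx - 1]
    # Only change data overlapping (base, target) has to be replayed.
    needed = [(s, e) for (s, e) in logs if e > base and s < target]
    return base, needed


now = datetime(2024, 1, 8, 12, 0)  # arbitrary example date
retention = timedelta(days=7)
snapshots = [now - timedelta(days=d) for d in (6, 4, 2)]
logs = [(now - timedelta(days=d), now - timedelta(days=d - 1)) for d in range(7, 0, -1)]

# Two clones of the same source, provisioned from different points in time.
qa_base, qa_logs = plan_provision(snapshots, logs, now - timedelta(days=3, hours=6), now, retention)
dev_base, dev_logs = plan_provision(snapshots, logs, now - timedelta(days=1), now, retention)
print("QA :", qa_base, len(qa_logs), "log range(s)")
print("Dev:", dev_base, len(dev_logs), "log range(s)")
```

Both requests read from the same retained snapshots and logs, which is the sense in which a second clone from a different time adds no extra copy of the data.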

Lastly, there is a strong performance benefit when using a virtualization appliance thanks to shared block caching between cloned environments. This benefit is described in detail in a joint performance study by Delphix and IBM titled “A High Performance Architecture for Virtual Databases.”
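A toy model shows why a shared cache helps. The sketch below assumes that clones read unchanged blocks through one common cache and keep their own changed blocks privately. The `SharedBlockCache` and `Clone` classes are hypothetical and are not taken from the study or from any product.

```python
from collections import OrderedDict


class SharedBlockCache:
    """LRU cache keyed by block id, shared by every clone of one source."""

    def __init__(self, capacity, read_from_storage):
        self.capacity = capacity
        self.read_from_storage = read_from_storage
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # mark as most recently used
            self.hits += 1
            return self.blocks[block_id]
        self.misses += 1
        data = self.read_from_storage(block_id)
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used
        return data


class Clone:
    """A clone that keeps only its own changed blocks (assumed behavior)."""

    def __init__(self, name, cache):
        self.name = name
        self.cache = cache
        self.changed = {}

    def read(self, block_id):
        if block_id in self.changed:
            return self.changed[block_id]
        return self.cache.read(block_id)

    def write(self, block_id, data):
        self.changed[block_id] = data


storage = {i: f"block-{i}" for i in range(100)}
cache = SharedBlockCache(capacity=50, read_from_storage=storage.__getitem__)
dev, qa = Clone("dev", cache), Clone("qa", cache)

for i in range(10):
    dev.read(i)  # cold reads go to storage
for i in range(10):
    qa.read(i)   # same blocks, served from the shared cache

dev.write(3, "dev-change")  # visible to dev only
print(cache.misses, cache.hits)
```

In this model the second clone's reads never touch storage, because the first clone already warmed the cache with the same blocks.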

Continued at: Data Virtualization for Databases Use Cases, part 2