I recently had the immense pleasure of visiting Cisco’s labs at Bedfont Lakes for a day of intensive information exchange about their UCS offering. To summarise the day: I was impressed, even more so by the fact that there is more to come. I expect a few more blog posts about UCS will be published here once I have had some time to benchmark it.
I knew about UCS from a presentation at the UKOUG user group, but it didn’t occur to me at the time how much potential is behind the technology. This potential is something Cisco sadly fail to make clear on their website, which is very good once you understand the UCS concept, as it gives you many details about the individual components.
I should stress that I am not paid or otherwise financially motivated to write this article; it’s pure interest in technology that made me write this blog post. A piece of good technology should be mentioned, and this is what I would like to do.
What is the UCS anyway?
When I mentioned to friends that I was going to see Cisco to have a look at their blade server offering, I got strange looks. Indeed, Cisco hasn’t been known as a manufacturer of blades before; it’s only recently (in industry terms) that they entered the market. However, instead of providing YABE (yet another blade enclosure), they engineered it quite nicely.
If you like, the UCS is an appliance-like environment you can use for all sorts of workloads. It can be fitted in a standard 42U rack and currently consists of these components (brackets contain product designations for further reading):
The Fabric Interconnects can take extension modules with Fibre Channel ports to link to an FC switch. No new technology is introduced, so existing arrays can be used. Existing Fibre Channel solutions can also be used for backups.
Another of the interesting features is the management software, called UCS Manager. It’s integrated into the Fabric Interconnect using a few gigabytes of flash storage. Not only is it used to manage a huge number of blades, it can also stage firmware for each component. At a suitable time, the firmware can be upgraded in a rolling fashion, except for the Fabric Interconnect (obviously). The Fabric Interconnects can, however, take advantage of the clustering functionality to ensure that complete firmware upgrades can be undertaken without a system-wide outage.
Fibre Channel over Ethernet
What I was very keen to learn about was the adoption of FCoE in UCS. Ever since its release, the UCS has used FCoE for storage inside the system. I can imagine that this must have been difficult to sell, since FCoE was a very young standard at the time, and probably still is.
For those of you who don’t know FCoE, it’s broadly speaking FC payloads in Ethernet frames. Since Ethernet was never designed to work like Fibre Channel, certain amendments had to be made to the 802.x standards. The so-modified Ethernet is often referred to as Data Centre Ethernet (DCE) or Converged Enhanced Ethernet (CEE). In a way, FCoE competes with established Fibre Channel and with emerging protocols such as iSCSI or even SRP to become the storage solution of the future. History has shown that Ethernet is very resilient and versatile, so it might well win the battle for the unified connection, if implemented correctly. By unified I mean carrying both network and storage traffic. I was told that the next generation UCS will not have dedicated Fibre Channel ports in the Fabric switches; all ports will be unified. All you need is a small SFP to attach a fibre cable or 10G Ethernet.
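To make the "FC payloads in Ethernet frames" idea a little more concrete, here is a minimal Python sketch of the encapsulation. It is deliberately simplified and not a full implementation: the real FCoE header also carries version, reserved and start/end-of-frame fields, which are omitted here, and the function names are made up for illustration. The EtherType 0x8906 is the value registered for FCoE, and 2112 bytes is the maximum Fibre Channel data field.

```python
import struct

FCOE_ETHERTYPE = 0x8906        # EtherType registered for FCoE
STANDARD_ETHERNET_MTU = 1500   # default Ethernet payload limit in bytes
FC_HEADER_LEN = 24             # Fibre Channel frame header
FC_MAX_DATA_FIELD = 2112       # largest Fibre Channel data field
FC_CRC_LEN = 4                 # Fibre Channel CRC


def encapsulate(dst_mac: bytes, src_mac: bytes, fc_frame: bytes) -> bytes:
    """Wrap an opaque FC frame in a (simplified) Ethernet frame.

    Hypothetical helper: omits the FCoE version/SOF/EOF fields and the FCS.
    """
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes long")
    return dst_mac + src_mac + struct.pack("!H", FCOE_ETHERTYPE) + fc_frame


def decapsulate(frame: bytes) -> bytes:
    """Return the FC frame carried in a frame built by encapsulate()."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != FCOE_ETHERTYPE:
        raise ValueError("not an FCoE frame")
    return frame[14:]


def needs_jumbo_frames(fc_frame: bytes) -> bool:
    """True if the FC frame does not fit into a standard Ethernet payload."""
    return len(fc_frame) > STANDARD_ETHERNET_MTU


if __name__ == "__main__":
    full_size = bytes(FC_HEADER_LEN + FC_MAX_DATA_FIELD + FC_CRC_LEN)
    frame = encapsulate(b"\x0e" * 6, b"\x0f" * 6, full_size)
    print(len(decapsulate(frame)), needs_jumbo_frames(full_size))
```

A full-size Fibre Channel frame does not fit into a standard 1500-byte Ethernet payload, which is one of the reasons plain Ethernet is not enough for FCoE. The other is that Fibre Channel expects a lossless transport.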
The Fabric Interconnects in the current version use traditional but aggregated 8 Gbit/s Fibre Channel links to connect to the storage.
UCS introduces the idea of a service profile. This is probably the biggest differentiator between it and other blade solutions. A blade in the enclosure can take any role and configuration you assign to it. It took me a little while to understand this, but an analogy helped: think of a blade as something configurable, much like a VM. Before you can put something on it, you first have to define it. Amongst the things you set are the boot order (SAN boot is highly recommended, we’ll see why shortly), which VSAN to use, which VNICs to use in which VLAN, and so on. Instead of having to provide the same information over and over again, it’s possible to define pools and templates to draw this information from.
Technicalities set aside, once you define a service profile (let’s assume a RAC node for example), you assign this profile to a blade that’s in the enclosure. A few seconds later, you’ll see the blade boot from the storage array and you are done. If the SAN LUNs don’t contain a bootable operating system, you can use the KVM to install one.
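The relationship between pools, templates, service profiles and blades can be sketched as a small data model. This is only an illustration under stated assumptions: every class, field and value below is hypothetical and has nothing to do with the actual UCS Manager object model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IdentityPool:
    """Hypothetical pool of identities, e.g. MAC addresses or WWPNs."""
    name: str
    free: list[str]

    def allocate(self) -> str:
        if not self.free:
            raise RuntimeError(f"pool {self.name!r} is exhausted")
        return self.free.pop(0)


@dataclass
class ServiceProfile:
    """Everything that defines a server, independent of the hardware."""
    name: str
    boot_order: list[str]
    vsan: int
    vnics: dict[int, str]  # VLAN id -> MAC address of the vNIC
    wwpn: str


@dataclass
class ProfileTemplate:
    """Settings shared by all profiles of one kind; identities come from pools."""
    boot_order: list[str]
    vsan: int
    vlans: list[int]
    mac_pool: IdentityPool
    wwpn_pool: IdentityPool

    def instantiate(self, name: str) -> ServiceProfile:
        return ServiceProfile(
            name=name,
            boot_order=list(self.boot_order),
            vsan=self.vsan,
            vnics={vlan: self.mac_pool.allocate() for vlan in self.vlans},
            wwpn=self.wwpn_pool.allocate(),
        )


@dataclass
class Blade:
    slot: int
    profile: Optional[ServiceProfile] = None

    def associate(self, profile: ServiceProfile) -> None:
        if self.profile is not None:
            raise RuntimeError(f"blade in slot {self.slot} already has a profile")
        self.profile = profile

    def disassociate(self) -> ServiceProfile:
        if self.profile is None:
            raise RuntimeError(f"blade in slot {self.slot} has no profile")
        profile, self.profile = self.profile, None
        return profile


if __name__ == "__main__":
    macs = IdentityPool("macs", ["00:25:b5:00:00:01", "00:25:b5:00:00:02"])
    wwpns = IdentityPool("wwpns", ["20:00:00:25:b5:00:00:01"])
    template = ProfileTemplate(["san", "lan"], 100, [10, 20], macs, wwpns)

    rac1 = template.instantiate("rac-node-1")
    blade1, blade2 = Blade(slot=1), Blade(slot=2)
    blade1.associate(rac1)
    blade2.associate(blade1.disassociate())  # move the profile to another blade
    print(blade2.profile.name, blade2.profile.wwpn)
```

In this sketch the identities belong to the profile rather than to the blade. When a profile moves to another blade, the WWPN moves with it, so the new blade would see the same SAN LUNs. That is presumably a large part of why SAN boot is recommended.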
Another nice thing, I think, is the use of 10G Ethernet throughout. The two switches do not operate in spanning tree mode, which would limit the uplink speed to 10G (a single active path).
There is obviously more, but I think this blog post has become longer than it should be. I might blog more about the system at a later stage, but not before adding this final section:
The question that immediately springs to mind is: how does it compare to Exadata? Is it Exadata competition? Well, probably not. UCS is a blade system, but it doesn’t feature Infiniband or the zero copy/iDB protocol. It doesn’t come with its own more or less directly attached storage subsystem. It can’t do smart scans, create storage indexes, or do other cell offloading. It can’t do EHCC. All of these are exclusive to Exadata.
This can be either good or bad for you, and the answer is of course “it depends”. If your workload is highly geared towards DSS and warehousing in general, and you have the requirement to go through terabytes of data quickly, then Exadata probably is the answer.
If you are in need of consolidating, say, your SPARC hardware on x86, a trend I see a lot, then you may not need Exadata, and in fact you might be better off waiting for Exadata to mature some more if you are really after it. UCS scores many points by not breaking completely with traditional data centre operations: you still use a storage array you connect the blades to. This makes provisioning of a test database as simple as cloning the production LUNs. Admittedly you get a lot of throughput from IPoIB as Exadata uses it, but I doubt that an RMAN duplicate from active database is faster than creating a clone on the storage array. UCS also allows you to use storage level replication such as SRDF or Continuous Access (or whichever name other vendors give it).
In summary, UCS is a well-engineered blade system using many interesting technologies. I am especially looking forward to FCoE multi-hop, which should be available with UCS 2. Imagine the I/O bandwidth you could get with all these aggregated links!
Am I an expert on UCS? Nope, not by a long shot. So it could be that certain things described in this blog post are inaccurate. If so, please use the comment option below to make me aware of them!