I really enjoy being a technical guy. So far in my career I’ve made development choices favoring a technical path over other options. It’s been a great ride – I’ve worked in small teams and large teams; consulting roles and in-house roles; architecture/engineering roles and operations roles; big databases and little databases; environments with a few databases and environments with thousands of databases.
I’ve never been anywhere that had everything right. Also, my own ideas about what’s “right” are still evolving today. I have a habit of trying to be around people who are smarter than me… over the course of my career, my own knowledge and experience with Oracle have grown exponentially, and yet I’ve never had trouble continuing to feel like a junior DBA. (In particular, my invitation to the Oak Table made this very easy!)
With that long introduction, here’s what I’m up to now: I’m going to write a series of articles about Operationally Scalable Practices for Oracle. I’ll use my train commutes and lunch breaks to write a collection of fairly specific “good ideas” that come to mind for managing and operating Oracle Databases. I hope that this stimulates interest and discussion from my peers who face similar challenges. And I hope that it will become a place I can return to later to see how my ideas evolve over time. Or even better – if I get some interesting feedback then it will become a place for collaboratively compiling the best ideas.
I’m drawing a distinction between operations and development activities, and my focus here will be on operations. Think of all the decisions you make when you’re building new systems, for example: how to name things, what directory structure to use, and values for OS and DB configuration parameters. There’s an abundance of resources for the development side of the house (designing and troubleshooting applications or SQL) – like material from Kyte, Lewis, Feuerstein, etc. However I haven’t seen many solid and comprehensive books for operations. As a result, there’s very little shared wisdom for writing organizational processes and standards about naming databases or setting init params. There is some incredibly deep expertise in the marketplace – but right now that knowledge is generally locked within individuals’ memory and organizations’ private powerpoints. Or else it’s available piecemeal on various blogs while there’s no comprehensive compilation. This is a significant gap which hurts younger organizations and newer DBAs. I hope these articles begin to address the need.
This topic will be relevant to any organization that spends time and resources on database operations – not just the big companies. I think you should be reading closely and engaging heavily with these topics if – once a month or more – someone in your organization does any of the following activities:
Let me briefly introduce the concept of “operationally scalable practices”.
Operationally Scalable Practices are guidelines that drive attitude and policy in order for your organization to scale in four key ways:
These articles should provide guidance as you architect standards and processes for your organization. They will not be theoretical; I have worked in person on very large databases and at organizations managing thousands of databases. I’ve been fortunate to meet and work with many brilliant architects and engineers… I’m hoping that I’ll remember some of what I’ve learned! The guidelines in my articles won’t be perfect and there will be room for improvement – but they are based on real experiences of myself and others.
I plan to start with an article about general guidelines, which should establish key principles to underlie the technical discussions and set out a more detailed structure for the content. I will follow that with a number of deeper articles on specific topics.
I consider these articles to be works in progress even after I publish them and I will revise them over time as my ideas evolve or as others point out improvements. I hope you stick around and generously share your thoughts and suggestions!
In fact, if you know any specific areas I should cover then please mention them now!
This is the fourth post in a series on how to get measurements out of the cell server, which is the storage layer of the Oracle Exadata database machine. Up until now, I have looked at the measurement of the kinds of IOs Exadata receives, the latencies of the IOs as done by the cell server, and the mechanism Exadata uses to overcome overloaded CPUs on the cell layer.
This post is about the statistics on the disk devices on the operating system, which the cell server also collects and uses. The disk statistics are ideal to combine with the IO latency statistics.
This is how a dump of the collected statistics (which is called “devio_stats”) is invoked on the cell server, using cellcli:
alter cell events="immediate cellsrv.cellsrv_dump('devio_stats',0)";
This will output the name of the thread-log file, in which the “devio_stats” dump has been made.
This is a quick peek at the statistics this dump provides (first 10 lines):
[IOSTAT] Dump IO device stats for the last 1800 seconds
2013-10-28 04:57:39.679590*: Dump sequence #34:
[IOSTAT] Device - /dev/sda
ServiceTime Latency AverageRQ numReads numWrites DMWG numDmwgPeers numDmwgPeersFl trigerConfine avgSrvcTimeDmwg avgSrvcTimeDmwgFl
0.000000 0.000000 10 0 6 0 0 0 0 0.000000 0.000000
0.111111 0.111111 15 7 38 0 0 0 0 0.000000 0.000000
0.000000 0.000000 8 4 8 0 0 0 0 0.000000 0.000000
0.000000 0.000000 31 0 23 0 0 0 0 0.000000 0.000000
0.000000 0.000000 8 0 1 0 0 0 0 0.000000 0.000000
0.058824 0.058824 25 0 17 0 0 0 0 0.000000 0.000000
etc.
These are the devices for which the cell server keeps statistics:
grep \/dev\/ /opt/oracle/cell220.127.116.11.1_LINUX.X64_130109/log/diag/asm/cell/enkcel01/trace/svtrc_15737_85.trc
[IOSTAT] Device - /dev/sda
[IOSTAT] Device - /dev/sda3
[IOSTAT] Device - /dev/sdb
[IOSTAT] Device - /dev/sdb3
[IOSTAT] Device - /dev/sdc
[IOSTAT] Device - /dev/sde
[IOSTAT] Device - /dev/sdd
[IOSTAT] Device - /dev/sdf
[IOSTAT] Device - /dev/sdg
[IOSTAT] Device - /dev/sdh
[IOSTAT] Device - /dev/sdi
[IOSTAT] Device - /dev/sdj
[IOSTAT] Device - /dev/sdk
[IOSTAT] Device - /dev/sdl
[IOSTAT] Device - /dev/sdm
[IOSTAT] Device - /dev/sdn
[IOSTAT] Device - /dev/sdo
[IOSTAT] Device - /dev/sdp
[IOSTAT] Device - /dev/sdq
[IOSTAT] Device - /dev/sdr
[IOSTAT] Device - /dev/sds
[IOSTAT] Device - /dev/st
[IOSTAT] Device - /dev/sdu
What is of interest here is that if the cell disk is allocated inside a partition instead of the whole disk, the cell server keeps statistics on both the entire device (/dev/sda, /dev/sdb) and the partition (/dev/sda3, /dev/sdb3). Also, statistics are kept on both the rotating disks and the flash disks, as you would expect.
When looking at the “devio_stats” dump, a few other things are worth noticing. The lines with statistics carry no timestamp or other time indicator; they are statistics only. The lines are displayed per device, with the newest line on top. The dump indicates it contains the IO device statistics the cell keeps for the last 1800 seconds (30 minutes). However, if you count the number of lines the cell server (apparently) keeps, the count is 599, not 1800. Dividing the time window by the number of samples suggests the cell takes a device statistics snapshot roughly every 3 seconds. The cell server picks up the disk statistics from /proc/diskstats. Note also that the cell measures the difference between two points in time, which means the numbers are averages over a 3-second period.
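As a sketch of what this kind of delta-based sampling looks like, here is a small NodeJS-style snippet that parses two /proc/diskstats snapshots and turns the differences into per-second rates. The function names are my own, not anything from the cell server code; the field positions follow the documented /proc/diskstats layout.

```javascript
// Sketch: compute per-interval I/O deltas from two /proc/diskstats samples,
// similar in spirit to the differencing the cell server appears to do every ~3s.
// /proc/diskstats fields: major minor devname, then (index after devname):
//   [0] reads completed, [3] ms spent reading, [4] writes completed, [7] ms writing

function parseDiskstats(text) {
  const stats = {};
  for (const line of text.trim().split('\n')) {
    const f = line.trim().split(/\s+/);
    // f[2] is the device name; f[3] reads completed; f[7] writes completed
    stats[f[2]] = { reads: Number(f[3]), writes: Number(f[7]) };
  }
  return stats;
}

// Difference two snapshots taken `intervalSec` apart into per-second rates.
function ioRates(prev, curr, intervalSec) {
  const rates = {};
  for (const dev of Object.keys(curr)) {
    if (!(dev in prev)) continue;
    rates[dev] = {
      readsPerSec: (curr[dev].reads - prev[dev].reads) / intervalSec,
      writesPerSec: (curr[dev].writes - prev[dev].writes) / intervalSec,
    };
  }
  return rates;
}
```

Because the counters in /proc/diskstats are cumulative since boot, any per-interval figure has to be computed this way, which is why the dump's numbers are averages over the sampling period rather than instantaneous values.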
Two other columns in the statistics deserve a mention. The first is ‘trigerConfine’ (which probably should be “triggerConfine”), a mechanism Oracle uses to manage underperforming disks.
The other is “DMWG”. As far as I am aware, DMWG stands for “Disk Media Working Group”, and it works with the concept of peers.
To get a better understanding of the difference between the ServiceTime and Latency columns, see this excellent writeup on IO statistics from Bart Sjerps. ServiceTime corresponds to iostat’s svctm (what Bart calls “storage wait”), and Latency to await (Bart’s “host wait”).
Just a quick plug for an upcoming OTN tour that I’m not on this year.
I attended the OTN Asia Pacific Tour in 2011, which included some of these locations. Hopefully I will get to do it again in the near future.
Like all conferences, these events are all about the attendees, so the more people that turn up, and the more vocal the attendees, the better the events are.
I’ve seen the agendas for the Auckland and Perth events and they look cool. I hope everyone has a good time!
The fundamental underpinning of the math and costs under mandated universal care NOT paid for by the government is that people who have continuously paid for insurance are now pooled with folks who previously rolled the dice on the costs of chronic ailments. There is no reward for having paid to not take the risk over a long period, and we are now burdened directly with the cost of the losers.

Now, historically the culture in this country is that buying insurance is an individual risk choice. Don’t buy fire insurance? If your house does not burn down, big win. Most people choose not to take that risk, and if you have a mortgage the bank makes the safe choice for you, because its experience is based on a pretty good proxy for the entire risk pool. It would be entirely possible to craft a system that rewarded people for their past contributions to the shared risk, but that was not done. In the name of helping the poor, the ACA has taken advantage of and done damage to those who have previously paid for insurance.

We would be far better off to deal directly with ending poverty for those legally resident in the US with something like the Friedman-Moynihan-Nixon plan, while allowing a choice of whether to buy insurance. Or, in addition to a F-M-N negative income tax plan, we could just pay for “A and E” (accident and emergency) care at the federal level, with a credit over time to the folks who previously paid for insurance, and let insurance for chronic problems remain a choice. Of course the most expensive choice is universal coverage of both “A and E” and chronic problems. There is zero chance we can meet everyone’s every need until we routinely have “Star Trek” era technology, so don’t hold your breath, and plan for rationed care if you want that option.
Just like phasing out the Ponzi scheme of social security, we should pay off our obligations to those with sunk cost in the prior system when we completely change the rules of the “game.” This just in from the feds: The reason exchange coverage in Vermont is so high is lack of competition. Sigh. No kidding.
After successfully upgrading two laptops to Windows 8.1, Captain Support flew back to his secret server room and continued to monitor the world’s communications, waiting for the next opportunity to allow mere mortals to witness his greatness. That opportunity came when his sister-in-law emailed to say that his nephew’s football academy website was not working properly…
Having read some reports about broken websites, Captain Support assumed the problem was because of Internet Explorer 11, so he used LogMeIn to connect to his sister-in-law’s laptop, installed Firefox and tested the website. It worked perfectly. The site also worked fine using Chrome.
Having saved the day yet again, Captain Support carried on with the business of saving the rest of the world by turning things off and on again…
After getting back from the OTN Nordic Tour 2013, I figured it was time to give OS X Mavericks a go.
I’m currently using a MacBook Pro (13-inch, Mid 2009). It’s a little long in the tooth, but it has 8G RAM and a 256G SSD, so it still performs pretty well. At least well enough for me not to replace it just yet.
The download took about 30 minutes. I guess I’m a little behind the curve here because lots of people complained about the download times. It pays to hold off for a few days. The installation took about the same amount of time too, so after about an hour I had Mavericks up and running.
Several people reported really slow performance after the upgrade. So far it looks pretty much the same to me.
I had already read Jason Arneil‘s article about VirtualBox 4.3 on OS X Mavericks, which saved me a lot of time. I can’t live without VirtualBox, so any OS that can’t run it is out of the window for me. I had similar issues to those he saw and fixed them in the same way. Thanks Jason!
So now everything is running as normal. If anything scary jumps out I will report…
The Hyperion Cantos is essentially two stories. The first one split over the first two books and the second split over books three and four. The two stories are separated by about 300 years, but there are some links and even common characters. Throughout the books the characters and scenarios were consistently interesting, but the books themselves were not always so consistently good to read. The Rise of Endymion is a good example of that. There are some totally excellent sections of the book and some that could just do with being cut completely. There was a section describing the mountain ranges of a planet and I just found myself thinking, “WTF is the author expecting readers to think here? It’s a string of made up names for mountains that don’t exist. What a waste of words…”
Despite the issues, I was extremely interested to see how things turned out. Who lived, who died, did the Pax/Church get exposed and overthrown… In that sense, the book delivered very well.
On reflection, the series reminds me a lot of the Dune series. A combination of exceptional high points and some rather lacklustre sections that test your loyalty. Both series are well worth the effort though…
A couple of weeks ago I discovered NodeJS, and I decided to come back to my old project, this time using this lightweight server as the backend. I couldn't find a NodeJS database driver for Oracle, but I decided to present performance data using DBMS_EPG and PL/SQL procedures. Data are taken from OraSASH (but it could be used for ASH/AWR as well) and encoded as JSON. The browser uses AJAX to call NodeJS, and NodeJS connects to the Oracle DB to fetch the required data.
This project is at a very early stage, so stay tuned - you can find the initial code here
Here are some screenshots from the VISASH project:
One instance view
Two instances view