Oakies Blog Aggregator


“A beginning is the time for taking the most delicate care that the balances are correct.”

It is spring. Time for planting new seeds. I started on a new job last week, and it seems that a few of my friends and former colleagues are on their way to new adventures as well. I’m especially excited because I’m starting not just a new job – I will be working on a new product, far younger than Oracle and even MySQL. I am also making my first tiny steps in the open-source community, something I’ve been looking to do for a while.

I’m itching to share lessons I’ve learned in my previous job, three challenging and rewarding years as a consultant. The time will arrive for those, but now is the time to share what I know about starting new jobs. Lessons that I need to recall, and that my friends who are also in the process of starting a new job may want to hear.

Say hello
I’m usually a very friendly person and after years of attending conferences I’m very comfortable talking to people I’ve never met before. But still, Cloudera has around 200 people in the Bay Area offices, which means that I had to say “Hello, I’m Gwen Shapira the new Solutions Architect, who are you?” around 200 times. This is not the most comfortable feeling in the world. It’s important to go through the majority of the introductions in the first week or two; later on it becomes a bit more awkward. So in the first week it will certainly seem like you are doing nothing except meeting people, chatting a bit and frantically memorizing names and faces. This is perfectly OK.

Get comfortable being unproductive
The first week in a new job feels remarkably unproductive. This is normal. I’m getting to know people, processes, culture, about 20 new products and 40 new APIs. I have incredibly high expectations of myself, and naturally I’m not as fast installing a Hadoop cluster as I am installing a RAC cluster. It takes me far longer to write Python code than it does to write SQL. My expectations create a lot of pressure: I internally yell at myself for taking an hour or so to load data into Hive when it “should” have taken 5 minutes. But of course, I don’t know how long it “should” take, because I have done it only a few times before. I’m learning, and while learning has its own pace, it is an investment and therefore productive.

Have lunch, share drinks
The best way to learn about culture is from people, and the best way to learn about products is from the developers who wrote them and are passionate about how they are used. Conversations at lunch time are better than tackling people in the corridor or interrupting them at their desk. Inviting people for drinks is also a great way to learn about a product. Going to someone’s cube and asking for an in-depth explanation of Hive architecture can be seen as entitled and bothersome. Sending email to the internal Hive mailing list and saying “I’ll buy beer for anyone who can explain Hive architecture to me” will result in a fun evening.

If it’s not overwhelming, you may be in the wrong job
I’m overwhelmed right now. So many new things to learn. First there are the Hadoop ecosystem products, I know some but far from all of them, and I feel that I need to learn everything in days. Then there is programming. I can code, but I’m not and never have been a proficient programmer. My colleagues are sending out patches left and right. It also seems like everyone around me is a machine learning expert. When did they learn all this? I feel like I will never catch up.

And that is exactly how I like it.

Make as many mistakes as possible
You can learn faster by doing, and you can do faster if you are not afraid of failing and making mistakes. Mistakes are more understandable and forgivable when you are new. I suggest using this window of opportunity to accelerate your learning by trying to do as much as possible. When you make a mistake, smile and say “Sorry about that. I’m still new. Now I know what I should and shouldn’t do.”

Take notes
When you are new, a lot of things will look stupid. Sometimes that is just because they are very different from the way you were used to doing things in a previous job. Don’t give in to the temptation to criticise everything, because you will look like a whiner. No one likes a whiner. But take note of them, because you will get used to them soon and never see things with “beginner mind” again. In a few months take a look at your list; if things still look stupid, it will be time to take on a project or two to fix them.

I may be new at this specific job, but I still have a lot to contribute. I try hard to look for opportunities and I keep finding out that I’m more useful than I thought. I participate in discussions in internal mailing lists, I make suggestions, I help colleagues solve problems. I participate in interviews and file tickets when our products don’t work as expected. I don’t wait to be handed work or to be sent to a customer, I look for places where I can be of use.

I don’t change jobs often. So it’s quite possible that I don’t know everything there is to know about starting a new job. If you have tips and suggestions to share with me and my readers, please comment!

Chapterhouse: Dune

Chapterhouse: Dune is the last in the Dune series by Frank Herbert.

It’s really hard for me to make a judgement about Chapterhouse: Dune. On the one hand there are some excellent characters and the general story line is great. On the other, there are parts I found really boring. I got a bit sick of the teasers without any explanation. At first it was intriguing, but as they continued I just got a bit fed up with them and decided to stop second guessing the outcome and just let it happen. I think there are two ways an author can play this game:

1) Make the outcome fairly obvious from the start, but make the journey to get there exciting. Kind of like The Dresden Files.
2) Make the outcome a mystery, but subtly lead you in the right direction.

I think this book is trying to do the latter, but is quite clumsy about it. Having said all that, I’m glad I read it. The overall outcome is more than satisfactory.

I’m not going to read the books by Frank Herbert’s son. I’ve been told they are not good, and the brief snippets I’ve read seem to reinforce that.

I guess the end of a series of books like this needs a bit of a summary. I think the first book is a total classic. The rest you can take or leave. There are definitely interesting elements to all of them, but they are not nearly as accomplished as the first.



Chapterhouse: Dune was first posted on May 19, 2013 at 3:15 pm.
©2012 "The ORACLE-BASE Blog". Use of this feed is for personal non-commercial use only. If you are not reading this article in your feed reader, then the site is guilty of copyright infringement.

BGOUG Spring 2013 : Day 2…

How do you want to start the day? I’m guessing it’s not to be called out to the front of the room by a speaker and used as a guinea pig, while they ask you trick questions to make you look stupid. Tom Kyte, you will pay. Oh yes! You will pay!!!

The sessions I attended on day 2 were:

  • Tom Kyte : What’s new in Oracle database application development
  • Tim Hall (me) : A cure for Virtual Insanity : A vendor-neutral introduction to virtualization without the hype
  • Georgi Kodinov : Quick Dive into MySQL
  • Tim Hall (me) : From Zero to Hero : Using an assortment of caching techniques to improve performance of SQL containing PL/SQL calls
  • Husnu Sensoy : ZFS Storage can backup your Exadata
  • Tom Kyte : 5 SQL and PL/SQL things in the Latest Generation of Database Technology

Another very useful day indeed. I had some good feedback and interesting questions about my talks. This sort of feedback is really important when you are presenting regularly, as it allows you to continuously refine your material and presenting skills. It can sometimes give you a fresh perspective on a subject that inspires you to alter the focus of your presentations entirely. I’m very grateful to anyone who takes the time to provide this sort of feedback. Big thanks to Tom Kyte, who has given me some very useful advice over the last couple of days, but then he owes me for making me look stupid in his first session of the day! :)

In the evening we went out for dinner at a restaurant just down the road from the hotel. I ate plenty of cheese, so I was in heaven. Not surprisingly, much of the talk ended up being about Oracle. It may seem a little sad to some people, but when I’m surrounded by people with brains the size of a planet, I can’t help myself quizzing them about this stuff. I love it! :)

Great big thanks go out to Milena and her gang for organizing this event and inviting me. Thanks also to Stoyan for being my driver again. No offence to other user groups, but BGOUG conferences are my favorite events of the year. I will keep coming back as long as you will have me! Also, a big thank you to the Oracle ACE program for making this possible.



BGOUG Spring 2013 : Day 2… was first posted on May 19, 2013 at 3:10 pm.

BGOUG Spring 2013 : Day 1 (part 2)…

So Day 1 (part 2) didn’t go to plan because I forgot to take my camera or my phone to the party. :)

Suffice to say, lots of food, lots of drink (for those that do) and most importantly lots of dancing. Yes, I once again murdered the traditional dances of Bulgaria, but it’s the taking part that counts, right? :)

I had good intentions of leaving early, but I ended up chatting about Oracle until about 02:00. Day 2 is going to be tough… :)



BGOUG Spring 2013 : Day 1 (part 2)… was first posted on May 19, 2013 at 3:05 pm.

BGOUG Spring 2013 : Day 1 (part 1)…

Last night we all got together to eat some food and chat. Julian Dontcheff is practically a savant where Bulgarian Poetry, World Cup match results and random Oracle facts are concerned. Although Christian Antognini was pretty impressive on the random Oracle facts too. :)

I didn’t have any presentations today, so I got to sit and watch. :) I’ve done loads of typing, mostly of syntax for 12c features, but it’s not really stuff that is worth posting, because I have no way to validate it, so I’m just going to keep it as a reminder for when I get hold of 12c and can try it out.

The sessions I went to included:

  • Joze Senegacnik : Is my SQL Statement Using Exadata Features
  • Christian Antognini : SQL New Features in the latest generation of Oracle Database
  • Julian Dontcheff : Upgrading to the latest generation of database technology
  • Christian Antognini : How the Query Optimizer Learns from its Mistakes
  • Clive King : Solaris 11u1 performance and stability : features and frameworks
  • Tom Kyte : Tom’s Top 12 Things about the Latest Generation of Database Technology

There was a lot of material I had seen at OOW2012 and UKOUG2012, but also a lot I had not, so I’m glad I went to them. The smaller setting also made it easier to ask questions, which can be quite daunting at the big events. :)

Tom gave me a couple of tips that have gone straight into one of my talks for tomorrow. I’m gonna have to name check him for it, or I’ll feel like I’m passing it off as my own. :)

I said this after OOW2012 and I’m sure I will say it again, but the number of changes in 12c is pretty daunting. I guess the fact it’s been about a 3 year wait, rather than the normal 18 months, adds to that. In many cases (but not all) it’s not the scope of the individual changes that is the issue, but the sheer volume of them. I think people are going to be blogging for a long time before they’ve got through them. It will be interesting to see what gets selected for inclusion in the OCP DBA upgrade exam. :)

I’m off to dinner now. I will try to get some photos and post them in “Day 1 (part 2)” tomorrow. :)



BGOUG Spring 2013 : Day 1 (part 1)… was first posted on May 17, 2013 at 6:41 pm.



The call for papers for UKOUG Tech 13 closed yesterday – though you may still be able to get something in if you visit the website before Monday. I’ve submitted a total of five – two of them aimed at novice DBAs and trouble-shooters – so I’ll probably get a couple of slots; and Tony Hasler has linked my name with a “round table”  (CBO – what else ?) and a “weird and whacky” which will be a reprise of the debate we had a couple of years ago about Oracle’s treatment of hints.

More importantly, for the immediate future, though – I’ll be speaking at the OUG Scotland conference which is only a few days away (12th June). It’s a one-day event with 6 concurrent streams of information covering DBA, Developer, APEX, BI, Apps and Cloud – so something for everyone.

The full agenda is here, and (free) registration details here. I’ll be talking about compression, and I’m particularly interested to hear what Julian Dyke will have to say about sorting (and other PGA hogs). I will have a problem in the afternoon, though – torn between heckling Doug Burns as he talks about the 10053 trace, or going to learn more about the (rarely used) model clause from Tony Hasler.

“alter session force parallel query”, and indexes

This post is a brief discussion about the advantages of activating parallelism by altering the session environment instead of using the alternative ways (hints, DDL). The latter ways are the most popular in my experience, but I have noticed that their popularity is quite frequently due more to imperfect understanding than to informed decision - and that's a pity since "alter session force parallel query" can really save everyone a lot of tedious work and improve maintainability a great deal.

We will also check that issuing

alter session force parallel query parallel N;

is the same as specifying the hints

/*+ parallel (t,N)  */
/*+ parallel_index (t, t_idx, N) */

for all tables referenced in the query, and for all indexes defined on them (the former is quite obvious, the latter not that much).

Side note: it is worth remembering that hinting the table for parallelism does not cascade automatically to its indexes as well - you must explicitly specify the indexes that you want to be accessed in parallel by using the separate parallel_index hint (maybe specifying "all indexes" by using the two-parameter variant "parallel_index(t,N)"). The same holds for "alter table parallel N" and "alter index parallel N", of course.

the power of "force parallel query"

I've rarely found any reason to avoid parallel index operations nowadays - usually both the tables and their indexes are stored on disks with the same performance figures (if not the same set of disks altogether), and the cost of the initial segment checkpoint is not generally different. On the contrary, using an index can offer terrific opportunities for speeding up queries, especially when a full table scan can be replaced by a fast full scan on a (perhaps much) smaller index.

Thus, I almost always let the CBO consider index parallelism as well. Three methods can be used:
- statement hints (the most popular option)
- alter table/index parallel N
- "force parallel query".

I rather hate injecting parallel hints everywhere in my statements since it is very risky. It is far too easy to forget to specify a table or index (or simply misspell one), not to mention missing potentially good new indexes added after the statement was finalized. Also, you must change the statement even if you simply want to change the degree of parallelism, perhaps just because you are moving from an underequipped, humble and cheap test environment to a mighty production server. By contrast, "force parallel query" is simple and elegant - just a quick command and you're done, with a single place to touch in order to change the parallel degree.

"alter table/index parallel N" is another weak technique as well in my opinion, mainly for two reasons. The first one is that it is a permanent modification to the database objects, and after the query has finished, it is far too easy to fail to revert the objects back to their original degree setting (because of failure or coding bug). The second one is the risk of two concurrent sessions colliding on the same object that they both want to read, but with different degrees of parallelism.
Neither of these problems applies if you always want to run with a fixed degree for all statements; but even in this case, I would consider issuing "force parallel query" (maybe inside a logon trigger) instead of having to set/change the degree for all tables/indexes accessed by the application.

I have noticed that many people are afraid of "force parallel query" because of the word "force", believing that it switches every statement into parallel mode. But this is not the case: as Tanel Poder recently illustrated, the phrase "force parallel query" is misleading; a better one would be something like "consider parallel query", since it is perfectly equivalent to hinting the statement for parallelism as far as I can tell (see below). And hinting itself tells the CBO to consider parallelism in addition to serial execution; the CBO is perfectly free to choose a serial execution plan if it estimates that it will cost less - as demonstrated by Jonathan Lewis years ago.
Hence there's no reason to be afraid, for example, that a nice Index Range Scan that selects just one row might turn into a massively inefficient Full Table Scan (or index Fast Full Scan) of a one million row table/index. That holds barring bugs and CBO limitations, obviously; but in these hopefully rare circumstances, one can always use the no_parallel and no_parallel_index hints to fix the issue.

"force parallel query" and hinting: test case

Let's show that altering the session is equivalent to hinting. I will illustrate the simplest case only - a single-table statement that can be resolved either by a full table scan or an index fast full scan (check script force_parallel_main.sql in the test case), but in the test case zip two other scenarios (a join and a subquery) are tested as well. Note: I have only checked and (but I would be surprised if the test case could not reproduce in 10g as well).

Table "t" has an index t_idx on column x, and hence the statement

select sum(x) from t;

can be calculated by either scanning the table or the index. In serial, the CBO chooses to scan the smaller index (costs are from

select /* serial */ sum(x) from t;
|Id|Operation             |Name |Cost|
| 0|SELECT STATEMENT      |     | 502|
| 1| SORT AGGREGATE       |     |    |

If we now activate parallelism for the table, but not for the index, the CBO chooses to scan the table:

select /*+ parallel(t,20) */ sum(x) from t
|Id|Operation              |Name    |Cost|
| 0|SELECT STATEMENT       |        | 229|
| 1| SORT AGGREGATE        |        |    |
| 2|  PX COORDINATOR       |        |    |
| 3|   PX SEND QC (RANDOM) |:TQ10000|    |
| 4|    SORT AGGREGATE     |        |    |
| 5|     PX BLOCK ITERATOR |        | 229|
| 6|      TABLE ACCESS FULL|T       | 229|

since the cost for the parallel table access is now down from the serial cost of 4135 (check the test case logs) to the parallel cost 4135 / (0.9 * 20) = 229, thus less than the cost (502) of the serial index access.

Hinting the index as well makes the CBO apply the same scaling factor (0.9*20) to the index as well, and hence we are back to index access:

select /*+ parallel_index(t, t_idx, 20) parallel(t,20) */ sum(x) from t
|Id|Operation                 |Name    |Cost|
| 0|SELECT STATEMENT          |        |  28|
| 1| SORT AGGREGATE           |        |    |
| 2|  PX COORDINATOR          |        |    |
| 3|   PX SEND QC (RANDOM)    |:TQ10000|    |
| 4|    SORT AGGREGATE        |        |    |
| 5|     PX BLOCK ITERATOR    |        |  28|
| 6|      INDEX FAST FULL SCAN|T_IDX   |  28|

Note that the cost computation is 28 = 502 / (0.9 * 20), less than the previous one (229).
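
The scaling arithmetic behind the three plans above can be replayed in a few lines. This is only a sketch of the arithmetic quoted in the post (serial cost divided by 0.9 times the degree), not the CBO's actual costing code. The helper names parallel_cost and cheapest are made up for illustration, and the serial costs 4135 and 502 are the figures quoted from the test case. The plans display whole numbers, so the sketch compares the unrounded values.

```python
# Replay of the cost arithmetic quoted above:
#   parallel cost = serial cost / (0.9 * degree)
# parallel_cost() and cheapest() are hypothetical helpers, not Oracle APIs.

def parallel_cost(serial_cost, degree, efficiency=0.9):
    """Scale a serial access cost by the factor the post observes (0.9 * degree)."""
    return serial_cost / (efficiency * degree)

def cheapest(candidates):
    """Return the (name, cost) pair with the lowest cost."""
    return min(candidates.items(), key=lambda item: item[1])

TABLE_SERIAL = 4135   # serial full table scan cost, from the test case logs
INDEX_SERIAL = 502    # serial index fast full scan cost
DEGREE = 20

# Only the table is hinted: the index can still only be scanned serially.
table_only = cheapest({
    "table (parallel)": parallel_cost(TABLE_SERIAL, DEGREE),
    "index (serial)": INDEX_SERIAL,
})

# Table and index both hinted (or the session "forced"): both costs are scaled.
both = cheapest({
    "table (parallel)": parallel_cost(TABLE_SERIAL, DEGREE),
    "index (parallel)": parallel_cost(INDEX_SERIAL, DEGREE),
})

print(table_only)
print(both)
```

With only the table hinted the scaled table scan undercuts the serial index scan; once the index is scaled by the same factor it wins again, which is the flip the plans show.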

"Forcing" parallel query:

alter session force parallel query parallel 20;

select /* force parallel query  */ sum(x) from t
|Id|Operation                 |Name    |Cost|
| 0|SELECT STATEMENT          |        |  28|
| 1| SORT AGGREGATE           |        |    |
| 2|  PX COORDINATOR          |        |    |
| 3|   PX SEND QC (RANDOM)    |:TQ10000|    |
| 4|    SORT AGGREGATE        |        |    |
| 5|     PX BLOCK ITERATOR    |        |  28|
| 6|      INDEX FAST FULL SCAN|T_IDX   |  28|

Note that the plan is the same (including costs), as predicted.

Side note: let's verify, just for fun, that the statement can run serially even if the session is "forced" as parallel (note that I have changed the statement since the original always benefits from parallelism):

alter session force parallel query parallel 20;

select /* force parallel query (with no parallel execution) */ sum(x) from t
|Id|Operation         |Name |Cost|
| 0|SELECT STATEMENT  |     |   3|
| 1| SORT AGGREGATE   |     |    |

Side note 2: activation of parallelism for all referenced objects can be obtained, in, using the new statement-level parallel hint (check this note by Randolf Geist for details):

select /*+ parallel(20) */ sum(x) from t
|Id|Operation                 |Name    |Table|Cost|
| 0|SELECT STATEMENT          |        |     |  28|
| 1| SORT AGGREGATE           |        |     |    |
| 2|  PX COORDINATOR          |        |     |    |
| 3|   PX SEND QC (RANDOM)    |:TQ10000|     |    |
| 4|    SORT AGGREGATE        |        |     |    |
| 5|     PX BLOCK ITERATOR    |        |     |  28|
| 6|      INDEX FAST FULL SCAN|T_IDX   |T    |  28|

This greatly simplifies hinting, but of course you must still edit the statement if you need to change the parallel degree.

Linux large pages and non-uniform memory distribution

In my last post about large pages in I promised a little more background information on how large pages and NUMA are related.

Background and some history about processor architecture

For quite some time now the CPUs you get from AMD and Intel have both been NUMA, or better: cache coherent NUMA CPUs. They all have their own “local” memory directly attached to them; in other words, the memory distribution is not uniform across all CPUs. This isn’t really new: Sequent pioneered this concept on x86 a long time ago, but that was in a different context. You really should read Scaling Oracle 8i by James Morle, which has a lot of excellent content related to NUMA in it, with contributions from Kevin Closson. It doesn’t matter that it reads “8i”; most of it is as relevant today as it was then.

So what is the big deal about NUMA architecture anyway? To explain NUMA and why it is important to all of us a little more background information is in order.

Some time ago processor designers and architects of industry standard hardware could no longer ignore the fact that the front side bus (FSB) had proved to be a bottleneck. There were two reasons for this: a) it was too slow and b) too much data had to go over it. As one direct consequence, DRAM has been attached directly to the CPUs. AMD did this first with its Opteron processors in its AMD64 micro architecture, followed by Intel with its Nehalem micro architecture. By removing the requirement for data retrieved from DRAM to travel across a slow bus, latencies could be reduced.

Now imagine that every processor has a number of memory channels to which DDR3 (DDR4 could arrive soon!) SDRAM is attached. In a dual socket system, each socket is responsible for half the memory of the system. To allow the other socket to access the corresponding other half of memory some kind of interconnect between processors is needed. Intel has opted for the QuickPath Interconnect, AMD (and IBM for p-Series) use HyperTransport. This is (comparatively) simple when you have few sockets: with up to 4, each socket can directly connect to every other without any tricks. For 8 sockets it becomes more difficult. If every socket can directly communicate with its peers the system is said to be glue-less, which is beneficial. The last production glue-less system Intel released was based on the Westmere architecture. Sandy Bridge (current until approximately Q3/2013) didn’t have an eight-way glue-less variant, and this is exactly why you get Westmere-EX in the X3-8, and not Sandy Bridge as in the X3-2.

Anyway, your system will have local and remote memory. For most of us, we are not going to notice this at all since there is little point in enabling NUMA on systems with two sockets. Oracle still recommends that you only enable NUMA on 8 way systems, and this is probably the reason the oracle-validated and preinstall RPMs add “numa=off” to the kernel command line in your GRUB boot loader.

Booting with NUMA enabled

The easiest way to boot with NUMA enabled is to get to your ILOM and boot the server. As soon as the GRUB line (“booting … in x seconds”) appears, hit a key. You will be dropped into the GRUB menu. It should highlight the default boot entry (Oracle Linux Server (…x86-64)). Hit the “e” key to edit the directives. You should see something like this now:

root (hd0,0)
kernel / ....
initrd /

Move the cursor to the line starting with kernel, then hit “e” again. The cursor will move to the end of the line, where you will find the numa=off directive. Hit the backspace key to remove numa=off, then hit return (it will bring you back to the previous 3 directives), then “b” to boot this configuration.

This is useful because it doesn’t involve editing the grub menu file, and if something should break you can simply restart and are back in a known good configuration.

Now when you log in as root you will notice that NUMA is turned on!

Signs of NUMA

My lab server is an AMD 6238 dual socket workstation with 32GB of RAM. To see the effect of NUMA, you can make use of the numactl tool:

[root@ol62 ~]# numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5
node 0 size: 8190 MB
node 0 free: 1637 MB
node 1 cpus: 6 7 8 9 10 11
node 1 size: 8192 MB
node 1 free: 1732 MB
node 2 cpus: 12 13 14 15 16 17
node 2 size: 8192 MB
node 2 free: 1800 MB
node 3 cpus: 18 19 20 21 22 23
node 3 size: 8176 MB
node 3 free: 1745 MB
node distances:
node   0   1   2   3
  0:  10  16  16  16
  1:  16  10  16  16
  2:  16  16  10  16
  3:  16  16  16  10

You need to know that, since the 6100 series, Opteron processors report twice as many NUMA nodes as there are sockets. These processors are multi-chip modules: two dies share one package. Each of the sockets has 12 cores, or better: modules. AMD’s processors are somewhere between HyperThreads and cores; to what extent I can’t tell. The server reports 24 CPUs in any case.

My configuration has allocated 12295 large pages at boot time or roughly 24 GB out of 32GB available. You can see how many pages have been allocated per CPU node in the first half of the output. Luckily the memory has been requested evenly across all NUMA nodes.

The second part of the numactl output gives you the node distances in a matrix. The numbers are handed to the operating system at boot time by the firmware in the form of the ACPI System Locality Information Table (SLIT) and cannot be changed. They indicate the relative cost of accessing remote memory. 10 seems to be the base value for this parameter for local access. Higher values indicate more overhead.
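
To make the matrix easier to reason about, here is a small sketch that parses the "node distances" section of captured numactl --hardware output and expresses each remote distance relative to the local one. The function name parse_distances is hypothetical, and the parser assumes the layout shown in the output above. It works on text, so it runs without numactl being installed.

```python
# Parse the "node distances" matrix printed by `numactl --hardware`.
# parse_distances() is a hypothetical helper; the layout it expects is the
# one shown in the sample output above.

SAMPLE = """\
node distances:
node   0   1   2   3
  0:  10  16  16  16
  1:  16  10  16  16
  2:  16  16  10  16
  3:  16  16  16  10
"""

def parse_distances(text):
    """Return {from_node: {to_node: distance}} from numactl --hardware output."""
    lines = text.splitlines()
    start = next(i for i, line in enumerate(lines)
                 if line.strip() == "node distances:")
    # Header row: "node   0   1   2   3" -> column node ids
    columns = [int(n) for n in lines[start + 1].split()[1:]]
    matrix = {}
    for line in lines[start + 2:]:
        if ":" not in line:
            break
        label, values = line.split(":", 1)
        matrix[int(label)] = dict(zip(columns, (int(v) for v in values.split())))
    return matrix

distances = parse_distances(SAMPLE)
for node, row in distances.items():
    local = row[node]
    ratios = {peer: d / local for peer, d in row.items() if peer != node}
    print(f"node {node}: local={local}, remote/local={ratios}")
```

On the lab server every remote node is reported at distance 16 against a local 10. The values are relative weights, not measured latencies.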


The SYS pseudo file system is set to replace the venerable /proc file system. SYSFS exports more information than /proc does, which is apparent when it comes to memory allocation per NUMA node. Per-node NUMA statistics are in /sys/devices/system/node/node*

Two files are of interest, numastat and meminfo. I won’t go into detail for numastat (yet another post will follow), but meminfo is interesting.

[root@ol62 node0]# cat meminfo
Node 0 MemTotal:        8386572 kB
Node 0 MemFree:         1685988 kB
Node 0 MemUsed:         6700584 kB
Node 0 Active:            10516 kB
Node 0 Inactive:          12704 kB
Node 0 Active(anon):       2656 kB
Node 0 Inactive(anon):        0 kB
Node 0 Active(file):       7860 kB
Node 0 Inactive(file):    12704 kB
Node 0 Unevictable:        1172 kB
Node 0 Mlocked:            1172 kB
Node 0 Dirty:                 0 kB
Node 0 Writeback:             0 kB
Node 0 FilePages:         21276 kB
Node 0 Mapped:             2960 kB
Node 0 AnonPages:          3156 kB
Node 0 Shmem:               116 kB
Node 0 KernelStack:        1384 kB
Node 0 PageTables:          528 kB
Node 0 NFS_Unstable:          0 kB
Node 0 Bounce:                0 kB
Node 0 WritebackTmp:          0 kB
Node 0 Slab:              23788 kB
Node 0 SReclaimable:       5652 kB
Node 0 SUnreclaim:        18136 kB
Node 0 AnonHugePages:         0 kB
Node 0 HugePages_Total:  3074
Node 0 HugePages_Free:   3074
Node 0 HugePages_Surp:      0

This file is similar to /proc/meminfo but only relevant for node0, i.e. the first 6 “cores” on my system. Here you can see the large page allocation on this node.
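
The per-node file is easy to summarise programmatically. The sketch below parses an excerpt of the output above and converts the huge page count into megabytes. The helper name parse_node_meminfo is hypothetical, and the 2 MB huge page size (HUGE_PAGE_KB = 2048) is an assumption; check Hugepagesize in /proc/meminfo on your own system.

```python
# Summarise huge page figures from a per-node meminfo file
# (/sys/devices/system/node/node<N>/meminfo).
# parse_node_meminfo() is a hypothetical helper, and HUGE_PAGE_KB = 2048
# is an ASSUMPTION (2 MB huge pages).

HUGE_PAGE_KB = 2048

SAMPLE = """\
Node 0 MemTotal:        8386572 kB
Node 0 MemFree:         1685988 kB
Node 0 HugePages_Total:  3074
Node 0 HugePages_Free:   3074
Node 0 HugePages_Surp:      0
"""

def parse_node_meminfo(text):
    """Return {field: int} for lines shaped like 'Node 0 Field: value [kB]'."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == "Node":
            fields[parts[2].rstrip(":")] = int(parts[3])
    return fields

info = parse_node_meminfo(SAMPLE)
huge_kb = info["HugePages_Total"] * HUGE_PAGE_KB
print(f"huge pages reserved: {huge_kb / 1024:.0f} MB "
      f"of {info['MemTotal'] / 1024:.0f} MB on this node")
print(f"huge pages in use: {info['HugePages_Total'] - info['HugePages_Free']}")
```

Under the 2 MB assumption, 3074 pages come to about 6 GB on node 0, which is consistent with roughly 24 GB of large pages spread evenly over four nodes.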

Why does this matter

When you are consolidating lots of environments onto a system with lots of sockets, you should try to stick to memory locality. Keep instances on a socket if possible; today’s servers can take a lot of memory and you shouldn’t have to use remote memory, thus avoiding latency. I personally would use control groups to ensure my instances stay where I want them to stay. There are other ways to control memory distribution (see some of the SLOB examples) but cgroups are by far the most elegant.

Using NUMA on your system and leaving it to chance how memory is distributed will lead to difficult-to-predict performance. You might even run out of memory on a local node, causing unexpected problems. As with everything, understanding and tuning a configuration is the way to go! I will run a few benchmarks next to demonstrate the difference between local and remote memory access. Unfortunately I don’t have a 4-way system available for these tests - normally you wouldn’t really worry about NUMA settings on fewer than four sockets.
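
As a rough illustration of keeping an instance on one node, the sketch below turns a node's CPU list, as printed by numactl --hardware, into the two values a cpuset control group needs: cpuset.cpus and cpuset.mems. Those two names are the files used by the Linux cpuset controller; everything else (the helper names, the choice of node) is made up for illustration. Nothing is written to the cgroup file system, so this is not a complete setup procedure.

```python
# Build the values for a cpuset control group that confines a workload to a
# single NUMA node. to_ranges() and cpuset_settings() are hypothetical
# helpers; the sketch only computes strings and writes nothing to disk.

# CPU lists per node, copied from the numactl --hardware output above.
NODE_CPUS = {
    0: [0, 1, 2, 3, 4, 5],
    1: [6, 7, 8, 9, 10, 11],
    2: [12, 13, 14, 15, 16, 17],
    3: [18, 19, 20, 21, 22, 23],
}

def to_ranges(cpus):
    """Compress CPU ids into cpuset list syntax, e.g. [0, 1, 2, 4] -> '0-2,4'."""
    cpus = sorted(set(cpus))
    if not cpus:
        return ""
    ranges = []
    start = prev = cpus[0]
    for cpu in cpus[1:]:
        if cpu != prev + 1:          # gap found: close the current range
            ranges.append((start, prev))
            start = cpu
        prev = cpu
    ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)

def cpuset_settings(node, node_cpus):
    """Values for cpuset.cpus and cpuset.mems that keep a group on one node."""
    return {"cpuset.cpus": to_ranges(node_cpus[node]), "cpuset.mems": str(node)}

print(cpuset_settings(1, NODE_CPUS))
```

Setting both values matters: restricting only the CPUs would still allow memory to be allocated on a remote node.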


Don’t go and rush your systems to NUMA! Like I said, there is little to be gained on dual-socket systems, which make up about 80% of all servers out there. Four-way servers might be candidates for NUMA; 8-way servers are candidates. By candidates I mean that I would think of enabling NUMA for a production workload only if you understand NUMA and how it can affect your application, have really load tested it, and it has proved to deliver predictable, stable performance. There is nothing like thorough testing to tell you how your application will perform. I guess all I want to say is that turning on NUMA can have a negative performance impact as well, or even crash your Oracle instance if the memory on a NUMA node is depleted. Search MOS for NUMA to get more information.


Extreme Exadata Expo Speakers Announced

Thanks to everyone that submitted abstracts for our upcoming E4 conference. Unfortunately, there were more quality submissions than we had room for. Maybe next year we should expand the event to 3 days. :) But in the meantime, we have assembled what I believe is an excellent line up of speakers. I’ll just mention a few highlights here:

Tom Kyte will be doing the keynote. Enough said!

Maria Colgan and Roger MacNicol will be doing a 3 hour combined session on smart scans. Maria will attack the topic from the top down (optimizer) point of view (since she is the product manager for the optimizer) and Roger will be attacking it from the bottom up (since he is the lead developer for the smart scan code). This should be an awesome session and Tanel Poder has already said he was going to line up the night before.

Ferhat Sgonul will be talking about Turkcell’s usage of Exadata. Turkcell is one of the earliest adopters of Exadata and has had great success with it over the last several years, so this should be a very interesting case study.

Karl Arao and Tyler Muth will do a joint presentation on visualization techniques for performance data from Exadata environments. The plan is for them to compare and contrast their approaches using the same data set. Tyler usually uses R and Karl likes Tableau – may the best violin chart win.

Tyler Muth will also be doing a deep dive presentation on bloom filters and how they can be offloaded with smart scans. This is a topic about which there is little information, so it should be quite interesting.

Frits Hoogland will be doing a deep dive on how Oracle does multi-block i/o. This is of special interest with regard to Exadata because the direct path mechanism for doing multi-block i/o is a requirement for enabling smart scans. So understanding how it works is one of the keys to getting the most out of the platform.

Sue Lee (product manager for resource manager) will be doing a session on how to deal with mixed workloads. I’m really interested in this session as IORM and DBRM are critical for managing Exadata, particularly when it is used as a consolidation platform.

There are many other well known speakers including Martin Bach, Andy Colvin, Gwen Shapira, Mark Rittman, Tim Fox and Tanel Poder.

Here’s a link to see the complete line up of E4 speakers.

While we’re on the subject, I should mention that there will be several talks on Hadoop-related topics and the increasingly expanding role it is playing in our industry. The idea of pushing the work to the storage is not unique to Exadata. It is also the main driver behind Hadoop. So I’m extremely pleased to announce that Doug Cutting will be speaking at E4 as well.

So that’s all for the marketing related stuff on E4. I hope you can join us in Dallas.

BGOUG Spring 2013 : Day -1

It’s stupid o’clock in the morning and I’m waiting for my taxi to arrive. Considering how close Bulgaria is, it takes me a very long time to get there.

I am a mix of excited and nervous. This is my first conference this year, so all the usual insecurities are in full effect, from fear of flying to the constant nagging thoughts that perhaps I don’t know anything about Oracle and maybe I shouldn’t be on stage acting like I do. :)

I’m sure it will go OK and it will be nice to meet up with the gang again.



BGOUG Spring 2013 : Day -1 was first posted on May 16, 2013 at 4:55 am.