In part I of my recent blog series on Linux hugepages and modern Oracle releases I closed the post by saying that future installments would materialize if I found any pitfalls. I don’t like to blog about bugs, but in cases where there is little material on the matter provided elsewhere I think it adds value. First, however, I’d like to offer links to parts I and II in the series:
The pitfall I’d like to bring to readers’ attention is a situation that can arise in the case where the Oracle Database 11g Release 2 184.108.40.206 parameter use_large_pages is set to “only” thus forcing the instance to either successfully allocate all shared memory from the hugepages pool or fail to boot. As I pointed out in parts I and II this is a great feature. However, after an instance is booted it stands to reason that other processes (e.g., Oracle instances) may in fact use hugepages thus drawing down the amount of free hugepages. In fact, it stands to reason that other uses of hugepages could totally deplete the hugepages pool.
So what happens to a running instance that successfully allocated its shared memory from the hugepages pool and hugepages are later externally drawn down? The answer is nothing. An instance can plod along just fine after instance startup even if hugepages continue to get drawn down to the point of total depletion. But is that the end of the story?
What Goes Up, Must (be able to) Come Down
OK, so for anyone that finds themselves in a situation where an instance is up and happy but HugePages_Free is zero the following is what to expect:
$ sqlplus '/ as sysdba' SQL*Plus: Release 220.127.116.11.0 Production on Wed Sep 29 17:32:32 2010 Copyright (c) 1982, 2010, Oracle. All rights reserved. Connected to an idle instance. SQL> SQL> HOST grep -i huge /proc/meminfo HugePages_Total: 4663 HugePages_Free: 0 HugePages_Rsvd: 10 Hugepagesize: 2048 kB SQL> shutdown immediate ORA-01034: ORACLE not available ORA-27102: out of memory Linux-x86_64 Error: 12: Cannot allocate memory Additional information: 1 Additional information: 6422533 SQL>
Pay particular attention to the fact that sqlplus is telling us that it is attached to an idle instance! I assure you, this is erroneous. The instance is indeed up.
Yes, this is bug 10159556 (I filed it for what it is worth). The solution is to have ample hugepages as opposed to precisely enough. Note, in another shell a privileged user can dynamically allocate more hugepages (even a single hugepage) and the instance will be then able to be shutdown cleanly. As an aside, an instance in this situation can be shutdown with abort. I don’t aim to insinuate that this is some sort of zombie instance that will not go away.
Filed under: oracle