Search

Top 60 Oracle Blogs

Recent comments

Exadata

Update, things of interest and a couple of blogs to check out

tech.E2SN secret hacking session on Tuesday 22nd March:

Just in case you missed it – there’s still chance to sign up to my tomorrow’s ORA-4031 and shared pool hacking session. I initially planned to limit the attendees to 100 per event (as the limited GotoWebinar package is cheaper that way) but over 100 people had signed up for the US event on the day of announcement, even before it was 8am in California, so I figured I should invest a bit more and allow more people attend. So far over 500 people have signed up (total for both events). If you haven’t done so, you can sign up here:

Advanced Oracle Troubleshooting online seminar Deep Dives 1-5  on 11-15 April:

The next AOT deep dives (1-5) will start in 3 weeks, on 11-15 April. (and 6-10 will be on 9-13 May).

Check the details here:

Blogs to check out:

Andrey Nikolaev has done some serious low-level research on Oracle latches and KGX mutexes and he also presented his work this year at Hotsos Symposium (I missed his session as I was stuck in JFK instead of attending the conference on that day):

Porus Havewala is quite a Grid Control and OEM enthusiast. If you are into OEM & GC, check out his blog:

Future events:

  • I will be speaking at the UKOUG Exadata Special Event on 18th April
  • I will announce some more Virtual Conferences pretty soon!!! Very interesting topics and good speakers – including (but not limited to) some serious Exadata technical contents!

Share

Exadata CAN do smart scans on bitmap indexes

As I’m finishing up a performance chapter for the Exadata book (a lot of work!), I thought to take a quick break and write a blog entry.

This is not really worth putting into my Oracle Exadata Performance series (which so far has only 1 article in it anyway) .. so this is a little stand-alone article …

Everybody knows that the Exadata smart scan can be used when scanning tables (and table partitions). You should also know that smart scan can be used with fast full scan on Oracle B-tree indexes (a fast full scan on an index segment is just like a full table scan, only on the index segment (and ignoring branch blocks)).

For some reason there’s a (little) myth circulating that smart scans aren’t used for scanning bitmap indexes.

So, here’s evidence, that smart scan can be used when scanning bitmap indexes:

SQL> select /*+ tanel3 */ count(*) from t1 where owner like '%XYZXYZ%';

...

Plan hash value: 39555139

-----------------------------------------------------------------------------------
| Id  | Operation                             | Name        | E-Rows | Cost (%CPU)|
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |             |        |   505 (100)|
|   1 |  SORT AGGREGATE                       |             |      1 |            |
|   2 |   BITMAP CONVERSION COUNT             |             |    400K|   505   (0)|
|*  3 |    BITMAP INDEX STORAGE FAST FULL SCAN| BI_T1_OWNER |        |            |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - storage(("OWNER" LIKE '%XYZXYZ%' AND "OWNER" IS NOT NULL))
       filter(("OWNER" LIKE '%XYZXYZ%' AND "OWNER" IS NOT NULL))

So, as you see the execution plan sure shows a FAST FULL SCAN on a BITMAP INDEX segment, which happens to be on Exadata STORAGE.

Also, you see a storage() predicate applied on the line 3 of the execution plan, which means that Oracle will attempt to use a smart scan predicate offload – but this can’t always be done!

So, you can’t really determine whether a smart scan happened during execution just by looking into the execution plan, you should really check some V$SESSION statistics too. That’s where my Snapper script becomes handy.

I started Snapper on my session just before running the above query. The “smart table scan” and “smart index scan” performance counters are updated right after Oracle has opened the segment header and determines, from the number of blocks in the segment, whether to call the smart scan codepath or not. In other words, the smart scan counters are inremented in the beginning of the segment scan.

The output is following (some irrelevant counters are stripped for brevity):


@snapper all 5 1 "301"
Sampling SID 301 with interval 5 seconds, taking 1 snapshots...
setting stats to all due to option = all

-- Session Snapper v3.52 by Tanel Poder @ E2SN ( http://tech.e2sn.com )

-------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                                                 ,     HDELTA, HDELTA/SEC,    %TIME, GRAPH
-------------------------------------------------------------------------------------------------------------------------------------
    301, TANEL     , STAT, physical read total IO requests                           ,         13,        2.6,
    301, TANEL     , STAT, physical read total multi block requests                  ,          4,         .8,
    301, TANEL     , STAT, physical read requests optimized                          ,          1,         .2,
    301, TANEL     , STAT, physical read total bytes optimized                       ,      8.19k,      1.64k,
    301, TANEL     , STAT, physical read total bytes                                 ,      4.63M,     925.7k,
    301, TANEL     , STAT, cell physical IO interconnect bytes                       ,     10.02k,         2k,
    301, TANEL     , STAT, physical reads                                            ,        565,        113,
    301, TANEL     , STAT, physical reads cache                                      ,          1,         .2,
    301, TANEL     , STAT, physical reads direct                                     ,        564,      112.8,
    301, TANEL     , STAT, physical read IO requests                                 ,         13,        2.6,
    301, TANEL     , STAT, physical read bytes                                       ,      4.63M,     925.7k,
    301, TANEL     , STAT, db block changes                                          ,          1,         .2,
    301, TANEL     , STAT, cell physical IO bytes eligible for predicate offload     ,      4.62M,    924.06k,
    301, TANEL     , STAT, cell physical IO interconnect bytes returned by smart scan,      1.82k,      364.8,
    301, TANEL     , STAT, cell blocks processed by cache layer                      ,        564,      112.8,
    301, TANEL     , STAT, cell blocks processed by txn layer                        ,        564,      112.8,
    301, TANEL     , STAT, cell blocks processed by index layer                      ,        564,      112.8,
    301, TANEL     , STAT, cell blocks helped by minscn optimization                 ,        564,      112.8,
    301, TANEL     , STAT, cell index scans                                          ,          1,         .2,
    301, TANEL     , STAT, index fast full scans (full)                              ,          1,         .2,
    301, TANEL     , STAT, index fast full scans (direct read)                       ,          1,         .2,
    301, TANEL     , STAT, bytes sent via SQL*Net to client                          ,        334,       66.8,
    301, TANEL     , STAT, bytes received via SQL*Net from client                    ,        298,       59.6,
    301, TANEL     , STAT, SQL*Net roundtrips to/from client                         ,          2,         .4,
    301, TANEL     , STAT, cell flash cache read hits                                ,          1,         .2,
    301, TANEL     , TIME, hard parse elapsed time                                   ,     1.17ms,    233.8us,      .0%, |          |
    301, TANEL     , TIME, parse time elapsed                                        ,      1.5ms,    300.2us,      .0%, |          |
    301, TANEL     , TIME, DB CPU                                                    ,       11ms,      2.2ms,      .2%, |          |
    301, TANEL     , TIME, sql execute elapsed time                                  ,     82.2ms,    16.44ms,     1.6%, |@         |
    301, TANEL     , TIME, DB time                                                   ,    84.36ms,    16.87ms,     1.7%, |@         |
    301, TANEL     , WAIT, enq: KO - fast object checkpoint                          ,    16.18ms,     3.24ms,      .3%, |          |
    301, TANEL     , WAIT, gc cr grant 2-way                                         ,      223us,     44.6us,      .0%, |          |
    301, TANEL     , WAIT, gc current grant 2-way                                    ,      136us,     27.2us,      .0%, |          |
    301, TANEL     , WAIT, cell smart index scan                                     ,    56.04ms,    11.21ms,     1.1%, |@         |
    301, TANEL     , WAIT, SQL*Net message to client                                 ,        7us,      1.4us,      .0%, |          |
    301, TANEL     , WAIT, SQL*Net message from client                               ,      4.42s,   884.47ms,    88.4%, |@@@@@@@@@ |
    301, TANEL     , WAIT, cell single block physical read                           ,      541us,    108.2us,      .0%, |          |
    301, TANEL     , WAIT, events in waitclass Other                                 ,     2.22ms,    443.2us,      .0%, |          |
--  End of Stats snap 1, end=2011-03-13 19:36:31, seconds=5

As you see from the above “cell index scans” statistic – indeed one index segment was scanned using the cell smart scan method.

So, I would rather call this feature “smart segment scan” to reflect that smart scan can scan more than just tables…

I guess one of the reasons why few people have seen smart bitmap index scans in action is that (single-column) bitmap indexes tend to be small. Smaller than corresponding table segments and B-tree index segments. On partitioned tables they’re much more likely going to be under the “_small_table_threshold” calculation which is used for determining whether to do a direct path full segment scan or not (yes, the _small_table_threshold applies to fast full index scan and fast full bitmap index scan too, not just table scans). So, it’s likely that Oracle chooses to do a regular, buffered full bitmap segment scan and thus won’t even consider using smart scan (as smart scans require direct path reads).

By the way – the direct path read (or not) decision is done per segment – not per object (like a table or index). So if you have 10 partitions in a table (or index), half of them are large, half are smaller, then Oracle may end up using direct path reads (and smart scan) on 5 of them and buffered (dumb) scan on the other 5. If you run something like Snapper on the session, then you’d see the smart scan counters go up by 5 only. As written above, Oracle decides whether to do direct path reads (and smart scan) right after opening the header block of a segment (partition) and reading out how many blocks this partition’s segment has below HWM.

The above applied to serial direct path reads – the Parallel Execution slaves should always read using direct path mode, right? …. Wrong :)

Well, partially wrong… In 11.2.0.2, if the parallel_degree_policy = manual, then yes, PX slaves behave like usual and always force a direct path read (and try to use a smart scan). However, with parallel_degree_policy = AUTO, which is the future of PX auto-management, Oracle can decide to do a buffered parallel scan instead, again disabling the use of smart scan…

One more note – I didn’t say anything about whether you should or should not use (bitmap) indexes on Exadata, it’s an entirely different discussion. I just brought out that the smart scan is used for scanning table segments, B-tree index segments and bitmap index segments if conditions are right.

And in the end I have to say…. that even with this evidence you can’t be fully sure that a smart scan was used throughout the entire segment, but more about this in the book and perhaps in a later blog article. We have interesting times ahead ;-)

Share

Exadata Training – I’ll be speaking at the 1-day UKOUG Exadata Special Event on 18th April

Hi all,

As my frequent readers know, I have promised to not travel anymore as it’s just too much hassle compared to the benefit of being “there”. This is why I’m going to fly to London on Monday, 18th April to speak at the UKOUG Exadata Special Event. This event is just too sexy to be missed, so I made an exception (the last one, I promise!)… and it’s probably going to be warmer there as well compared to where I am now :-)

I will be talking about what’s been my focus area for last year or so – Oracle Exadata Performance.

Dan Norris and Alex Gorbachev will be speaking there too, so it should end up being a pretty awesome event!

More details here:

My abstract is following:

#000000; font-family: Times; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;">#333333; font-family: arial,sans-serif; text-align: left; font-size: small;">Understanding Exadata Performance: Metrics and Wait Events
#000000; font-family: Times; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;">#333333; font-family: arial,sans-serif; text-align: left; font-size: small;">

In order to systematically troubleshoot and optimize Exadata performance, one must understand the meaning of its performance metrics.

This session provides a deep technical walkthrough of how Exadata IO and smart scans work and how to use relevant metrics for troubleshooting related performance issues. We will review both Exadata database and cell-level metrics, cell wait events and tools useful for troubleshooting. We will also look into metrics related to Exadata Hybrid Columnar Compression and the cell Flash Cache usage.

P.S. The reason why I called this post “Exadata Training” is that you’ll learn some real world practical stuff there… as opposed to the marketing material (and marketing material copy material) overdose out there… ;-)

Share

Oracle Exadata Performance series – Part 1: Should I use Hugepages on Linux Database Nodes?

There was a question in LinkedIn forum about whether Linux Hugepages should be used in Oracle Exadata Database layer, as they aren’t enabled by default during ACS install. I’m putting my answer into this blog entry – apparently LinkedIn forums have a limit of 4000 characters per reply… (interestingly familiar number, by the way…:)

So, I thought that it’s time to start writing my Oracle Exadata Performance series articles what I’ve planned for a while… with some war stories from the field, some stuff what I’ve overcome when researching for writing the Expert Oracle Exadata book etc.

I’ve previously published an article about Troubleshooting Exadata Smart Scan performance and some slides from my experience with VLDB Data Warehouse migrations to Exadata.

Here’s the first article (initially planned as a short response in LinkedIn, but it turned out much longer though):

As far as I’ve heard, the initial decision to not enable hugepages by default was that the hugepages aren’t flexible & dynamic enough – you’ll have to always configure the hugepages at OS level to match your desired SGA size (to avoid wastage). So, different shops may want radically different SGA sizes (larger SGA for single-block read oriented databases like transactional/OLTP or OLAP cubes), but smaller SGA for smart scan/parallel scan oriented DWs. If you configure 40GB of hugepages on a node, but only use 1GB of SGA, then 39GB memory is just reserved, not used, wasted – as hugepages are pre-allocated. AMM, using regular pages, will only use the pages what it touches, so there’s no memory wastage due to any pre-allocation issues…

So, Oracle chose to use an approach which is more universal and doesn’t require extra OS level configuration (which isn’t hard at all though if you pay attention, but not all people do). So, less people will end up in trouble with their first deployments although they might not be getting the most out of their hardware.

However, before enabling hugepages “because it makes things faster” you should ask yourself what exact benefit would they bring you?

There are 3 main reasons why hugepages may be useful in Linux:

1) Smaller kernel memory usage thanks to less PTEs thanks to larger pagesizes

This means less pagetable entries (PTEs) and less kernel memory usage. The bigger your SGA and the more processes you have logged on, the bigger the memory usage.

You can measure this in your case – just “grep Page /proc/meminfo” and see how big portion of your RAM has been used by “PageTables”. Many people have blogged about this, but Kevin Closson’s blog is probably the best source to read about this:

2) Lower CPU usage due to less TLB misses in CPU and soft page-fault processing when accessing SGA.

It’s harder to measure this on Linux with standard tools, although it is sure possible (on Solaris you can just run prstat -m to get microstate accounting and look into TFL,DFL,TRP stats).

Anyway, the catch here is that if you are running parallel scans and smart scans, then you don’t access that much of buffer cache in SGA at all, all IOs or smart scan result-sets are read directly to PGAs of server processes – which don’t use large pages at all, regardless of whether hugepages for SGA have been configured or not. There are some special cases, like when a block clone has to be rolled back for read consistency, you’ll have to access some undo blocks via buffer cache… but again this should be a small part of total workload.

So, in a DW, which using mostly smarts scans or direct path reads, there won’t be much CPU efficiency win from large pages as you bypass buffer cache anyway and use small pages of private process memory. All the sorting, hashing etc all happens using small pages anyway. Again I have to mention that on (my favorite OS) Solaris it is possible to configure even PGAs to use large pages (via _realfree_heap_pagesize_hint parameter) … so it’ll be interesting to see how this would help DW workloads on the Exadata X2-8 monsters which can run Solaris 11.

3) Lock SGA pages into RAM so they won’t be paged out when memory shortage happens (for whatever reason).

Hugepages are pre-allocated and never paged out. So, when you have extreme memory shortage, your SGAs won’t be paged out “by accident”. Of course it’s better to ensure that such memory shortages won’t happen – configure the SGA/PGA_AGGREGATE_TARGET sizes properly and don’t allow third party programs consume crazy amounts of memory etc. Of course there’s the lock_sga parameter in Oracle which should allow to do this on Linux with small pages too, but first I have never used it on Linux so I don’t know whether it works ok at all and also in 11g AMM perhaps the mlock() calls aren’t supported on the /dev/shm files at all (haven’t checked and don’t care – it’s better to stay away from extreme memory shortages). Read more about how the AMM MEMORY_TARGET (/dev/shm) works from my article written back in 2007 when 11g came out ( Oracle 11g internals – Automatic Memory Management ).

So, the only realistic win (for DW workload) would be the reduction of kernel pagetables structure size – and you can measure this using PageTables statistic in /proc/meminfo. Kevin demonistrated in his article that 500 connections to an instance with ~8 GB SGA consisting of small pages resulted in 7 GB of kernel pagetables usage, while the usage with large pages (still 500 connections, 8 GB SGA) was about 265 MB. So you could win over 6 GB of RAM, which you can then give to PGA_AGGREGATE_TARGET or to further inrease SGA. The more processes you have connected to Oracle, the more pagetable space is used… Similarly, the bigger the SGA is, the more pagetable space is used…

This is great, but the tradeoff here is manageability and some extra effort you have to put in to always check whether the large pages actually got used or not. After starting up your instance, you should really check whether the HugePages_Free in /proc/meminfo shrank and HugePages_Rsvd increased (when instance has just started up and Oracle hasn’t touched all the SGA pages yet, some pages will show up as Rsvd – reserved).

With a single instance per node this is trivial – you know how much SGA you want and pre-allocate the amount of hugepages for that. If you want to increase the SGA, you’ll have to shut down the instance and increase the Linux hugepages setting too. This can be done dynamically by issuing a command like echo N > /proc/sys/vm/nr_hugepages (where N is the number of huge pages), BUT in real life this may not work out well as if Linux kernel can’t free enough small pages from right physical RAM locations to consolidate 2 or 4 MB contiguous pages, the above command may fail to create the requested amount of new hugepages.

And this means you should restart the whole node to do the change. Note that if you increase your SGA larger to the number of hugepages (or you forget to increase the memlock setting in /etc/security/limits.conf accordingly) then your instance will silently just use the small pages, while all the memory pre-allocated for hugepages stays reserved for hugepages and is not usable for anything else!).

So, this may become more of a problem when you have multiple database instances per cluster node or you expect to start up and shut down instances on different nodes based on demand (or when some cluster nodes fail).

Long story short – I do configure hugepages in “static” production environments, to save kernel memory (and some CPU time for OLTP type environments using buffer cache heavily), also on Exadata. However for various test and development environments with lots of instances per server and constant action, I don’t bother myself (and the client) with hugepages and make everyone’s life easier… Small instances with small number of connections won’t use that many PTEs anyway…

For production environments with multiple database instances per node (and where failovers are expected) I would take the extra effort to ensure that whatever hugepages I have preallocated, won’t get silently wasted because an instance wants more SGA than the available hugepages can accommodate. You can do this by monitoring /proc/meminfo’s HugePage entries as explained above. And remember, the ASM instance (which is started before DB instances) will also grab itself some hugepages when it starts!

Share

Expert Oracle Exadata Book Available as Part of Apress Alpha Program

I’ve had several inquiries about whether our upcoming Exadata book is part of the “Alpha” program at Apress. Honestly, I wasn’t even familiar with the program so I asked our editor, Jonathan Gennick, and found out that our book is part of the program. Tanel already did a post explaining how it works here:

I just wanted to follow up with a quick post since I’d had a few questions about it already. It’s actually pretty cool if you are anxious to get your hands on the material. It’s also interesting because you can see how the book writing process goes. You basically have access to draft versions of chapters as the authors turn them in – warts and all. Then you’ll get emails when the chapters are updated as they go through the reviewing and editing process. Here’s a link to the Apress page where you can see all the details:

So welcome to the future where everyone can see everything you’re doing. Next thing you know we’ll all be wearing jet packs.

Expert Oracle Exadata book – Alpha chapters available for purchase!

Hi,

Apress has made the draft versions of our Expert Oracle Exadata book available for purchase.

How this works is:

  1. You purchase the “alpha” version of the Expert Oracle Exadata book
  2. You get the access to draft/alpha PDF versions of some chapters now!
  3. As more chapters will be added and existing ones updated, you’ll receive an email and you can download these too
  4. You will get a PDF copy of the final book once it’s out!

This is an awesome deal if you can’t wait until the final launch and want to get ahead of the curve with your Exadata skills ;-)

Buy the alpha version of our Expert Oracle Exadata book from Apress here!

If you haven’t heard about this book earlier – I’m one of the 3 authors, writing it together with Kerry Osborne and Randy Johnson from Enkitec and our official tech reviewer is no other than THE Kevin Closson and we are also getting some (unofficial) feedback from Oracle database junkie Arup Nanda.

So this book will absolutely rock and if you want a piece of it now, order the alpha book above!

P.S. This hopefully also explains why I’ve been so quiet with my blogging lately – can’t write a book and do many other things at the same time… (at least if you want to do it well…)

Share

Exadata Serial Numbers

Just a quick post (mostly so I can find the info quickly myself when I want it). Here’s how to get the master serial number for an Exadata Rack:

ipmitool sunoem cli ‘show /SP system_identifier’

Apparently by convention the Exadata Rack serial number is appended to the end of ILOM system identifier string. Here’s an example:

[enkdb01:root] /root 
> ipmitool sunoem cli 'show /SP system_identifier'
Connected. Use ^D to exit.
-> show /SP system_identifier
 
  /SP
    Properties:
        system_identifier = Sun Oracle Database Machine _YOUR_SERIAL_NUMBER_HERE_
 
 
-> Session closed
Disconnected

Thanks to Dan Norris for that bit of info.

You can also get the individual serial numbers for each component in an Exadata Rack like this:

/opt/oracle.SupportTools/CheckHWnFWProfile -S

So you would log on to each machine and run this. Here’s the output from one of the DB Servers in our system:

 
> /opt/oracle.SupportTools/CheckHWnFWProfile -S
 
====START SERIAL NUMBERS====
==Motherboard, from dmidecode==
--System serial--
1019XF5059            
--Motherboard serial--
0328MSL-10144H0XE2    
--Chassis serial--
1019XF5059            
==Infiniband HCA==
ID: qFalcon QDR
PN: 375-3606-02          
EC: 50
SN: 1388FMH-1005500316      
V0: PCIe Gen2 x8    
==InfiniBand Switch==
 Use the /opt/oracle.SupportTools/CheckSWProfile.sh -S option to get the switch serial numbers
==Disk Controller==
Product Name    : LSI MegaRAID SAS 9261-8i
Serial No       : SV00800405
==Disks==
Enclosure:Slot Type Make Model Serial
252:0 SAS HITACHI H103014SCSUN146G 1017FAJW9E
252:1 SAS HITACHI H103014SCSUN146G 1017FBAY8E
252:2 SAS HITACHI H103014SCSUN146G 1017FAK0BE
252:3 SAS HITACHI H103014SCSUN146G 1017FB8PHE
==Motherboard, RAM etc from ipmitool==
FRU Device Description : Builtin FRU Device (LUN 0, ID 0)
 Product Manufacturer  : SUN MICROSYSTEMS
 Product Name          : ILOM INTEGRATED SP  
 
FRU Device Description : /SYS (LUN 0, ID 3)
 Board Product         : ASSY,MOTHERBOARD,X4170/X4270/X4275  
 Board Serial          : 0328MSL-10144H0XE2
 Board Part Number     : 501-7917-11
 Board Extra           : 54
 Board Extra           : X4170/4270/4275
 Product Manufacturer  : SUN MICROSYSTEMS
 Product Name          : SUN FIRE X4170 SERVER   
 Product Part Number   : 602-4927-02
 Product Serial        : 1019XF5059
 Product Extra         : 080020FFFFFFFFFFFFFF00212883E184
 
FRU Device Description : MB (LUN 0, ID 4)
 Board Product         : ASSY,MOTHERBOARD,X4170/X4270/X4275  
 Board Serial          : 0328MSL-10144H0XE2
 Board Part Number     : 501-7917-11
 Board Extra           : 54
 Board Extra           : X4170/4270/4275
 Product Manufacturer  : SUN MICROSYSTEMS
 Product Name          : SUN FIRE X4170 SERVER   
 Product Part Number   : 602-4927-02
 Product Serial        : 1019XF5059
 Product Extra         : 080020FFFFFFFFFFFFFF00212883E184
 
FRU Device Description : MB/BIOS (LUN 0, ID 5)
 Product Manufacturer  : AMERICAN MEGATRENDS 
 Product Name          : SYSTEM BIOS 
 Product Part Number   : AMIBIOS8
 Product Version       : 07060223
 
FRU Device Description : MB/NET0 (LUN 0, ID 43)
 Product Manufacturer  : INTEL   
 Product Name          : GIGABIT ETHERNET CONTROLLERS
 Product Part Number   : 82575EB
 Product Serial        : 00:21:28:83:E1:80
 Product Extra         : 01
 Product Extra         : 00:21:28:83:E1:80
 
FRU Device Description : mb.net0.fru (LUN 0, ID 43)
 Product Manufacturer  : INTEL   
 Product Name          : GIGABIT ETHERNET CONTROLLERS
 Product Part Number   : 82575EB
 Product Serial        : 00:21:28:83:E1:80
 Product Extra         : 01
 Product Extra         : 00:21:28:83:E1:80
 
FRU Device Description : MB/NET1 (LUN 0, ID 44)
 Product Manufacturer  : INTEL   
 Product Name          : GIGABIT ETHERNET CONTROLLERS
 Product Part Number   : 82575EB
 Product Serial        : 00:21:28:83:E1:81
 Product Extra         : 01
 Product Extra         : 00:21:28:83:E1:81
 
FRU Device Description : mb.net1.fru (LUN 0, ID 44)
 Product Manufacturer  : INTEL   
 Product Name          : GIGABIT ETHERNET CONTROLLERS
 Product Part Number   : 82575EB
 Product Serial        : 00:21:28:83:E1:81
 Product Extra         : 01
 Product Extra         : 00:21:28:83:E1:81
 
FRU Device Description : MB/NET2 (LUN 0, ID 45)
 Product Manufacturer  : INTEL   
 Product Name          : GIGABIT ETHERNET CONTROLLERS
 Product Part Number   : 82575EB
 Product Serial        : 00:21:28:83:E1:82
 Product Extra         : 01
 Product Extra         : 00:21:28:83:E1:82
 
FRU Device Description : mb.net2.fru (LUN 0, ID 45)
 Product Manufacturer  : INTEL   
 Product Name          : GIGABIT ETHERNET CONTROLLERS
 Product Part Number   : 82575EB
 Product Serial        : 00:21:28:83:E1:82
 Product Extra         : 01
 Product Extra         : 00:21:28:83:E1:82
 
FRU Device Description : MB/NET3 (LUN 0, ID 46)
 Product Manufacturer  : INTEL   
 Product Name          : GIGABIT ETHERNET CONTROLLERS
 Product Part Number   : 82575EB
 Product Serial        : 00:21:28:83:E1:83
 Product Extra         : 01
 Product Extra         : 00:21:28:83:E1:83
 
FRU Device Description : mb.net3.fru (LUN 0, ID 46)
 Product Manufacturer  : INTEL   
 Product Name          : GIGABIT ETHERNET CONTROLLERS
 Product Part Number   : 82575EB
 Product Serial        : 00:21:28:83:E1:83
 Product Extra         : 01
 Product Extra         : 00:21:28:83:E1:83
 
FRU Device Description : /UUID (LUN 0, ID 6)
 Product Extra         : 080020FFFFFFFFFFFFFF00212883E184
 
FRU Device Description : SP/NET0 (LUN 0, ID 1)
 Product Manufacturer  : ASPEED  
 Product Name          : ETHERNET CONTROLLER 
 Product Part Number   : AST2100
 Product Serial        : 00:21:28:83:e1:84
 Product Extra         : 01
 Product Extra         : 00:21:28:83:e1:84
 
FRU Device Description : SP/NET1 (LUN 0, ID 2)
 Product Manufacturer  : ASPEED  
 Product Name          : ETHERNET CONTROLLER 
 Product Part Number   : AST2100
 Product Serial        : 00:21:28:83:e1:85
 Product Extra         : 01
 Product Extra         : 00:21:28:83:e1:85
 
FRU Device Description : MB/P0 (LUN 0, ID 16)
 Product Manufacturer  : INTEL   
 Product Name          : INTEL(R) XEON(R) CPU           E5540  @ 2.53GHZ 
 Product Part Number   : 060A
 Product Version       : 05
 
FRU Device Description : MB/P1 (LUN 0, ID 17)
 Product Manufacturer  : INTEL   
 Product Name          : INTEL(R) XEON(R) CPU           E5540  @ 2.53GHZ 
 Product Part Number   : 060A
 Product Version       : 05
 
FRU Device Description : MB/P0/D0 (LUN 0, ID 24)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 1A86965A
 
FRU Device Description : MB/P0/D1 (LUN 0, ID 25)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 04B25956
 
FRU Device Description : MB/P0/D2 (LUN 0, ID 26)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 0462594F
 
FRU Device Description : MB/P0/D3 (LUN 0, ID 27)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 04125951
 
FRU Device Description : MB/P0/D4 (LUN 0, ID 28)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 04525976
 
FRU Device Description : MB/P0/D5 (LUN 0, ID 29)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 04525964
 
FRU Device Description : MB/P0/D6 (LUN 0, ID 30)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 0432597B
 
FRU Device Description : MB/P0/D7 (LUN 0, ID 31)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 0422595A
 
FRU Device Description : MB/P0/D8 (LUN 0, ID 32)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 0432595A
 
FRU Device Description : MB/P1/D0 (LUN 0, ID 33)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 2D634EE4
 
FRU Device Description : MB/P1/D1 (LUN 0, ID 34)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 2DC34EA7
 
FRU Device Description : MB/P1/D2 (LUN 0, ID 35)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 2D234E9F
 
FRU Device Description : MB/P1/D3 (LUN 0, ID 36)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 2D634E94
 
FRU Device Description : MB/P1/D4 (LUN 0, ID 37)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 2D534EE1
 
FRU Device Description : MB/P1/D5 (LUN 0, ID 38)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 2D334EE3
 
FRU Device Description : MB/P1/D6 (LUN 0, ID 39)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 1A76965D
 
FRU Device Description : MB/P1/D7 (LUN 0, ID 40)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 1AB6965B
 
FRU Device Description : MB/P1/D8 (LUN 0, ID 41)
 Product Manufacturer  : HYNIX SEMICONDUCTOR INC.
 Product Name          : 4GB DDR3 SDRAM 666  
 Product Part Number   : HMT151R7BFR4C-H9  
 Product Version       : 4437
 Product Serial        : 04625970
 
FRU Device Description : PCIE1/F20CARD (LUN 0, ID 225)
 Device not present (Requested sensor, data, or record not found)
 
FRU Device Description : PCIE2/F20CARD (LUN 0, ID 226)
 Device not present (Requested sensor, data, or record not found)
 
FRU Device Description : PS0 (LUN 0, ID 63)
 Board Mfg             : EMERSON
 Board Product         : A237
 Board Serial          : 1357ZHO-1012BF1NLY
 Board Part Number     : 300-2143-04
 Power Supply Record
  Capacity                   : 760 W
  Peak VA                    : 880 VA
  Inrush Current             : 25 A
  Inrush Interval            : 200 ms
  Input Voltage Range 1      : 90-264 V
  Input Voltage Range 2      : 0-0 V
  Input Frequency Range      : 47-63 Hz
  A/C Dropout Tolerance      : 12 ms
  Flags                      : 'Predictive fail' 'Power factor correction' 'Hot swap' 'Failure on pin de-assertion'
  Peak capacity              : 760 W
  Peak capacity holdup       : 15 s
  Combined capacity          : not specified
  Fan lower threshold        : 0 RPS
 DC Output Record
  Output Number              : 2
  Standby power              : No
  Nominal voltage            : 12.00 V
  Max negative deviation     : 11.88 V
  Max positive deviation     : 12.12 V
  Ripple and noise pk-pk     : 100 mV
  Minimum current draw       : 0.050 A
  Maximum current draw       : 6.230 A
 DC Output Record
  Output Number              : 2
  Standby power              : Yes
  Nominal voltage            : 3.30 V
  Max negative deviation     : 3.20 V
  Max positive deviation     : 3.40 V
  Ripple and noise pk-pk     : 100 mV
  Minimum current draw       : 0.000 A
  Maximum current draw       : 0.360 A
 
FRU Device Description : PS1 (LUN 0, ID 64)
 Board Mfg             : EMERSON
 Board Product         : A237
 Board Serial          : 1357ZHO-1010BF1LGG
 Board Part Number     : 300-2143-04
 Power Supply Record
  Capacity                   : 760 W
  Peak VA                    : 880 VA
  Inrush Current             : 25 A
  Inrush Interval            : 200 ms
  Input Voltage Range 1      : 90-264 V
  Input Voltage Range 2      : 0-0 V
  Input Frequency Range      : 47-63 Hz
  A/C Dropout Tolerance      : 12 ms
  Flags                      : 'Predictive fail' 'Power factor correction' 'Hot swap' 'Failure on pin de-assertion'
  Peak capacity              : 760 W
  Peak capacity holdup       : 15 s
  Combined capacity          : not specified
  Fan lower threshold        : 0 RPS
 DC Output Record
  Output Number              : 2
  Standby power              : No
  Nominal voltage            : 12.00 V
  Max negative deviation     : 11.88 V
  Max positive deviation     : 12.12 V
  Ripple and noise pk-pk     : 100 mV
  Minimum current draw       : 0.050 A
  Maximum current draw       : 6.230 A
 DC Output Record
  Output Number              : 2
  Standby power              : Yes
  Nominal voltage            : 3.30 V
  Max negative deviation     : 3.20 V
  Max positive deviation     : 3.40 V
  Ripple and noise pk-pk     : 100 mV
  Minimum current draw       : 0.000 A
  Maximum current draw       : 0.360 A
 
FRU Device Description : DBP (LUN 0, ID 210)
 Board Product         : BD,ASY,1U,8DSK BKPL,SATA
 Board Serial          : C00UHC
 Board Part Number     : 511-1283-03
 Board Extra           : 50
 Board Extra           : SASBP
 
FRU Device Description : PDB (LUN 0, ID 211)
 Board Product         : ASY,PDB,H1M1G1  
 Board Serial          : F06TJC
 Board Part Number     : 501-7696-10
 Board Extra           : 50
 Board Extra           : PDB
 Product Part Number   : 602-4927-02
 Product Serial        : 1019XF5059
 
FRU Device Description : PADCRD (LUN 0, ID 222)
 Board Product         : BD,ASY,PADDLE,1U
 Board Serial          : 0226LHF-101000049B
 Board Part Number     : 541-3512-02
 Board Extra           : 51
 Board Extra           : 1U Paddle
 
FRU Device Description : FB0 (LUN 0, ID 212)
 Board Product         : ASY,FAN,BOARD,H1M1G1
 Board Serial          : 0226LHF-1010A402TW
 Board Part Number     : 541-2112-04
 Board Extra           : 52
 Board Extra           : FANBD
 
FRU Device Description : FB1 (LUN 0, ID 213)
 Board Product         : ASY,FAN,BOARD,H1M1G1
 Board Serial          : 0226LHF-1010A402TH
 Board Part Number     : 541-2112-04
 Board Extra           : 52
 Board Extra           : FANBD
 
FRU Device Description : DBP/HDD0 (LUN 0, ID 47)
 Device not present (Requested sensor, data, or record not found)
 
FRU Device Description : DBP/HDD1 (LUN 0, ID 48)
 Device not present (Requested sensor, data, or record not found)
 
FRU Device Description : DBP/HDD2 (LUN 0, ID 49)
 Device not present (Requested sensor, data, or record not found)
 
FRU Device Description : DBP/HDD3 (LUN 0, ID 50)
 Device not present (Requested sensor, data, or record not found)
 
FRU Device Description : DBP/HDD4 (LUN 0, ID 51)
 Device not present (Requested sensor, data, or record not found)
 
FRU Device Description : DBP/HDD5 (LUN 0, ID 52)
 Device not present (Requested sensor, data, or record not found)
 
FRU Device Description : DBP/HDD6 (LUN 0, ID 53)
 Device not present (Requested sensor, data, or record not found)
 
FRU Device Description : DBP/HDD7 (LUN 0, ID 54)
 Device not present (Requested sensor, data, or record not found)
 
====END SERIAL NUMBERS====
 
[enkdb01:root] /root


(UPDATE: the command mentioned by davidr in the comments gives the serial number for the server you are connected to, be it DB or Storage): The command is:

         ipmitool sunoem cli "show /SYS product_serial_number"

Database Machine 3D tours

Somehow I missed a post in the Oracle Database Insider blog with the links to 3D tours of the latest Database Machines – X2-2 and X2-8. I must say they’ve impressed me. Not by the amount of technical details these simple tours have, but rather the way they present data. It’s easy. It’s understandable. It’s absolutely not Oracle style. If you are following Oracle’s press releases you know what I mean. They suck. I spend time reading press releases to get WTF marketing people want to say. Go here for example and tell me what is the configuration of new servers in s/c/t.

Exatadata Book 2

I have been getting quite a few questions about our upcoming Exadata Book lately so I thought I would post a quick update. We are working feverishly on it so please give us a break!

Just kidding!

I am actually feeling like we can see the light at the end of the tunnel now. We are well past the half way mark and I am feeling confident about the content. Well more than confident actually. I think it’s going to be awesome! In large part that’s due to the fact that I feel like we have the Dream Team working on the project. Tanel Poder has signed on as a co-author. Kevin Closson is the Official Technical Reviewer (and we’re tentatively planning on including a number of his comments in the book – in little “Kevin Says” sidebars). As one of the main architects of Exadata, this should provide some interesting perspective. Arup Nanda has volunteered as an unofficial technical reviewer as well. I have to say that Arup has been a great help. And I really appreciate him providing another perspective on what we’re writing about. All three of these guys are fellow Oak Table bretheren, by the way. Randy Johnson is the other co-author, and although he generally prefers to keep a low profile, he is extremely knowledgeable on things that the rest of us don’t deal with that much on a day to day basis, namely backup and recovery and storage configuration. He has a great RAC and ASM background as well. I have to also say that a guy none of you has ever heard of (Andy Colvin) has been a huge help as well. He is our in-house Exadata patching guru. Without him I’m not sure we would have been able to do the necessary testing to complete the book.

I must say that I feel honored to be involved in a project with such an accomplished group of guys. And by the way, we have had numerous offers from people that I have a lot of respect for to help with various aspects of the project. I want to thank all of you for those offers, even if we haven’t taken you up on all of them (our little brains can only absorb so much feedback at any one time). The book is actually available for pre-order on the Amazon already (so someone must think we are actually going to finish it pretty soon). I think we’re  right on track for later spring delivery. :-)

storage(:Z>=:Z AND :Z

Ever wonder why that weird line shows up in the Predicate Section of an execution plan on Exadata? Me too! The STORAGE bit tells us it’s a filter applied at the storage cell layer, but the rest is nonsensical. Well I recently ran across a very brief mention of it in a Metalink note. (I know it’s not called Metalink any more, but I’m kind of set in my ways). The note said it was related to distribution of rows to PX slaves. Ah ha! Let’s test it out. Here’s a plan with the predicate just so you can see what it looks like.

KSO@arcp> @dplan
Enter value for sql_id: a9axwj6ym3b29
Enter value for child_no: 
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  a9axwj6ym3b29, child number 0
-------------------------------------
select /*+  parallel_index(t, "ISD_SI_I03",8)  dbms_stats
cursor_sharing_exact use_weak_name_resl dynamic_sampling(0)
no_monitoring no_substrb_pad  no_expand index_ffs(t,"ISD_SI_I03") */
count(*) as nrw,count(distinct sys_op_lbid(725425,'L',t.rowid)) as
nlb,count(distinct hextoraw(sys_op_descend("SUPPORT_LEVEL")||sys_op_desc
end("SUPPORT_LEVEL_KEY")||sys_op_descend("NAME")||sys_op_descend("VALUE"
))) as ndk,sys_op_countchg(substrb(t.rowid,1,15),1) as clf from
"ISD"."SUPPORT_INFO" sample block (   .0503530742,1)  t where
"SUPPORT_LEVEL" is not null or "SUPPORT_LEVEL_KEY" is not null or
"NAME" is not null or "VALUE" is not null
 
Plan hash value: 1766555783
 
---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |            |       |       |   672 (100)|          |        |      |            |
|   1 |  SORT GROUP BY                              |            |     1 |    56 |            |          |        |      |            |
|   2 |   PX COORDINATOR                            |            |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)                      | :TQ10001   |     1 |    56 |            |          |  Q1,01 | P->S | QC (RAND)  |
|   4 |     SORT GROUP BY                           |            |     1 |    56 |            |          |  Q1,01 | PCWP |            |
|   5 |      PX RECEIVE                             |            |     1 |    56 |            |          |  Q1,01 | PCWP |            |
|   6 |       PX SEND HASH                          | :TQ10000   |     1 |    56 |            |          |  Q1,00 | P->P | HASH       |
|   7 |        SORT GROUP BY                        |            |     1 |    56 |            |          |  Q1,00 | PCWP |            |
|   8 |         PX BLOCK ITERATOR                   |            |   397K|    21M|   672   (1)| 00:00:01 |  Q1,00 | PCWC |            |
|*  9 |          INDEX STORAGE SAMPLE FAST FULL SCAN| ISD_SI_I03 |   397K|    21M|   672   (1)| 00:00:01 |  Q1,00 | PCWP |            |
---------------------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   9 - storage(:Z>=:Z AND :Z<=:Z AND ("SUPPORT_LEVEL" IS NOT NULL OR "SUPPORT_LEVEL_KEY" IS NOT NULL OR "NAME" IS NOT NULL OR
              "VALUE" IS NOT NULL))
       filter(("SUPPORT_LEVEL" IS NOT NULL OR "SUPPORT_LEVEL_KEY" IS NOT NULL OR "NAME" IS NOT NULL OR "VALUE" IS NOT NULL))
 
 
37 rows selected.

Notice that the plan is for a PX statement. So how can we convince ourselves that it is a PX related predicate. We can try two tests.

  1. Let’s see if we can find any SQL statements that have the predicate that aren’t PX statements.
  2. Let’s see if we can find any PX statements that don’t have the predicate.

So here we go.

KSO@arcp>  -- How many :Z>=:Z's are there?
KSO@arcp> select count(*) from v$sql_plan
  2  where ACCESS_PREDICATES like '%:Z>=:Z%'
  3  /
 
  COUNT(*)
----------
        73
 
1 row selected.
 
KSO@arcp>  -- How many :Z>=:Z's are there that are not PX?
KSO@arcp> select count(*) from v$sql_plan
  2  where ACCESS_PREDICATES like '%:Z>=:Z%'
  3  and sql_id not in (select sql_id from v$sql_plan where operation like 'PX%')
  4  /
 
  COUNT(*)
----------
         0
 
1 row selected.
 
KSO@arcp>  -- How many PX's don't have :Z>=:Z?
KSO@arcp> select count(distinct sql_id) from v$sql_plan
  2  where operation like 'PX%'
  3  and sql_id not in (select sql_id from v$sql_plan
  4  where ACCESS_PREDICATES like '%:Z>=:Z%')
  5  /
 
  COUNT(*)
----------
       154
 
1 row selected.
 
KSO@arcp>  -- Whoa, that's a little unexpected!

So there are none of the :Z>=:Z predicates on non-PX queries but there are a bunch of PX queries that don’t have the predicate. Let’s look at a couple of those and see why those might not have the predicate.

 
KSO@arcp> -- Get SQL_ID's for PX's that don't have :Z>=:Z
KSO@arcp> select distinct sql_id from v$sql_plan
  2  where operation like 'PX%'
  3  and sql_id not in (select sql_id from v$sql_plan
  4  where ACCESS_PREDICATES like '%:Z>=:Z%')
  5  /
 
SQL_ID
-------------
7xa3zbpgkbta7
ftkmqqq3ga0nf
6wgmq24t9xy6f
7tsf3h3qjth77
fg1aphaqvcmb3
...
bt9n0qsg8k4sb
fffrvvnrnmztg
97x3zj2fb0y5z
5dudhrch3sv8r
bbw31mhra7ryu
 
154 rows selected.
 
KSO@arcp> @dplan
KSO@arcp> set lines 150
KSO@arcp> select * from table(dbms_xplan.display_cursor('&sql_id','&child_no','typical'))
  2  /
Enter value for sql_id: bbw31mhra7ryu
Enter value for child_no: 
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  bbw31mhra7ryu, child number 0
-------------------------------------
select MUTEX_TYPE, LOCATION, SLEEPS, WAIT_TIME from GV$MUTEX_SLEEP
where INST_ID = USERENV('INSTANCE')
 
Plan hash value: 365768877
 
----------------------------------------------------------------------------
| Id  | Operation            | Name           |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                |        |      |            |
|*  1 |  PX COORDINATOR      |                |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000       |  Q1,00 | P->S | QC (RAND)  |
|*  3 |    VIEW              | GV$MUTEX_SLEEP |  Q1,00 | PCWP |            |
|   4 |     FIXED TABLE FULL | X$MUTEX_SLEEP  |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("INST_ID"=USERENV('INSTANCE'))
   3 - filter("INST_ID"=USERENV('INSTANCE'))
 
Note
-----
   - rule based optimizer used (consider using cbo)
 
 
27 rows selected.
 
KSO@arcp> 
KSO@arcp> /
Enter value for sql_id: 5dudhrch3sv8r
Enter value for child_no: 
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  5dudhrch3sv8r, child number 0
-------------------------------------
select  NAMESPACE , GETS , GETHITS , GETHITRATIO , PINS , PINHITS ,
PINHITRATIO , RELOADS , INVALIDATIONS , DLM_LOCK_REQUESTS ,
DLM_PIN_REQUESTS , DLM_PIN_RELEASES , DLM_INVALIDATION_REQUESTS ,
DLM_INVALIDATIONS from GV$LIBRARYCACHE where inst_id =
USERENV('Instance')
 
Plan hash value: 3363616119
 
-----------------------------------------------------------------------------
| Id  | Operation            | Name            |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                 |        |      |            |
|*  1 |  PX COORDINATOR      |                 |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000        |  Q1,00 | P->S | QC (RAND)  |
|*  3 |    VIEW              | GV$LIBRARYCACHE |  Q1,00 | PCWP |            |
|*  4 |     FIXED TABLE FULL | X$KGLST         |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("INST_ID"=USERENV('INSTANCE'))
   3 - filter("INST_ID"=USERENV('INSTANCE'))
   4 - filter((LENGTH("KGLSTDSC")<=15 AND "KGLSTGET"<>0 AND
              "KGLSTTYP"='NAMESPACE'))
 
Note
-----
   - rule based optimizer used (consider using cbo)
 
 
32 rows selected.

So both of those queries are against Fixed Tables (memory structures). Makes sense that a filter at the storage layer would not be necessary. As it turns out, all 154 of the PX queries that don’t have the :Z>=:Z predicate are queries against Fixed Tables. Here’s a quick check in case you want to try it on your own Exadata.

 
KSO@arcp> select count(*) from v$sql_plan
  2  where operation like 'PX%'
  3  and sql_id not in (select sql_id from v$sql_plan
  4  where ACCESS_PREDICATES like '%:Z>=:Z%')
  5  and sql_id not in (select sql_id from v$sql_plan
  6  where operation not like 'FIXED%')
  7  /
 
  COUNT(*)
----------
         0
 
1 row selected.

So while not a definitive proof, it does appear that the :Z>=:Z predicate is related to PX row distribution. Now if I can just figure out why the storage filters are always repeated in the XPLAN output as a standard filter for offloaded queries like in this example:

Enter value for sql_id: 0qa98gcnnza7h
Enter value for child_no: 
 
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0qa98gcnnza7h, child number 0
-------------------------------------
select avg(pk_col) from kso.skew where col1 > 0
 
Plan hash value: 568322376
 
-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |       |       | 44486 (100)|          |
|   1 |  SORT AGGREGATE            |      |     1 |    11 |            |          |
|*  2 |   TABLE ACCESS STORAGE FULL| SKEW |    32M|   335M| 44486   (1)| 00:08:54 |
-----------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - storage("COL1">0)
       filter("COL1">0)
 
 
20 rows selected.

Let me know if you have the answer to that one!