Search

Top 60 Oracle Blogs

Recent comments

statistics

Join Cardinality – 3

In the previous posting I listed the order of precision of histograms as:

Join Cardinality – 2

In the previous note I posted about Join Cardinality I described a method for calculating the figure that the optimizer would give for the special case where you had a query that:

Join Cardinality

Following up my “Hacking for Skew” article from a couple of days ago, Chinar Aliyev has written an article about a method for persuading the optimizer to calculate the correct cardinality estimate without using any undocumented, or otherwise dubious, mechanisms.

Hacking for Skew

In my presentation to the UKOUG SIG yesterday “Struggling with Statistics – part 2” I described a problem that I wrote about a few months ago: when you join a fact table with a massively skewed distribution on one of the surrogate key columns to a dimension holding the unique list of keys and descriptions a query against a description “loses” the skew. Here’s an demo of the problem that’s a little simpler than the one in the previous article.

Column Group Catalog

I seem to have written a number of aricles about column groups – the rather special, and most useful, variant on extended stats. To make it as easy as possible to find the right article I’ve decided to produce a little catalogue (catalog) of all the relevant articles, with a little note about the topic each article covers. Some of the articles will link to others in the list, and there are a few items in the list from other blogs. There are also a few items which are the titles of drafts which have been hanging around for the last few years.

How to gather Oracle optimizer statistics with minimal risks of regression

The ATLAS experiment control room

Here is, on the Databases at CERN blog, an example to lower to risks when you need to gather statistics in production:

  • use pending stats to be able to test them before publishing
  • be ready to restore old ones if a critical regression comes after publishing

The whole demo (I encourage you to follow the Databases at CERN blog):

https://db-blog.web.cern.ch/blog/franck-pachot/2018-09-optimizer-statistics-gathering-pending-and-history

Here is a summary of commands used, for an easy copy/paste:

Column Stats

A little while ago I added a postscript about gathering stats on a virtual column to a note I’d written five years ago and then updated with a reference to a problem on the Oracle database forum that complained that stats collection had taken much longer after the addition of a function-based index. The problem related to the fact that the function-based index was supported by a virtual column that used an instr() function on a CLOB (XML) column – and gathering stats on the virtual column meant applying the function to every CLOB in the table.

So my post-script, added about a month ago, suggested adding a preference (dbms_stats.set_table_prefs) to avoid gathering stats on that column. There’s a problem with this suggestion – it doesn’t work

Stats time

I wrote a note a couple of years ago explaining how I used to get a rough idea (with some errors) of how much time was spent in the overnight stats collection by each object. One of the nice little enhancements that appeared in 12c was the appearance of a couple of functions that can report information about this type of thing, and more. These are the dbms_stats function report_stats_operations() and report_single_stats_operation() with the following definitions:

How Not to Collect Optimizer Statistics in an Application Engine Program

I was told about a PeopleSoft customer experiencing an Oracle error when collecting statistics during an Application Engine.

#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%;">ORA-06533: Subscript beyond count 
ORA-06512: at "SYS.DBMS_STATS"…

It is possibly a manifestation of a database bug. The workaround was not to use AUTO_SAMPLE_SIZE, so instead, this customer initially coded an explicit call to DBMS_STATS.GATHER_TABLE_STATS with a specific sample size.
This blog is not about the bug, but how to manage the workaround.

Extended Histograms – 2

Following on from the previous posting which raised the idea of faking a frequency histogram for a column group (extended stats), this is just a brief demonstration of how you can do this. It’s really only a minor variation of something I’ve published before, but it shows how you can use a query to generate a set of values for the histogram and it pulls in a detail about how Oracle generates and stores column group values.

We’ll start with the same table as we had before – two columns which hold only the combinations (‘Y’, ‘N’) or (‘N’, ‘Y’) in a very skewed way, with a requirement to ensure that the optimizer provides an estimate of 1 if a user queries for (‘N’,’N’) … and I’m going to go the extra mile and create a histogram that does the same when the query is for the final possible combination of (‘Y’,’Y’).