statistics

Column Stats

I’ve made several comments in the past about the need for being selective when gathering objects statistics with particular reference to the trade-offs when creating histograms. With Oracle 12c it’s now reasonably safe (as far as I’m concerned) to set a method_opt as a table preference that identifies columns where you expect to see Frequency or (pace the buggy behaviour described in a recent post) a Top-N histograms. The biggest problem I have is that I keep forgetting the exact syntax I need – so I’ve written this note more as a reminder to myself than anything else.

Histogram Hassle

I came across a simple performance problem recently that ended up highlighting a problem with the 12c hybrid histogram algorithm. It was a problem that I had mentioned in passing a few years ago, but only in the context of Top-N histograms and without paying attention to the consequences. In fact I should have noticed the same threat in a recent article by Maria Colgan that mentioned the problems introduced in 12c by the option “for all columns size repeat”.

So here’s the context (note – all numbers used in this example are approximations to make the arithmetic obvious).  The client had a query with a predicate like the follwing:

nVision Performance Tuning: 10. Maintaining Statistics on Non-Partitioned Tree Selector Tables

This blog post is part of a series that discusses how to get optimal performance from PeopleSoft nVision reporting as used in General Ledger.

If you do not partition your selector tables, and if you predominantly use static selectors then you can maintain the statistics on the selector table as the new static selectors are created or updated.  PeopleTools table PSTREESELCTL is used to track static selectors.  It is keyed by selector number.  A row is inserted when a new selector is created, and the version number is updated when an existing selector is re-extracted.  Therefore, a trigger on this table can be used to submit a job to refresh the statistics on the tree selector.

nVision Performance Tuning: 8. Interval Partitioning and Statistics Maintenance of Tree Selector Tables

This blog post is part of a series that discusses how to get optimal performance from PeopleSoft nVision reporting as used in General Ledger.

The decision to use interval partitioning on the tree selector tables came from the need to have accurate statistics for the optimizer when parsing nVision queries.  It is not possible to introduce hints into nVision SQL. The dynamic nature of the code means it is not viable to consider any of the forms of database plan stability across the whole application, (although it may be possible to use SQL Profiles in limited cases). Therefore, as far as possible the optimizer has to choose the best plan on its own. Without accurate optimizer statistics, I have found that the optimizer will usually not choose to use a Bloom filter.

12c dbms_stats.gather_table_stats on GTT do not commit

In my UKOUG OracleScene article on 12c online statistics and GTT I mentioned the following:

A final note about those 12c changes in statistics gathering on GTT. In 11g the dbms_stats did a commit at the start. So if you did gather stats after the load, you had to set the GTT as ON COMMIT PRESERVE ROWS. Or you just vacuum what you’ve loaded. That has changed in 12c. If you now choose to do a conventional insert followed by dbms_stats (having set private stats of course) then you don’t need to set on commit preserve rows anymore.

Today, I realized that I’ve never explained exactly when dbms_stats.gather_table_stats commits the transaction or not. Because, of course, it depends. In summary: 12c non-SYS owner GTT with private statistics.

Histogram Upgrade – 2

While reading a blog post by Maria Colgan a couple of weeks ago I came across an observation about histograms that I’d not noticed before; worse still, it was a feature that seemed to make some “damage-limitation” advice I’d been giving for years a really bad idea! The threat appeared in these paragraphs:

Setting SIZE REPEAT ensures a histogram will only be created for any column that already has one. If the table is a partitioned table, repeat ensures a histogram will be created for a column that already has one on the global level.

What’s the down side to doing this?

The current number of buckets used in each histogram becomes the limit on the maximum number of buckets used for any histogram created in the future.

Interview with PeopleSoft Administrator Podcast: Cost-Based Optimizer Statistics in PeopleSoft

I recently recorded another interview with Dan Iverson and Kyle Benson for the PeopleSoft Administrator Podcast, this time about management of Cost-Based Optimizer Statistics in PeopleSoft systems.

Queue-based Concurrent Stats Prototype Implementation

This is just a prototype of a queue-based concurrent statistics implementation - using the same basic implementation I've used a a couple of years ago to create indexes concurrently.There are reasons why such an implementation might be useful - in 11.2.0.x the built-in Concurrent Stats feature might turn out to be not really that efficient by creating lots of jobs that potentially attempt to gather statistics for different sub-objects of the same table at the same time - which can lead to massive contention on Library Cache level due to the exclusive Library Cache locks required by DDL / DBMS_STATS calls.In 12.1 the Concurrent Stats feature obviously got a major re-write by using some more intelligent processing what and how should be processed concurrently - some of the details are exposed via the new view DBA_OPTSTAT_OPERATION_TASKS, but again I've seen it running lots of very

Extended Stats

After my Masterclass on indexes at the UKOUG Tech2016 conference this morning I got into a conversation about creating extended stats on a table. I had pointed out in the masterclass that each time you dropped an index you really ought to be prepared to create a set of extended stats (specifically a column group) on the list of columns that had defined the index just in case the optimizer had been using the distinct_keys statistic from the index to help it calculate cardinalities.

Histogram Upgrade

I’ve written notes in the past about the improvements 12c introduces for histograms – particularly the frequency and top-N histograms which can be generated “free of charge” while the basic “approximate NDV” scans are taking place to gather stats. Gathering histograms in 12c is much safer than it used to be in earlier versions of Oracle even in the case of the new hybrid histograms (which are still sampled on a very small sample and therefore still a stability risk).