June 2011

Calculate the Distance Between Two Latitude/Longitude Points using Plain SQL

June 13, 2011 (Modified June 14, 2011) A question recently appeared on the comp.databases.oracle.server Usenet group that oddly made me recall a lesson from a mathematics class that I taught in the early 1990s.  A bit strange how a question related to Oracle Database would trigger such a memory, but it happened.  The question posed [...]
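The excerpt is cut short here, but for reference, the usual "plain SQL" approach is the spherical law of cosines, which needs nothing beyond Oracle's built-in trig functions. A minimal sketch, not necessarily the post's own solution: the points table and its lat/lon columns are invented for illustration, 6371 is the mean Earth radius in km, and acos(-1) supplies pi for the degrees-to-radians conversion.

-- Hypothetical example: great-circle distance in km between two points
-- stored as decimal-degree columns, via the spherical law of cosines.
select 6371 * acos(
         sin(lat1 * acos(-1) / 180) * sin(lat2 * acos(-1) / 180)
       + cos(lat1 * acos(-1) / 180) * cos(lat2 * acos(-1) / 180)
         * cos((lon2 - lon1) * acos(-1) / 180)
       ) as distance_km
from   points;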

The Best Goal Ever !! (Fearless)

Australia (and Canberra specifically) had recently been suffering from two very long and difficult droughts. One had been a severe lack of rain, which left dams at record low levels. After many years, this ended earlier in the year with rain aplenty and with local dams at long last back at 100% capacity. The other drought however [...]

Oracle Open World 2011 – Suggest a Session

Well my lone abstract submission didn’t get selected at Open World this year. But apparently they have a second-chance system where you can “Suggest a Session” and users can vote on which papers they’d like to see on the agenda. I went ahead and suggested “Tuning Exadata” – it sounds like something you shouldn’t have to do, but remember that Exadata is not an appliance with few or no knobs to turn. It has all the power and options of an Oracle database, and there are certainly things that you can do wrong that will keep Exadata from performing at its best. So the paper is about how you can know for sure whether Exadata is doing what it should, and how to coerce it if you need to.

The mix.oracle.com site where this voting is supposed to take place is a little difficult to navigate (in my humble opinion) so here’s a direct link to the page where you can see the abstract (and vote if you deem it worthy). ;)

Tuning Exadata

You will have to log in with your Oracle Single Sign-On account (the one you use for My Oracle Support – or Metalink for the old guys), or I think you can create a separate account if you want. By the way, Andy Colvin has submitted an abstract for a talk on Exadata patching, which should be very informative if it gets picked. He’s done more Exadata patching than anyone I am aware of. Here’s a link to his abstract:

Rolling with the Punches – Adventures in Exadata Patching

There will undoubtedly be many deserving abstracts. For example, several of my OakTable brethren have suggested sessions as well. So please look around the site for others of interest. You can vote for as many as you want.

NLS, Part Deux

A guest post today, by Brian Ledbetter, a co-worker at Agilex:

On a customer’s database, we ran across a table that would not migrate.  It was admittedly a log table, containing long chunks of HTTP header data, but whenever we tried importing it into our 11gR2 database, we ended up getting:

IMP-00058: ORACLE error 1461 encountered
ORA-01461: can bind a LONG value only for insert into a LONG column

After looking at the table structure, the first thing we noticed was that there was a VARCHAR2(4000) column in the table. Considering that this column was already at the maximum size (in bytes) for a CHAR-based data type, it became the focus of our attention.

Looking online for solutions, we found references [1] suggesting that Oracle was implicitly converting this column to a VARCHAR2(4000 CHAR) type, creating a column that can contain up to 4 bytes per character.[2]  Because this overflows the 4000 byte limit on column length, Oracle then attempted to implicitly convert the datatype to a LONG VARCHAR2, which is apparently deprecated in 11gR2.[3]  (We’re not sure why Oracle is still trying to make this conversion, if that’s the case.)
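If you run into something similar, it may be worth confirming that character-length semantics are actually in play before pursuing this theory. A quick check – a sketch only, TABNAME is a placeholder (CHAR_USED shows ‘C’ for character semantics, ‘B’ for byte semantics):

select value
from   nls_session_parameters
where  parameter = 'NLS_LENGTH_SEMANTICS';

select column_name, data_type, data_length, char_length, char_used
from   user_tab_columns
where  table_name = 'TABNAME';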

Anyway, we tried precreating the table with a CLOB datatype, and that didn’t work either, so as a workaround, we created a copy of the table with the data trimmed to 1000 characters (leaving plenty of room after UTF8 conversion):

create table tabname_migtmp as select col1, col2, substr(col3,1,1000) col3 from tabname;

We then used exp/imp to copy tabname_migtmp over to the 11gR2 server, and inserted the data from it into the final location.
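The exp/imp step looked something like this (a sketch only – the connect strings, schema, and file names are placeholders, not the actual commands used):

exp scott/tiger@source tables=tabname_migtmp file=tabname_migtmp.dmp
imp scott/tiger@target tables=tabname_migtmp file=tabname_migtmp.dmp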

insert into tabname select * from tabname_migtmp;

drop table tabname_migtmp;

[1] http://forums.oracle.com/forums/thread.jspa?threadID=1038043

[2] http://stackoverflow.com/questions/5230346/char-semantics-and-ora-01461

[3] http://forums.oracle.com/forums/thread.jspa?threadID=2230351

See Also: Technote 444171.1, https://supporthtml.oracle.com/ep/faces/secure/km/DocumentDisplay.jspx?id=444171.1


Quiz Night

Here’s an interesting question from the OTN database forum:

“If I delete 90% of the rows from a table which has a few indexes, without rebuilding or coalescing indexes afterwards, will this improve the performance of index range scans?”

The thing that makes it interesting is the scope it gives you for imagining reasons why the performance won’t change, or might get better, or could get worse. So how about it – can you think of an argument for each of the three possibilities ?

Update – 15th June

The first thing you need to do when you see a question like this is start asking for more information – as the first comment did.

  • How big is the table – percentages are generally useless unless you also know the scale, and it’s possible that pure size might make a difference.
  • What’s the pattern of data in the table – that probably matters
  • Is the table partitioned – and does the delete pattern vary with partition
  • What’s the pattern of data across the indexes, that could make a lot of difference
  • Are any of the indexes partitioned (whether or not the table is)
  • Are all the indexes plain B-tree indexes – or bitmaps, or context, or spatial; and function-based indexes, reverse key indexes …
  • Are you deleting the data in lots of tiny batches with commits, or one big delete
  • And yes, as asked in the first comment – are you going to collect statistics before running queries, because that could make a big difference.
  • Timing is quite important – are you thinking only in the very short term (delayed block cleanout effects), or in the longer, stabilised term.
  • Are you assuming the data you delete is data that was not previously returned in the queries – but would that always matter anyway
  • And why are you asking about range scans, not general purpose queries – that rather avoids the problem of how query plans can change as stats change.

Maybe we could make some assumptions – though that can have dangerous consequences. We might guess that the driving idea behind this question is that there’s a load of old data that isn’t being queried any more and if we simply delete it our queries will, at worst, still show the same performance. So I’m going to take the line that (a) we’re expecting any queries to return the same result and (b) we’re going to be inserting new data over time.

Just picking up a few of the comments that have appeared so far, we can see how many details you might have to consider to think this question through properly.

Comment 1 says: “how about collecting stats” – it seems likely that new stats would appear soon (90% deletion is more than a 10% change, so if auto stats collection hasn’t been disabled …). If, for some indexes, 100% of the data has been deleted from 90% of the blocks, then the leaf_block count of the index will change by a factor of 10 (and that’s effectively a bug, by the way): a possible consequence is that Oracle will decide that an index fast full scan is appropriate in cases where it previously took a different path.
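A rough way to watch this happen for yourself – purely a sketch, with table, predicate, and numbers invented:

-- Hypothetical experiment: compare index stats across a 90% delete.
select index_name, num_rows, leaf_blocks
from   user_indexes
where  table_name = 'T1';

delete from t1 where id <= 900000;   -- the "90%" delete
commit;

exec dbms_stats.gather_table_stats(user, 'T1', cascade => true)

select index_name, num_rows, leaf_blocks
from   user_indexes
where  table_name = 'T1';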

Comment 3 picks up the case that if you had an index based on columns where the deleted data had null values in the relevant columns, then the 90% table deletion wouldn’t affect that index at all. This may sound a little extreme – but as a minor variation on the theme, the idea of creating function-based indexes which hold values only for “recent” data is quite well-known and very effective; the argument might be totally appropriate.

Comment 5 makes a fairly subtle and important point. If I used to have a range scan that examined 1,000 rows and returned 100 and the 90% of rows I deleted was exactly the 900 rows I didn’t return then there are two outcomes. If I had to visit the table 1,000 times to identify the 100 rows then the query will now be quicker; if I had been able to identify the 100 rows by examining 1,000 index entries then the query performance will be pretty much unchanged.

Comment 5 also has a rather nice thought on stopkeys (rownum <= N) – but it’s similar to Richard Foote’s min() example in comment 4 – I think it would only apply if the result were going to change.

Comment 9 makes an interesting point. If you delete a lot of data from a table you have no idea (until you look at the data patterns carefully, or until after you’ve collected the stats) of how the clustering_factor of the index will change. It will probably drop – though technically it could stay the same – but how far it drops compared to the change in the number of rows left in the table could make a big difference to the “selectivity * clustering_factor” bit of the index range scan calculation. Bear in mind that if your queries are returning the same result before and after the delete then your selectivities must have gone up by a factor of 10 because you’re returning the same volume of data from a total volume that has decreased by a factor of 10. (I’m assuming that the optimizer gets its stats right when I say this, of course).
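To put made-up numbers on that: suppose a query used to return 1,000 rows from 1,000,000 (selectivity 0.001) against an index with a clustering_factor of 100,000 – the table-access component of the range scan cost is roughly 0.001 * 100,000 = 100. After the delete the same 1,000 rows are 1% of the remaining 100,000 rows; if the clustering_factor has only dropped to 50,000, that component becomes roughly 0.01 * 50,000 = 500 – five times the cost for an identical result.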

Comment 6 brings up an important feature of bitmap indexes. When you delete a row from a table you delete the corresponding row from a b-tree index (eventually), but you update a bitmap index chunk, which means generating a new bitmap index chunk with one bit changed. So deleting 90% of the table data, however you did it, could result in a doubling of every bitmap index on the table. (Results will vary with version of Oracle and the strategy used to delete the data)
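If you want to measure the effect rather than take it on trust, segment size before and after the delete is a simple (if blunt) check – again just a sketch, with an invented index name:

-- Hypothetical check: run before and after the delete and compare.
select segment_name, blocks, bytes/1024/1024 mb
from   user_segments
where  segment_name = 'T1_BM_IX';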

Several comments picked up the issue of “delayed block cleanout” that’s likely to have an impact for a while after the big delete. There might be other intermittent effects later on, depending on where new data goes and how many leaf blocks were completely emptied by the delete – it is possible for some odd locking waits to appear as Oracle tries to find an empty block that can legally be moved for the split.

Personally, and ignoring the bitmap index threat, I think it’s safe to say that if your indexes are very well designed (so that eliminates most systems I’ve seen), then most queries will probably perform almost as efficiently after the delete (and delayed block cleanout) as they did before – provided the optimizer still picks the same execution plan: but the moment you collect up-to-date statistics you may see some staggering changes (in either direction) for some queries. (And if you don’t collect up-to-date statistics, at some point you will also see some staggering changes as Oracle starts bringing in “out of range” adjustments to costs.)

Bottom line: If you’re going to delete 90% of the data in a large table then you’ve got to think carefully about how to minimise the side effects.


Row Lock is Synonymous with TX Lock, and is a Lock on a Single Row of a Table – Verify the Documentation Challenge

June 10, 2011 I found an interesting quote in the Oracle Database documentation library: “Row Locks (TX) A row lock, also called a TX lock, is a lock on a single row of a table. A transaction acquires a row lock for each row modified by one of the following statements: INSERT, UPDATE, DELETE, MERGE, [...]
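The excerpt is cut short here, but the documentation’s claim is easy to probe with a two-session test – a sketch only, with an invented table name:

-- Session 1 (hypothetical): modify several rows in one transaction,
-- and do not commit.
update t1 set col1 = col1 where id in (1, 2, 3);

-- Session 2: V$LOCK shows a single TX entry for that transaction,
-- however many rows it modified.
select sid, type, id1, id2, lmode
from   v$lock
where  type = 'TX';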

Queueing Theory.

So simple that even an eight year old can understand it.

Here’s a link I was sent by Dominic Delmolino (another Oak Table member) a few days ago: it’s not a bad model of how things can go in an Oracle database.
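For what it’s worth, the punch line of such a model fits in one line (assuming a single-server M/M/1 queue, which may not be exactly what the link shows): with average service time S and utilisation U, average response time is R = S / (1 − U). At 50% busy a request takes 2S on average; at 90% busy it takes 10S – which is why a system that looks fine at moderate load can degrade so suddenly as it approaches saturation.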

For further reading, try Cary Millsap’s book – especially chapter 9.

IBM’s Adwords Typo…

I guess IBM have been buying Oracle-related AdWords as part of their marketing campaign. If you go on any Oracle sites using AdSense, you might see this advert doing the rounds.

If you are trying to discredit a competitor, the least you could do is spell their name correctly. I’ll put that down as a #fail for IBM… :)

Cheers

Tim…

Les Paul’s Birthday

Check out the Google page today. It would have been Les Paul’s 96th birthday.

Here’s a link (in case you can’t find Google on your own).

You still have a few hours to play around with it before the next logo shows up. (Maybe there will be a place to find it later.)

You can strum it or use the middle row of keys to play notes. Try j k l j k h j g h.

Fedora 16 may use Btrfs by default…

Looks like Fedora 16 might use Btrfs as the default filesystem.

I hope the “Oracle doesn’t understand Open Source” brigade remember where this project started. :)

Cheers

Tim…