This note is a reminder combined with a warning about unexpected changes as you move from version to version. Since it involves LOBs (large objects) it may not be relevant for most people but since there’s a significant change in the default character set for the database as you move up to 18.3 (or maybe even as you move to 12.2) anyone using character LOBs may get a surprise.
Here’s a simple script that I’m first going to run on an instance of 11.2.0.4:
rem rem Script: lob_length_4.sql rem Author: Jonathan Lewis rem Dated: Apr 2019 rem create table t1( n1 number, version varchar2(8), v1 varchar2(4000), c1 clob, nc1 nclob ) lob (c1) store as securefile c1_lob( enable storage in row chunk 8K cache logging ) lob (nc1) store as securefile nc1_lob( enable storage in row chunk 8K cache logging ) ; insert into t1 values( 0,'General', null, empty_clob(), empty_clob() ) ; insert into t1 values( 1,'11.2.0.4', rpad('x',1951,'x'), rpad('x',1951,'x'), rpad('x',1951,'x')) ; insert into t1 values( 2,'11.2.0.4', rpad('x',1952,'x'), rpad('x',1952,'x'), rpad('x',1952,'x')) ; insert into t1 values( 3,'12.1.0.2', rpad('x',1951,'x'), rpad('x',1951,'x'), rpad('x',1951,'x')) ; insert into t1 values( 4,'12.1.0.2', rpad('x',1952,'x'), rpad('x',1952,'x'), rpad('x',1952,'x')) ; insert into t1 values( 5,'12.2.0.1', rpad('x',1937,'x'), rpad('x',1937,'x'), rpad('x',1937,'x')) ; insert into t1 values( 6,'12.2.0.1', rpad('x',1938,'x'), rpad('x',1938,'x'), rpad('x',1938,'x')) ; insert into t1 values( 7,'18.3.0.0', rpad('x',1984,'x'), rpad('x',1984,'x'), rpad('x',1984,'x')) ; insert into t1 values( 8,'18.3.0.0', rpad('x',1985,'x'), rpad('x',1985,'x'), rpad('x',1985,'x')) ; begin dbms_stats.gather_table_stats( ownname => null, tabname => 'T1', method_opt => 'for all columns size 1' ); end; / break on version skip 1 on report compute avg of sys_op_opnsize(c1) on report compute avg of sys_op_opnsize(nc1) on report select version, n1, length(v1), length(c1), length(nc1), sys_op_opnsize(v1), sys_op_opnsize(c1), sys_op_opnsize(nc1) from t1 order by n1 ; select avg_row_len from user_tables where table_name = 'T1' ; select column_name, avg_col_len from user_tab_cols where table_name = 'T1' order by column_id ;
I’ve created a table holding a varchar2() a CLOB, and an NCLOB, then I’ve inserted some rows into that table, labelling the rows in pairs with what appear to be Oracle version numbers, with one extra row labelled “General” that holds the special “empty LOB“ value (note a NULL and an empty clob behave very differently). Then I’ve reported the lengths of the LOB columns in two different ways, once using the standard length() function (I could have used the dbms_lob.getlength() function) and once using the internal sys_op_opnsize() function that Oracle uses in its queries to gather table stats.
Here’s the output from the script. Since this first run is on 11.2.0.4 I want you to note, particularly, the numbers in the rows labelled 11.2.0.4
VERSION N1 LENGTH(V1) LENGTH(C1) LENGTH(NC1) SYS_OP_OPNSIZE(V1) SYS_OP_OPNSIZE(C1) SYS_OP_OPNSIZE(NC1) -------- ---------- ---------- ---------- ----------- ------------------ ------------------ ------------------- General 0 0 0 96 96 11.2.0.4 1 1951 1951 1951 1951 2048 3999 2 1952 1952 1952 1952 2049 86 12.1.0.2 3 1951 1951 1951 1951 2048 3999 4 1952 1952 1952 1952 2049 86 12.2.0.1 5 1937 1937 1937 1937 2034 3971 6 1938 1938 1938 1938 2035 3973 18.3.0.0 7 1984 1984 1984 1984 2081 86 8 1985 1985 1985 1985 2082 86 ******** ------------------ ------------------- avg 1835.77778 1820.22222 AVG_ROW_LEN ----------- 5410 COLUMN_NAME AVG_COL_LEN -------------------- ----------- N1 3 VERSION 9 V1 1740 C1 1837 NC1 1822
You might note, by the way, that the avg_col_len is the average (which I’ve reported for the CLOB and NCLOB columns) of the sys_op_opnsize() values rounded up plus 1 (for the “column count” byte); and – because the number of columns in the table is small – the avg_row_len is very similar to the sum of the avg_col_len.
You should also note that the storage size of an “empty” LOB is (or seems to be) 96 bytes in this version of Oracle. That’s quite a lot of space to say that there’s no data – but using an empty_[n]clob() to represent “nothing” does mean that you don’t have to code for “is null or is empty” and it may help you to avoid incorrect results as a consequence. In fact a block dump shows that the actual space usage for the empty_[n]clob() is only 30 bytes – so it’s not quite as bad as it seems. The error is probably based around the older code using the maximum possible length of a lob locator as an assumption rather than checking the actual size in the table.
It’s a minor variation of the standard warning: “odd little things change when you upgrade and some of them may have a big side effect”. Here are the results from 18.3 (which, amongst other details, defaults to a multi-byte character set – which I’ll check before I show you the LOB results).
SQL> select * from v$nls_parameters where parameter like '%CHARACTERSET%'; PARAMETER VALUE -------------------------------- -------------------- NLS_CHARACTERSET AL32UTF8 NLS_NCHAR_CHARACTERSET AL16UTF16
This affects the storage of CLOBs if you’ve previously been using a default single-byte character set – you’re suddenly going to find your LOB segments are twice as big as they used to be – even though any report of “length()” will be unchanged. But there are other changes. Here’s the output from the same script running on 18.3.0.0:
VERSION N1 LENGTH(V1) LENGTH(C1) LENGTH(NC1) SYS_OP_OPNSIZE(V1) SYS_OP_OPNSIZE(C1) SYS_OP_OPNSIZE(NC1) -------- ---------- ---------- ---------- ----------- ------------------ ------------------ ------------------- General 0 0 0 30 30 11.2.0.4 1 1951 1951 1951 1951 3933 3933 2 1952 1952 1952 1952 3935 3935 12.1.0.2 3 1951 1951 1951 1951 3933 3933 4 1952 1952 1952 1952 3935 3935 12.2.0.1 5 1937 1937 1937 1937 3905 3905 6 1938 1938 1938 1938 3907 3907 18.3.0.0 7 1984 1984 1984 1984 3999 3999 8 1985 1985 1985 1985 132 132 ******** ------------------ ------------------- avg 3078.77778 3078.77778 AVG_ROW_LEN ----------- 7912 COLUMN_NAME AVG_COL_LEN -------------------- ----------- N1 3 VERSION 9 V1 1740 C1 3080 NC1 3080
Looking at the details for “General” and “18.3.0.0” what can we see that’s changed?
First, the length of an empty_[n]clob is now reported correctly at 30 bytes. This might (if the value is also used internally) explain why a LOB can now be 33 characters (66 bytes) longer before it’s moved out of line – it’s a small difference, but you might be at a boundary condition where it makes a big difference (for good, or for bad – who can tell) or your system.
Secondly, the LOB metadata for an out of line LOB seems to have jumped from 86 bytes to 132 bytes. Like the empty_[n]clob() issue, this is an error. The actual space usage in row was 38 bytes – consisting of the basic 30 bytes “empty” metadata, 4 extra bytes overhead, and a 4-byte block address linking to the stored LOB value. (For larger LOBs you will probably see that the “tail” of the in-row data grows to accomodate a short list of block addresses and chunk information).
Finally, you’re allowed to guess that the rows labelled 12.2.0.1 are there to give you a hint that in that version the NCLOB moves out of line at 1,938 characters – but you’ll probably want to run the test on 12.2 to confirm that claim. This does mean, though, that an upgrade to 12.2 from a lower version might be a bigger threat than any upgrade to 18.3. You’ll also find that in 12.2 the empty_[n]clob() is reported with a length of 124. (Again, this may explain the change in the break-point for going out of line: 124 – 96 = 28, and (1952 – 28/2) = 1,938. The arithmetic is consistent with the incorrect reporting of the size of the empty LOB metadata.
So the bottom line is this – if you do a lot of work with LOBs, do some careful checking of how much space you are using, how much space you will use on the next install, how changes in character sets can affect numbers, how the statistics might change – even when collected in exactly the same way – and what this might do to execution plans.
This testing was all done using securefile LOBs – which has been the default for some time if you don’t specify the type of LOB you want to use. The numbers, particularly the break from in-row to out-of-row, are a little different if you switch to basicfile LOBs.
A quick test on LiveSQL (19.2) suggests that nothing has changed from 18.3 to 19.3
For further reading on quirky details of LOBs – here’s the article that has a footnote reporting another interesting change in LOB handling in 18.3; and a very old note about analysing the contents of a LOB segment with a view to assessing the impact of deleting old LOBs and potentially moving the LOB segment to a tablespace of a different size.
Recent comments
1 year 44 weeks ago
2 years 4 weeks ago
2 years 9 weeks ago
2 years 10 weeks ago
2 years 14 weeks ago
2 years 35 weeks ago
3 years 3 weeks ago
3 years 33 weeks ago
4 years 17 weeks ago
4 years 18 weeks ago