Since I’m starting employment with Pythian on Monday, this is the last technical post that is exclusive to this blog. Future technical tips, discoveries and stories will appear on the Pythian blog, and either copied here or linked to from here. BTW. I already made my first post in the Pythian blog today, as the editor of the Log Buffer – and I’m not even an employee yet
So, lets end this with a bang. Real life story on how Daylight Saving Time caused one query to take twice as long as before! And its not even an April Fools joke
Three days ago the customer called me and said: Listen, since daylight saving time started, this process that should take two hours is now taking five. My manager gets his morning report late and he is unhappy.
I told him that I’ll be glad to help with his performance issue, but this can’t possibly related to day light saving time. Besides, since the process runs every night but some nights the performance is fine, the problem is probably in the nature of the data that changes from day to day.
In any case, we started working on the problem with our usual methods: Instrumented the PL/SQL of the process so we can know exactly what part is slowing down the process, ran trace, analyzed the results of traces, looks at AWR reports, look at query plans, etc.
Eventually, we found a query, buried within a view, whose explain plan looked completely unreasonable – First of all, it included a bunch of “nested loops” and even “cartesian merge join” between tables that I knew were quite large. Second, those large tables had estimated number of rows of 1.
I checked DBA_TABLES and saw that num_rows was indeed 0, and last analyzed was yesterday at 10pm. This didn’t make any sense – I personally collected statistics for this table yesterday, it had around 1M rows and it was at 3pm.
Why does it show 0 rows at 10pm?
A quick check with the customer confirmed that the process truncates this table and copies data into it before our slow query runs. I assumed that “truncate table” resets the statistics and asked the customer to modify the process to analyze the table after all the data is copied in, but before the slow query runs.
But two questions still bothered me: Does truncate really resets statistics? And why did this problem start occurring after day light saving time?
The answer to the first question is “no”, truncate does not reset statistics:
SQL> analyze table sitescope.test1 compute statistics; Table analyzed. SQL> select table_name,num_rows from dba_tables where owner='SITESCOPE' and table_name='TEST1'; TABLE_NAME NUM_ROWS ------------------------------ ---------- TEST1 4 SQL> truncate table sitescope.test1; Table truncated. SQL> select table_name,num_rows from dba_tables where owner='SITESCOPE' and table_name='TEST1'; TABLE_NAME NUM_ROWS ------------------------------ ---------- TEST1 4
The answer to the second question was found using DBA_TAB_STATS_HISTORY.
Turns out that there is an automated process that collects statistics on this schema every night. Until DST, it ran every night at 9pm, before the process started running – so the table had data and the collected statistics were accurate. After DST, the process started running at 10pm (obviously the system running the process doesn’t know about DST). At 10pm the table was already truncated, but was not loaded yet, so the process collected statistics for a table that was temporarily empty.
Tada! Daylight saving time caused a severe performance problem!
P.S. No indexes were rebuilt while attempting to resolve the issue
Chen (Gwen) Shapira