Oakies Blog Aggregator

Dynamic Values in Linux Scripting

I do a LOT of scripting. Given the choice to click in a GUI vs. typing at the command line, I’ll choose the command line. Given the choice to type commands in repeatedly vs. scripting out a task I perform more than twice, I’ll script. Scripting effectively is an art as much as it’s a science.

(Image: my idea of science)

Where a GUI can change in both content and layout, a script is less impacted by those changes when it is designed to work dynamically with the catalog. You have the choice of either working with the values in an array or pulling them into a temporary file to use as part of the script. For this example, I’ll stick with the latter to make it easier to reproduce.
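
To make that choice concrete, here is a minimal sketch of both approaches, using the same Azure CLI call we’ll meet below (the location value is only an example):

# Option 1: pull the catalog into a temporary file we can grep/awk repeatedly
az sql db list-editions -l eastus -o table | grep True > wh.lst

# Option 2: hold the same rows in a Bash array instead
mapfile -t editions < <(az sql db list-editions -l eastus -o table | grep True)
echo "${#editions[@]} editions available in this region"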

Let’s start with a use case of deploying an Azure database. When a customer decides to build one out, specific information is needed for the deployment, and that information will continue to change as the Azure catalog is updated with new offerings. For our example, we’ll stick to a very small snippet of code, as the values we dynamically create will be reused throughout the script. This example skips past the actual server creation, etc., and focuses on the user database creation. The server, zone and subscription are all set in default steps earlier on, so they don’t have to be repeated in each resource deployment step.
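
Those earlier default steps aren’t shown in this post, but a minimal sketch of what they might look like is below; the subscription, group, location and server names here are placeholders, not the actual values:

# Hypothetical defaults section -- set once, reused by every later deployment step
az account set --subscription "My Subscription"            # pin the subscription for the session
az configure --defaults group=UTW_Group location=eastus    # default resource group and region
zone=eastus                                                # reused later by: az sql db list-editions -l $zone
servername=utw-sqlserver                                   # passed to later commands, e.g. --server $servername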

The first thing the author of a CLI BASH script needs to know is how to build a SQL Database from the command line, per the documentation:

az sql db create \
         --resource-group <resource-group> \
         --name <database-name> \
         --service-objective <service-objective> \
         --capacity <capacity> \
         --zone-redundant <true|false>

We then assess which parts will need to be updated dynamically for each deployment and change those to variables:

az sql db create \
         --resource-group $groupname \
         --name $holname"_"$use \
         --service-objective $sku \
         --capacity $cap \
         --zone-redundant false

For the snippet above, it was easy to decide that I wouldn’t be creating zone-redundant databases very often, so I hard-coded this to a value of false; the rest of the values I want to generate dynamically.

As part of its execution, my script will ask for the following name and then reuse it throughout the deployment:

Resource Group Name = $groupname

There is a deployment tag acronym that is used to generate the server name, database names and other resources. Each of the resources, depending on type, will then build out from there:

(holname = utw) + (use=DW) so the dbname= utw_DW
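
One way those prompts might look in the script (the prompt wording is illustrative; the deployment command itself builds the name inline as $holname"_"$use):

read -p "Resource group name: " groupname        # becomes $groupname
read -p "Deployment tag acronym (e.g. utw): " holname
read -p "Database use (e.g. DW): " use

dbname=${holname}_${use}                          # e.g. utw_DW
echo "Deploying $dbname into resource group $groupname"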

The intriguing values to work with are the sku and capacity. These can change regularly in Azure, so they need to be pulled dynamically from the Azure catalog, and this is a job my BASH script in the CLI does very well.

I first capture the information, asking Azure for all database editions that are available (True) for the zone value passed in. I output the information in table format to a file titled wh.lst:

az sql db list-editions -l $zone -o table | grep True > wh.lst
Example of file contents:
 P1     Premium        Premium                     125         DTU     True
 P2     Premium        Premium                     250         DTU     True
 P4     Premium        Premium                     500         DTU     True
 P6     Premium        Premium                     1000        DTU     True
 P11    Premium        Premium                     1750        DTU     True
 P15    Premium        Premium                     4000        DTU     True

I now have this file to pull information from as I proceed through my script. The next step is to “cat” the data from the file and then use an AWK command to pull the first and fourth columns from the table, creating a list of SKUs and DTU sizes for the person executing the script to choose from:

cat wh.lst | awk '{print "SKU:"$1,"DTU:"$4}' | tr -d \"\,
Output example:
 SKU:P1 DTU:125
 SKU:P2 DTU:250
 SKU:P4 DTU:500
 SKU:P6 DTU:1000
 SKU:P11 DTU:1750
 SKU:P15 DTU:4000
...
 SKU:DW400 DTU:3000
 SKU:DW400c DTU:3600
 SKU:DW500 DTU:3750
....

I ask the user to choose a sku from the list and enter it at the prompt. The sku chosen becomes the $brcksize value, and the DTU value for that sku becomes $cap, the capacity in my deployment step.

export cap=$(cat wh.lst | grep $brcksize" " | awk '{print $4}' | tr -d \"\,)

If we choose one of the skus from above for the SQL Warehouse, like DW400, you’ll notice it doesn’t confuse DW400c and return both, because I’ve instructed my command to match a trailing space (grep $brcksize" "). These are the details that have to be thought through and tested as you automate processes.
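
Putting the prompt and the lookup together, a sketch of this step might look like the following. The awk exact-match on the first column is an alternative to the trailing-space trick; it avoids partial matches without needing the extra space:

cat wh.lst | awk '{print "SKU:"$1,"DTU:"$4}'             # show the available choices
read -p "Enter the SKU to deploy: " brcksize             # e.g. P6 or DW400

# the trailing-space approach used above
export cap=$(cat wh.lst | grep "$brcksize " | awk '{print $4}')

# alternative: exact match on the first column only
export cap=$(awk -v sku="$brcksize" '$1 == sku {print $4}' wh.lst)
echo "SKU $brcksize maps to capacity $cap"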

For our example database, we’ll create a premium Azure DB, choosing the P6 sku:

cat wh.lst | grep P6" " | awk '{print $4}' | tr -d \"\,
Output from the above: 1000

We can now see that, with this process, the script resolves those variables into a fully qualified Azure CLI command to create a P6 database:

 az sql db create \
         --resource-group UTW_Group \
         --name utw_DW \
         --service-objective P6  \
         --capacity 1000 \
         --zone-redundant false 

I hope this short example demonstrates the value of dynamic Linux scripting: not just automating a task, but creating dynamic automation that can be reused over and over across any number of deployments, scaled to the needs of the project.



Copyright © DBAKevlar [Dynamic Values in Linux Scripting], All Rights Reserved. 2019.

Take care with automatic indexes

This one came in via an AskTOM question. You need to be careful when disabling constraints that are underpinned by an index, especially when it comes to the storage for that index.

Let’s take a look at a simple example. I’ll explicitly nominate the LARGETS tablespace for my index that will support the primary key.



SQL> create table t as
  2      select 1 x, 1 y from dual;

Table created.

SQL>
SQL> alter table t add constraint
  2    t_pk primary key ( x ) using index tablespace largets;

Table altered.

SQL>
SQL> select tablespace_name from user_segments
  2  where segment_name = 'T_PK';

TABLESPACE_NAME
--------------------
LARGETS

1 row selected.

SQL>
SQL> alter table t modify constraint t_pk disable;

Table altered.

SQL> select * from user_indexes where index_name = 'T_PK';

no rows selected

SQL> alter table t modify constraint t_pk enable;

Table altered.

Now let us look at where the index that is needed for that constraint has been re-created.


SQL> select tablespace_name from user_segments
  2  where segment_name = 'T_PK';
TABLESPACE_NAME
--------------------
USERS

The index is a quite distinct entity from the constraint, and unless you nominate a tablespace it will be created in the default tablespace for the schema, which in my case was USERS.

So take care – you might need to pre-create indexes or explicitly rebuild them with a tablespace specification before re-enabling those constraints.


SQL> create table t as
  2      select 1 x, 1 y from dual;

Table created.

SQL>
SQL> create unique index t_pk on t ( x ) tablespace largets;

Index created.

SQL> alter table t add constraint   t_pk primary key ( x ) using index ;

Table altered.

SQL>
SQL> select tablespace_name from user_segments
  2  where segment_name = 'T_PK';

TABLESPACE_NAME
--------------------
LARGETS

1 row selected.

SQL>
SQL> alter table t modify constraint t_pk disable;

Table altered.

SQL> select tablespace_name from user_segments
  2  where segment_name = 'T_PK';

TABLESPACE_NAME
--------------------
LARGETS

1 row selected.

SQL> alter table t modify constraint t_pk enable;

Table altered.

SQL>
SQL> select tablespace_name from user_segments
  2  where segment_name = 'T_PK';

TABLESPACE_NAME
--------------------
LARGETS

1 row selected.

 

Grab all the DDL

I posted a video a couple of days ago showing a trigger mechanism to customize the capture of DDL that is issued on your database. The aim here is to be more generous with letting developers execute DDL on their Development databases, whilst still having a thorough record of the changes that are occurring. Of course, it goes without saying, which is why I am saying it …

Bloom Filter Efficiency And Cardinality Estimates

I recently came across an interesting observation I've not seen documented yet, so I'm publishing a simple example here to demonstrate the issue.

In principle it looks like the efficiency of Bloom Filter operations depends on the optimizer's cardinality estimates. This means that cardinality under-estimates in particular can make a dramatic difference to how efficiently a corresponding Bloom Filter operation based on such an estimate works at runtime. Since Bloom Filters are crucial for efficient processing, in particular when using Exadata or the In-Memory column store, this can have a significant impact on the performance of affected operations.

While other operations based on SQL workareas, like hash joins for example, can also be affected by such cardinality mis-estimates, those seem to be capable of adapting at runtime, at least to a certain degree. However, I haven't seen such adaptive behaviour from Bloom Filter operations at runtime (not even when executing the same statement multiple times with statistics feedback not kicking in).

To demonstrate the issue I'll create two simple tables that get joined and one of them gets a filter applied:

create table t1 parallel 4 nologging compress
as
with
generator1 as
(
  select /*+
           cardinality(1e3)
           materialize
         */
         rownum as id
       , rpad('x', 100) as filler
  from
         dual
  connect by
         level <= 1e3
),
generator2 as
(
  select /*+
           cardinality(1e4)
           materialize
         */
         rownum as id
       , rpad('x', 100) as filler
  from
         dual
  connect by
         level <= 1e4
)
select
       id
     , id as id2
     , rpad('x', 100) as filler
from (
       select /*+ leading(b a) */
              (a.id - 1) * 1e4 + b.id as id
       from
              generator1 a
            , generator2 b
     )
;

alter table t1 noparallel;

create table t2 parallel 4 nologging compress as select * from t1;

alter table t2 noparallel;

All I did here is create two tables with 10 million rows each, and I'll look at the runtime statistics of the following query:

select /*+ no_merge(x) */ * from (
  select /*+
           leading(t1)
           use_hash(t2)
           px_join_filter(t2)
           opt_estimate(table t1 rows=1)
           --opt_estimate(table t1 rows=250000)
           monitor
         */
         t1.id
       , t2.id2
  from
         t1
       , t2
  where
         mod(t1.id2, 40) = 0
         -- t1.id2 between 1 and 250000
  and    t1.id = t2.id
) x
where rownum > 1;

Note: If you try to reproduce this, make sure you actually get a Bloom Filter operation - in an unpatched version 12.1.0.2 I had to add a PARALLEL(2) hint to get the Bloom Filter operation at all.

The query filters on T1 so that 250K rows will be returned and then joins to T2. The first interesting observation regarding the efficiency of the Bloom Filter is that the actual data pattern makes a significant difference: when using the commented filter "T1.ID2 BETWEEN 1 and 250000" the resulting cardinality will be the same as when using "MOD(T1.ID2, 40) = 0", but the former results in perfect filtering by the Bloom Filter regardless of the OPT_ESTIMATE hint used, whereas with the latter the efficiency will be dramatically different.

This is what I get when using version 18.3 (12.1.0.2 showed very similar results) and forcing the under-estimate using the OPT_ESTIMATE ROWS=1 hint - the output is from my XPLAN_ASH script and edited for brevity:

-------------------------------------------------------------------------------------
| Id  | Operation                | Name    | Rows | Bytes | Execs | A-Rows | PGA    |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |         |      |       |     1 |      0 |        |
|   1 |  COUNT                   |         |      |       |     1 |      0 |        |
|*  2 |   FILTER                 |         |      |       |     1 |      0 |        |
|   3 |    VIEW                  |         |    1 |    26 |     1 |   250K |        |
|*  4 |     HASH JOIN            |         |    1 |    24 |     1 |   250K | 12556K |
|   5 |      JOIN FILTER CREATE  | :BF0000 |    1 |    12 |     1 |   250K |        |
|*  6 |       TABLE ACCESS FULL  | T1      |    1 |    12 |     1 |   250K |        |
|   7 |      JOIN FILTER USE     | :BF0000 |  10M |  114M |     1 | 10000K |        |
|*  8 |       TABLE ACCESS FULL  | T2      |  10M |  114M |     1 | 10000K |        |
-------------------------------------------------------------------------------------

The Bloom Filter didn't help much, only a few rows were actually filtered (otherwise my XPLAN_ASH script would have shown "10M" as the actual cardinality instead of "10000K", which is something slightly less than 10M rounded up).

Repeat the same but this time using the OPT_ESTIMATE ROWS=250000 hint:

---------------------------------------------------------------------------------------------
| Id  | Operation                | Name    | Rows | Bytes |TempSpc| Execs | A-Rows | PGA    |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |         |      |       |       |     1 |      0 |        |
|   1 |  COUNT                   |         |      |       |       |     1 |      0 |        |
|*  2 |   FILTER                 |         |      |       |       |     1 |      0 |        |
|   3 |    VIEW                  |         | 252K | 6402K |       |     1 |   250K |        |
|*  4 |     HASH JOIN            |         | 252K | 5909K | 5864K |     1 |   250K | 12877K |
|   5 |      JOIN FILTER CREATE  | :BF0000 | 250K | 2929K |       |     1 |   250K |        |
|*  6 |       TABLE ACCESS FULL  | T1      | 250K | 2929K |       |     1 |   250K |        |
|   7 |      JOIN FILTER USE     | :BF0000 |  10M |  114M |       |     1 |   815K |        |
|*  8 |       TABLE ACCESS FULL  | T2      |  10M |  114M |       |     1 |   815K |        |
---------------------------------------------------------------------------------------------

So we end up with exactly the same execution plan but the efficiency of the Bloom Filter at runtime has changed dramatically due to the different cardinality estimate the Bloom Filter is based on.

I haven't spent much time yet with the corresponding undocumented parameters that might influence the Bloom Filter behaviour, but when I repeated the same test with the following setting in the session (and ensured an adequate PGA_AGGREGATE_TARGET setting, since otherwise the hash join might start spilling to disk, which means the Bloom Filter size is considered when calculating SQL workarea sizes):

alter session set "_bloom_filter_size" = 1000000;

I got the following result:

-------------------------------------------------------------------------------------
| Id  | Operation                | Name    | Rows | Bytes | Execs | A-Rows | PGA    |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |         |      |       |     1 |      0 |        |
|   1 |  COUNT                   |         |      |       |     1 |      0 |        |
|*  2 |   FILTER                 |         |      |       |     1 |      0 |        |
|   3 |    VIEW                  |         |    1 |    26 |     1 |   250K |        |
|*  4 |     HASH JOIN            |         |    1 |    24 |     1 |   250K | 12568K |
|   5 |      JOIN FILTER CREATE  | :BF0000 |    1 |    12 |     1 |   250K |        |
|*  6 |       TABLE ACCESS FULL  | T1      |    1 |    12 |     1 |   250K |        |
|   7 |      JOIN FILTER USE     | :BF0000 |  10M |  114M |     1 |   815K |        |
|*  8 |       TABLE ACCESS FULL  | T2      |  10M |  114M |     1 |   815K |        |
-------------------------------------------------------------------------------------

which shows a slightly increased PGA usage compared to the first output but the same efficiency as when having the better cardinality estimate in place.

Increasing the size further, however, I couldn't convince Oracle to make the Bloom Filter even more efficient, even when the better cardinality estimate was in place.

Summary

Obviously the efficiency / internal sizing of the Bloom Filter vector at runtime depends on the cardinality estimates of the optimizer. Depending on the actual data pattern this can make a significant difference in terms of efficiency. Yet another reason why having good cardinality estimates is a good thing and yet sometimes so hard to achieve, in particular for join cardinalities.

Footnote

On MyOracleSupport I've found the following note regarding Bloom Filter efficiency:

Bug 8932139 - Bloom filtering efficiency is inversely proportional to DOP (Doc ID 8932139.8)

Another interesting detail - the bug is fixed only in version 19.1, but the fix is also included in the latest RU(R)s of 18c and 12.2 from January 2019 onwards.

Chinar Aliyev's Blog

Chinar Aliyev has recently started to pick up on several of my blog posts regarding Parallel Execution and the corresponding new features introduced in Oracle 12c.

It is good to see that Oracle has obviously improved some of these since then and added new ones as well.

Here are some links to the corresponding posts:

New automatic Parallel Outer Join Null Handling in 18c

Improvements regarding automatic parallel distribution skew handling in 18c

Chinar has also put some more thoughts on the HASH JOIN BUFFERED operation:

New thoughts about the HASH JOIN BUFFERED operation

There are also a number of posts on his blog regarding histograms and in particular how to properly calculate the join cardinality in the presence of additional filters and resulting skew, which is a very interesting topic and yet to be handled properly by the optimizer even in the latest versions.

Thanks for your feedback Rafael.

Thanks for your feedback Rafael. I think this is for the 12c drivers, but it works on lower database versions, except for some bugs. So better test it…

Parse Calls

When dealing with the library cache / shared pool it’s always worth checking from time to time to see if a new version of Oracle has changed any of the statistics you rely on as indicators of potential problems. Today is also (coincidentally) a day when comments about “parses” and “parse calls” entered my field of vision from two different directions. I’ve tweeted out references to a couple of quirky little posts I did some years ago about counting parse calls and what a parse call may entail, but I thought I’d finish the day off with a little demo of what the session cursor cache does for you when your client code issues parse calls.

There are two bits of information I want to highlight – activity in the library cache and a number that shows up in the session statistics. Here’s the code to get things going:

rem
rem     Script:         12c_session_cursor_cache.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Apr 2019
rem
rem     Note:
rem     start_1.sql contains the one line
rem          select * from t1 where n1 = 0;
rem

create table t1 
as
select 99 n1 from dual
;

execute dbms_stats.gather_table_stats(user,'t1')

spool 12c_session_cursor_cache

prompt  =======================
prompt  No session cursor cache
prompt  =======================

alter session set session_cached_cursors = 0;

set serveroutput off
set feedback off

execute snap_libcache.start_snap
execute snap_my_stats.start_snap

execute snap_libcache.start_snap
execute snap_my_stats.start_snap

@start_1000

set feedback on
set serveroutput on

execute snap_my_stats.end_snap
execute snap_libcache.end_snap


prompt  ============================
prompt  Session cursor cache enabled
prompt  ============================


alter session set session_cached_cursors = 50;

set serveroutput off
set feedback off

execute snap_libcache.start_snap
execute snap_my_stats.start_snap

execute snap_libcache.start_snap
execute snap_my_stats.start_snap

@start_1000

set feedback on
set serveroutput on

execute snap_my_stats.end_snap
execute snap_libcache.end_snap

spool off

I’ve made use of a couple of little utilities I wrote years ago to take snapshots of my session statistics and the library cache (v$librarycache) stats. I’ve also used my “repetition” framework to execute a basic query 1,000 times. The statement is a simple “select from t1 where n1 = 0”, chosen to return no rows.

The purpose of the whole script is to show you the effect of running exactly the same SQL statement many times – first with the session cursor cache disabled (session_cached_cursors = 0) then with the cache enabled at its default size.

Here are some results from an instance of 12.2.0.1 – which I’ve edited down by eliminating most of the single-digit numbers.

=======================
No session cursor cache
=======================
---------------------------------
Session stats - 23-Apr 17:41:06
Interval:-  4 seconds
---------------------------------
Name                                                                         Value
----                                                                         -----
Requests to/from client                                                      1,002
opened cursors cumulative                                                    1,034
user calls                                                                   2,005
session logical reads                                                        9,115
non-idle wait count                                                          1,014
session uga memory                                                          65,488
db block gets                                                                2,007
db block gets from cache                                                     2,007
db block gets from cache (fastpath)                                          2,006
consistent gets                                                              7,108
consistent gets from cache                                                   7,108
consistent gets pin                                                          7,061
consistent gets pin (fastpath)                                               7,061
logical read bytes from cache                                           74,670,080
calls to kcmgcs                                                              5,005
calls to get snapshot scn: kcmgss                                            1,039
no work - consistent read gets                                               1,060
table scans (short tables)                                                   1,000
table scan rows gotten                                                       1,000
table scan disk non-IMC rows gotten                                          1,000
table scan blocks gotten                                                     1,000
buffer is pinned count                                                       2,000
buffer is not pinned count                                                   2,091
parse count (total)                                                          1,035
parse count (hard)                                                               8
execute count                                                                1,033
bytes sent via SQL*Net to client                                           338,878
bytes received via SQL*Net from client                                     380,923
SQL*Net roundtrips to/from client                                            1,003

PL/SQL procedure successfully completed.

---------------------------------
Library Cache - 23-Apr 17:41:06
Interval:-      4 seconds
---------------------------------
Type      Cache                           Gets        Hits Ratio        Pins        Hits Ratio   Invalid    Reload
----      -----                           ----        ---- -----        ----        ---- -----   -------    ------
NAMESPACE SQL AREA                       1,040       1,032   1.0       1,089       1,073   1.0         0         1
NAMESPACE TABLE/PROCEDURE                   17          16    .9         101          97   1.0         0         0
NAMESPACE BODY                               9           9   1.0          26          26   1.0         0         0
NAMESPACE SCHEDULER GLOBAL ATTRIBU          40          40   1.0          40          40   1.0         0         0

PL/SQL procedure successfully completed.

The thing to notice, of course, is the large number of statistics that are (close to) multiples of 1,000 – i.e. the number of executions of the SQL statement. In particular you can see the ~1,000 “parse count (total)” which is not reflected in the “parse count (hard)” because the statement only needed to be loaded into the library cache and optimized once.

The other notable statistics come from the library cache where we do 1,000 gets and pins on the “SQL AREA” – the “get” creates a “KGL Lock” (the “breakable parse lock”) that is made visible as an entry in v$open_cursor (x$kgllk), and the “pin” creates a “KGL Pin” that makes it impossible for anything to flush the child cursor from memory while we’re executing it.

So what changes when we enable the session cursor cache:


============================
Session cursor cache enabled
============================

Session altered.

---------------------------------
Session stats - 23-Apr 17:41:09
Interval:-  3 seconds
---------------------------------
Name                                                                         Value
----                                                                         -----
Requests to/from client                                                      1,002
opened cursors cumulative                                                    1,004
user calls                                                                   2,005
session logical reads                                                        9,003
non-idle wait count                                                          1,013
db block gets                                                                2,000
db block gets from cache                                                     2,000
db block gets from cache (fastpath)                                          2,000
consistent gets                                                              7,003
consistent gets from cache                                                   7,003
consistent gets pin                                                          7,000
consistent gets pin (fastpath)                                               7,000
logical read bytes from cache                                           73,752,576
calls to kcmgcs                                                              5,002
calls to get snapshot scn: kcmgss                                            1,002
no work - consistent read gets                                               1,000
table scans (short tables)                                                   1,000
table scan rows gotten                                                       1,000
table scan disk non-IMC rows gotten                                          1,000
table scan blocks gotten                                                     1,000
session cursor cache hits                                                    1,000
session cursor cache count                                                       3
buffer is pinned count                                                       2,000
buffer is not pinned count                                                   2,002
parse count (total)                                                          1,002
execute count                                                                1,003
bytes sent via SQL*Net to client                                           338,878
bytes received via SQL*Net from client                                     380,923
SQL*Net roundtrips to/from client                                            1,003

PL/SQL procedure successfully completed.

---------------------------------
Library Cache - 23-Apr 17:41:09
Interval:-      3 seconds
---------------------------------
Type      Cache                           Gets        Hits Ratio        Pins        Hits Ratio   Invalid    Reload
----      -----                           ----        ---- -----        ----        ---- -----   -------    ------
NAMESPACE SQL AREA                           5           5   1.0       1,014       1,014   1.0         0         0
NAMESPACE TABLE/PROCEDURE                    7           7   1.0          31          31   1.0         0         0
NAMESPACE BODY                               6           6   1.0          19          19   1.0         0         0

PL/SQL procedure successfully completed.

The first thing to note is that “parse count (total)” still shows 1,000 parse calls. However we also see the statistic “session cursor cache hits” at 1,000. Allowing for a little noise around the edges, virtually every parse call has turned into a short-cut that takes us through the session cursor cache directly to the correct cursor.

This difference shows up in the library cache activity where we still see 1,000 pins – we have to pin the cursor to execute it – but we no longer see 1,000 “gets”. In the absence of the session cursor cache the session has to keep searching for the statement then creating and holding a KGL Lock while we execute the statement – but when the cache is enabled the session will very rapidly recognise that the statement is one we are likely to re-use, so it will continue to hold the KGL lock after we have finished executing the statement and we can record the location of the KGL lock in a session state object. After the first couple of executions of the statement we no longer have to search for the statement and attach a spare lock to it, we can simply navigate from our session state object to the cursor.

As before, the KGL Lock will show up in v$open_cursor – though this time it will not disappear between executions of the statement. Over the history of Oracle versions the contents of v$open_cursor have become increasingly helpful, so I’ll just show you what the view held for my session by the end of the test:


SQL> select cursor_type, sql_text from V$open_cursor where sid = 250 order by cursor_type, sql_text;

CURSOR_TYPE                                                      SQL_TEXT
---------------------------------------------------------------- ------------------------------------------------------------
DICTIONARY LOOKUP CURSOR CACHED                                  BEGIN DBMS_OUTPUT.DISABLE; END;
DICTIONARY LOOKUP CURSOR CACHED                                  BEGIN snap_libcache.end_snap; END;
DICTIONARY LOOKUP CURSOR CACHED                                  BEGIN snap_my_stats.end_snap; END;
DICTIONARY LOOKUP CURSOR CACHED                                  SELECT DECODE('A','A','1','2') FROM SYS.DUAL
OPEN                                                             begin         dbms_application_info.set_module(
OPEN                                                             table_1_ff_2eb_0_0_0
OPEN-RECURSIVE                                                    SELECT VALUE$ FROM SYS.PROPS$ WHERE NAME = 'OGG_TRIGGER_OPT
OPEN-RECURSIVE                                                   select STAGING_LOG_OBJ# from sys.syncref$_table_info where t
OPEN-RECURSIVE                                                   update user$ set spare6=DECODE(to_char(:2, 'YYYY-MM-DD'), '0
PL/SQL CURSOR CACHED                                             SELECT INDX, KGLSTTYP LTYPE, KGLSTDSC NAME, KGLSTGET GETS, K
PL/SQL CURSOR CACHED                                             SELECT STATISTIC#, NAME, VALUE FROM V$MY_STATS WHERE VALUE !
SESSION CURSOR CACHED                                            BEGIN DBMS_OUTPUT.ENABLE(1000000); END;
SESSION CURSOR CACHED                                            BEGIN DBMS_OUTPUT.GET_LINES(:LINES, :NUMLINES); END;
SESSION CURSOR CACHED                                            BEGIN snap_libcache.start_snap; END;
SESSION CURSOR CACHED                                            BEGIN snap_my_stats.start_snap; END;
SESSION CURSOR CACHED                                            select * from t1 where n1 = 0
SESSION CURSOR CACHED                                            select /*+ no_parallel */ spare4 from sys.optstat_hist_contr

17 rows selected.

The only one of specific interest is the penultimate one in the output – its type is “SESSION CURSOR CACHED” and we can recognise our “select from t1” statement.

Bitmap Index On Column With 212552698 Distinct Values, What Gives? (I’d Rather Be High)

In my previous post on Indexing The Autonomous Warehouse, I highlighted how it might be necessary to create indexes to improve the performance and scalability of highly selective queries, as it might on any Data Warehouse running on an Exadata platform. In the post, I created a Bitmap Index and showed how improve SQL performance […]

You should set OCSID.CLIENTID

each time you grab an Oracle JDBC connection from the pool

For troubleshooting and monitoring performance, you want to follow what happens from the end-user to the database. It is then mandatory to identify the end-user and application from the database session. With Oracle there are some ‘dbms_application_info’ strings to be set, like MODULE, ACTION and CLIENT_INFO. That’s about the tasks in the application code (like identifying the Java class or method from which the SQL statement is prepared) but that’s not about the end-user.

And you should forget about the CLIENT_INFO which is not very useful and rather misleading. OCSID.MODULE and OCSID.ACTION are set from JDBC with Connection.setClientInfo (One reason I find the CLIENT_INFO name misleading is that it cannot be set with setClientInfo). Of course, you can also call ‘dbms_application_info.set_module’ but that’s an additional call to the database (which means network latency, OS context switch,…). Using the JDBC setClientInfo with the OCSID namespace sends this information with the next call.

Now, about identifying the end-user, there’s the session CLIENT_ID (aka CLIENT_IDENTIFIER) that you can also set with Connection.setClientInfo (OCSID.CLIENTID). This one is visible in many Oracle views and follows the database links. Here is an example, I create a demo user and a database link:

connect sys/oracle@//localhost/PDB1 as sysdba
drop public database link PDB1@SYSTEM;
grant dba to demo identified by demo;
create public database link PDB1@SYSTEM connect to SYSTEM
identified by oracle using '//localhost/PDB1';

The following JavaScript (run from SQLcl) connects with a JDBC Thin driver, sets OCSID.MODULE, OCSID.ACTION and OCSID.CLIENTID, and displays CLIENT_IDENTIFIER, MODULE and ACTION from V$SESSION:

script
var DriverManager = Java.type("java.sql.DriverManager");
var con = DriverManager.getConnection(
  "jdbc:oracle:thin:@//localhost/PDB1","demo","demo"
);
con.setAutoCommit(false);
function showSessionInfo(){
  var sql=con.createStatement();
  var res=sql.executeQuery("\
    select client_identifier,service_name,module,action,value \
    from v$session \
    join v$mystat using(sid) \
    join v$statname using(statistic#) \
    where name='user calls' \
  ");
  while(res.next()){
    print();
    print(" CLIENT_IDENTIFIER: "+res.getString(1));
    print(" SERVICE: "+res.getString(2));
    print(" MODULE: "+res.getString(3));
    print(" ACTION: "+res.getString(4));
    print(" User Calls: "+res.getInt(5));
    print();
  }
}
showSessionInfo();
con.setClientInfo('OCSID.CLIENTID','my Client ID');
con.setClientInfo('OCSID.MODULE','my Module');
con.setClientInfo('OCSID.ACTION','my Action');
showSessionInfo();
// run a statement through DBLINK:
var sql=con.createStatement();
sql.executeUpdate("call dbms_output.put_line@PDB1@SYSTEM(null)");

I also display the ‘user calls’ from V$MYSTAT. Here is the output:

SQL> .
CLIENT_IDENTIFIER: null
SERVICE: pdb1
MODULE: JDBC Thin Client
ACTION: null
User Calls: 4
CLIENT_IDENTIFIER: my Client ID
SERVICE: pdb1
MODULE: my Module
ACTION: my Action
User Calls: 5

The second execution sees the MODULE, ACTION and CLIENT_IDENTIFIER set by the previous setClientInfo(). Most importantly, the ‘user calls’ statistic has been incremented by only one, which means that setting them did not add any additional roundtrips to the database server.

Now, after the call through database link, I display all user sessions from V$SESSION. I can see my SQLcl (java) with nothing set, the JDBC thin session with MODULE, ACTION and CLIENT_IDENTIFIER, and the DBLINK session (connected to SYSTEM) with only the CLIENT_IDENTIFIER set:

SQL> select username,client_identifier,module,action
  2  from v$session where type='USER';

USERNAME   CLIENT_IDENTIFIER   MODULE                   ACTION
__________ ___________________ ________________________ ____________
SYSTEM     my Client ID        oracle@db192
SYS                            java@db192 (TNS V1-V3)
DEMO       my Client ID        my Module                my Action

Following the end-user down to all layers (application, database, remote databases) is great for end-to-end troubleshooting and performance analysis. Set this OCSID.CLIENTID to identify the application (micro-)service and the end-user (like a browser Session ID), for no additional cost, and you will find this information in many performance views:

select table_name, listagg(distinct column_name,', ') 
within group (order by column_name)
from dba_tab_columns
where column_name in ('CLIENT_IDENTIFIER','CLIENT_INFO','CLIENT_ID','MODULE','ACTION')
--and table_name like 'GV%'
group by table_name
order by 2;

You see how the ‘CLIENT_INFO’ is useless (except for an additional level to module/action for SQL Monitor) and how CLIENT_ID(ENTIFIER) is everywhere, including ASH (Active Session history).

With a micro-services architecture, you will have many connections to the database (don't tell me that each microservice has its own database: databases were invented decades ago, when streaming data everywhere was an un-maintainable, un-scalable, error-prone mess, and schemas and views were invented to provide these data micro-services within the same database system). Then the best practice is to:

  • connect with a dedicated SERVICE_NAME
  • identify the end-user with a CLIENT_ID

and then end-to-end tracing, tuning and troubleshooting will become easy.

Not Just the How of AD with Linux VM/SQL 2019, but the WHY

Azure Active Directory authentication is available with SQL Server 2019 on Linux in preview. As I was setting it up in my Azure environment on a Red Hat 7.3 Linux VM I was, as many are, happy that the documentation lists the Azure CLI commands to set up authentication with Azure Active Directory, but I was concerned that, with so many people new to Linux, the steps don't describe WHY we are running certain commands, nor do they set best practices around Linux database server design.

The setup expects that you already have a Linux VM and SQL Server 2019 up and running. The first step they go into is role assignment for the AD login, setting the AD login up as the VM Administrator.

az vm extension set \
    --publisher Microsoft.Azure.ActiveDirectory.LinuxSSH \
    --name AADLoginForLinux \
    --resource-group myResourceGroup \
    --vm-name myVM

The above command expects you to replace “myResourceGroup” and “myVM” with the correct values for the resource group and Linux VM.

The next step then proceeds to run more AZ commands, but it also expects the DBA or administrator to be familiar with Linux and Linux scripting techniques, after NOT using them in the first command. I find this assumption leaves us all open to human error:

(See the screenshot of these commands in the Microsoft documentation: https://docs.microsoft.com/en-us/azure/virtual-machines/linux/login-using-aad)

Let’s talk about variables in an OS environment. In Windows, we can set these at the command line, either for a single session or persistently as part of login scripts, and we can also set them in the GUI under My Computer, advanced settings, environment variables. The same can be done in Linux, either at the command line, via login scripts referred to as run command files (such as .bashrc), or in profiles owned by individual logins.

To set these, you simply enter a unique word and assign it a value. Some flavors of Linux require you to "export" or "set" the variable, not just state it:

export VARIABLE=value
set VARIABLE=value

You can then verify the variable is set by using the ECHO command and calling the variable with a $ sign before it:

echo $VARIABLE

You can just as easily remove them with the "unset" command, by overwriting them with new values for the variable keyword, by logging out, or with other profile/run command scripts.
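
Put together, a quick session at the prompt might look like this (the variable name is just an example):

export DEPLOY_REGION=eastus    # set and export the variable for this session
echo $DEPLOY_REGION            # prints: eastus
unset DEPLOY_REGION            # remove it again
echo $DEPLOY_REGION            # now prints an empty line -- the variable is gone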

We can use the set and echo command process to make more sense of what is being performed as part of the Linux AD authentication instructions to the VM, too:

username=$(az account show --query user.name --output tsv) 
 
echo $username 
kgorman@microsoft.com
vm=$(az vm show --resource-group SQL2019_grp --name SQL2019RH1 --query id -o tsv)
echo $vm
/subscriptions/00aa000e-xx00x-xxx-x00-x00x-xx00x/resourceGroups/SQL2019_grp/providers/Microsoft.Compute/virtualMachines/SQL2019RH1

Now we can use these variables as part of the third command and know exactly what is being passed into it, since we were able to dynamically push the Azure CLI output into the final command:

 az role assignment create --role "Virtual Machine Administrator Login"     --assignee $username --scope $vm  

The output from the command shows that it pulled the original variables into the third command to complete the requirements to build the role assignment. I did a mass replace to protect the IDs, but you get the idea of how to verify that the values were passed in:

{
  "canDelegate": null,
  "id": "/subscriptions/00aa000e-xx00x-xxx-x00-x00x-xx00x/resourceGroups/SQL2019_grp/providers/Microsoft.Compute/virtualMachines/SQL2019RH1/providers/Microsoft.Authorization/roleAssignments/0e00e0b0-bf00-0000-b000-000d0e0fe0d0",
  "name": "0e00e0b0-bf00-0000-b000-000d0e0fe0d0",
  "principalId": "d0c00cba-0c0b-00f0-b00e-d00000000000",
  "resourceGroup": "SQL2019_grp",
  "roleDefinitionId": "/subscriptions/00aa000e-xx00x-xxx-x00-x00x-xx00x/providers/Microsoft.Authorization/roleDefinitions/0c0000c0-00e0-0000-0000-ea0c00e000e0",
  "scope": "/subscriptions/00aa000e-xx00x-xxx-x00-x00x-xx00x/resourceGroups/SQL2019_grp/providers/Microsoft.Compute/virtualMachines/SQL2019RH1",
  "type": "Microsoft.Authorization/roleAssignments"
}

A good update for this doc would be to first explain how and why we set variables, and then use default configurations for the session so that fewer commands have to be entered:

username=$(az account show --query user.name --output tsv)
az configure --defaults group=SQL2019_grp
az configure --defaults vm=SQL2019RH1
vm=$(az vm show --query id -o tsv)

az vm extension set \
    --publisher Microsoft.Azure.ActiveDirectory.LinuxSSH \
    --name AADLoginForLinux


"name": "AADLoginForLinux",
   "protectedSettings": null,
   "provisioningState": "Succeeded",
   "publisher": "Microsoft.Azure.ActiveDirectory.LinuxSSH",
   "resourceGroup": "SQL2019_grp",
   "settings": null,
   "tags": null,
   "type": "Microsoft.Compute/virtualMachines/extensions",
   "typeHandlerVersion": "1.0",
   "virtualMachineExtensionType": "AADLoginForLinux"

And in the end, we can check our work to verify we’ve done everything correctly:

az vm list --query '[].{Name:name, OS:storageProfile.osDisk.osType, Admin:osProfile.adminUsername}' --output table

You should see your new AD login now in the list.
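
If you would rather check the role assignment itself than scan the VM list, something along these lines should also work, reusing the $username and $vm variables from earlier (this is a suggestion, not part of the original instructions):

az role assignment list --assignee $username --scope $vm --output table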

This doesn’t take you too far into Linux scripting, but hopefully it answers WHY you are running the commands displayed in the instructions, and why you run them in the order that you do.



Copyright © DBAKevlar [Not Just the How of AD with Linux VM/SQL 2019, but the WHY], All Rights Reserved. 2019.