(This is based on some ‘issues’ a client was having, so the example is architected to make a particular point, but is relevant nonetheless)
In the distributed database world, we often have different names for the same piece of information.
Consider two tables which hold customer information. In one table, they called the customer key "REF_NUM", and in the other table "CUST_ID".
That’s not ideal, but it’s not a big drama because at least the data types align.
SQL> desc T1
Name Null? Type
------------------------------------------- -------- ----------------------
REF_NUM NOT NULL NUMBER(10)
T1_DATA VARCHAR2(1000)
SQL> desc T2
Name Null? Type
------------------------------------------- -------- ----------------------
CUST_ID NOT NULL NUMBER(10)
T2_DATA VARCHAR2(1000)When I do a join between these two tables, it costs me about 9400 I/O’s
SQL> select *
2 from t1, t2
3 where t1.ref_num = t2.cust_id
4 and t1.t1_data like '100%';
Statistics
----------------------------------------------------------
0 recursive calls
1 db block gets
9408 consistent gets
0 physical readsConsider now the same two tables, with exactly the same data, but the data types are not aligned. In one system, they used a numeric, and in the other system they used a string
SQL> desc T1
Name Null? Type
------------------------------------------- -------- ----------------------
REF_NUM NOT NULL NUMBER(10)
T1_DATA VARCHAR2(1000)
SQL> desc T2
Name Null? Type
------------------------------------------- -------- ----------------------
CUST_ID NOT NULL VARCHAR2(10)
T2_DATA VARCHAR2(1000)Now look what happens to my SQL
SQL> select *
2 from t1, t2
3 where t1.ref_num = t2.cust_id
4 and t1.t1_data like '100%';
Statistics
----------------------------------------------------------
0 recursive calls
2 db block gets
18428 consistent gets
0 physical readsIt costs DOUBLE the amount of I/O’s – it runs twice as slowly.
Moreover, it’s not just performance that is at risk…You’re whole application might just start to crash randomly
Using the same tables above (with the conflicting data types) here’s a SQL that works fine "today"…
SQL> select count(*)
2 from t1, t2
3 where t1.ref_num = t2.cust_id
4 and t2.t2_data like '100%';
COUNT(*)
----------
71Then a pesky user comes along and does what users do…adds data to the one of the tables.
SQL> insert into T2 values ('X123','100some data');
1 row created.And now let’s re-run that same query again
SQL> select count(*)
2 from t1, t2
3 where t1.ref_num = t2.cust_id
4 and t2.t2_data like '100%';
ERROR: ORA-01722: invalid numberKer-splat….your application is toast…
Not being disciplined with data type selection causes pain later.
My previous post on the new TABLE_CACHED_BLOCKS statistics gathering preference certainly generated some interest My blog hits for the week have gone off the charts !! One of the concerns raised by this new capability was that setting such a preference might result in really unrealistic and inaccurate Clustering Factor (CF) values, especially for those […]![]()
Large Pages in Linux are a really interesting topic for me as I really like Linux and trying to understand how it works. Large pages can be very beneficial for systems with large SGAs and even more so for those with large SGA and lots of user sessions connected.
I have previously written about the benefits and usage of large pages in Linux here:
So now as you may know there is a change to the init.ora parameter “use_large_pages” in 11.2.0.3. The parameter can take these values:
SQL> select value,isdefault 2 from V$PARAMETER_VALID_VALUES 3* where name = 'use_large_pages' VALUE ISDEFAULT -------------------- -------------------- TRUE TRUE AUTO FALSE ONLY FALSE FALSE FALSE
There is a new value named “auto” that didn’t exist prior to 11.2.0.3. The intention is to create large pages at instance startup if possible, even if /etc/sysctl.conf doesn’t have an entry for vm.nr_hugepages at all. The risk though is that-as with dynamic creation of large pages by echoing values into /proc/sys/vm/nr_hugepages-is that you get fewer than you expect. Maybe even 0. Now I’m interested to see if that works.
So let’s have a look, my system is Oracle Linux 6.4, 64bit running virtualised. Before any database was started I checked /proc/meminfo
[root@ol64 ~]# cat /proc/meminfo MemTotal: 8192240 kB MemFree: 5090124 kB Buffers: 67408 kB Cached: 2341504 kB SwapCached: 0 kB Active: 816116 kB Inactive: 2055352 kB Active(anon): 548760 kB Inactive(anon): 284304 kB Active(file): 267356 kB Inactive(file): 1771048 kB Unevictable: 0 kB Mlocked: 0 kB SwapTotal: 524284 kB SwapFree: 524284 kB Dirty: 60 kB Writeback: 0 kB AnonPages: 462560 kB Mapped: 334424 kB Shmem: 370516 kB Slab: 103692 kB SReclaimable: 47496 kB SUnreclaim: 56196 kB KernelStack: 2016 kB PageTables: 26008 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 4620404 kB Committed_AS: 3343896 kB VmallocTotal: 34359738367 kB VmallocUsed: 26480 kB VmallocChunk: 34359700348 kB HardwareCorrupted: 0 kB AnonHugePages: 247808 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB DirectMap4k: 8128 kB DirectMap2M: 8380416 kB
I am interested in the HugePages entries here towards the end of the list. If you have ever looked at /proc/meminfo in the previous release’s kernels (2.6.18.x to be precise) then you’ll notice it’s quite different now with a lot more information. Modern kernels are really a great step ahead. Have a look at the Outlook and References section below, this is a somewhat superficial explanation but good enough for the purpose of this article. A future post will go into more detail about SYSFS which is slated to replace the /proc file system and NUMA considerations.
Back to this article … the database I have running in my VM doesn’t use large pages, as shown in the alert.log:
Starting ORACLE instance (normal) ****************** Large Pages Information ***************** Total Shared Global Region in Large Pages = 0 KB (0%) Large Pages used by this instance: 0 (0 KB) Large Pages unused system wide = 0 (0 KB) (alloc incr 16 MB) Large Pages configured system wide = 0 (0 KB) Large Page size = 2048 KB RECOMMENDATION: Total Shared Global Region size is 2514 MB. For optimal performance, prior to the next instance restart increase the number of unused Large Pages by atleast 1257 2048 KB Large Pages (2514 MB) system wide to get 100% of the Shared Global Region allocated with Large pages ***********************************************************
So let’s change that, but dynamically and not manually. Again, a better (more predictable!) approach would be to manually add an additional 1257 large pages to /etc/sysctl.conf as recommended and reboot to ensure that they will be available when the database starts. And probably set use_large_pages to “only” to enforce their usage. But enough warnings that you probably don’t want to use the “auto” feature, I want to see this in real life!
SQL> alter system set use_large_pages=auto;
alter system set use_large_pages=auto
*
ERROR at line 1:
ORA-02095: specified initialization parameter cannot be modified
SQL> a scope=spfile;
1* alter system set use_large_pages=auto scope=spfile
SQL> /
System altered.
As you can see the parameter is static and requires an instance restart, so this is what I did next. Here is an interesting side effect of setting the parameter to “auto”: it doesn’t have an effect if you didn’t prepare the system for use of large pages in /etc/security/limits.conf. You could think that the oracle-preinstall RPM does so, but it misses the settings for “memlock”. Here is proof nothing happened:
[root@ol64 ~]# cat /proc/meminfo MemTotal: 8192240 kB MemFree: 5090124 kB Buffers: 67408 kB Cached: 2341504 kB SwapCached: 0 kB Active: 816116 kB Inactive: 2055352 kB Active(anon): 548760 kB Inactive(anon): 284304 kB Active(file): 267356 kB Inactive(file): 1771048 kB Unevictable: 0 kB Mlocked: 0 kB SwapTotal: 524284 kB SwapFree: 524284 kB Dirty: 60 kB Writeback: 0 kB AnonPages: 462560 kB Mapped: 334424 kB Shmem: 370516 kB Slab: 103692 kB SReclaimable: 47496 kB SUnreclaim: 56196 kB KernelStack: 2016 kB PageTables: 26008 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 4620404 kB Committed_AS: 3343896 kB VmallocTotal: 34359738367 kB VmallocUsed: 26480 kB VmallocChunk: 34359700348 kB HardwareCorrupted: 0 kB AnonHugePages: 247808 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB DirectMap4k: 8128 kB DirectMap2M: 8380416 kB
HugePagesTotal is still 0, which didn’t surprise me. To allow oracle to lock memory you need to grant it the privilege. I had to edit /etc/security/limits.conf and set the memlock parameter to 5GB which is too high for my 2.5 GB SGA but setting the value a little too high doesn’t hurt at all either. The value is in kb by the way.
oracle soft memlock 5242880 oracle hard memlock 5242880
After logging out and back in as oracle I tried once more and hey-success!
Starting ORACLE instance (normal) DISM started, OS id=11969 ****************** Large Pages Information ***************** Parameter use_large_pages = AUTO Total Shared Global Region in Large Pages = 2514 MB (100%) Large Pages used by this instance: 1257 (2514 MB) Large Pages unused system wide = 0 (0 KB) (alloc incr 16 MB) Large Pages configured system wide = 1257 (2514 MB) Large Page size = 2048 KB Time taken to allocate Large Pages = 0.130804 sec *********************************************************** LICENSE_MAX_SESSION = 0
Also notice the DISM process here which is responsible for creating the large pages on the fly. This is an interesting “background process”, and Tanel Poder has mentioned it in one of his presentations already:
[root@ol64 ~]# ps -ef | grep 11969 root 11969 1 0 12:50 ? 00:00:00 ora_dism_ora11 root 12026 11911 0 12:50 pts/3 00:00:00 grep 11969 [root@ol64 bin]# ls -l $ORACLE_HOME/bin/oradism -rwsr-x---. 1 root oinstall 71758 Sep 17 2011 oradism
It is owned by root with the setuid flag set … easy to miss when cloning a home …
Once the large pages are created, the process disappears when you start the instance a second time, and there is no mention of it in the alert.log pertaining to the startup sequence. But it has done its work.
[root@ol64 ~]# grep -i page /proc/meminfo AnonPages: 346808 kB PageTables: 12852 kB AnonHugePages: 208896 kB HugePages_Total: 1257 HugePages_Free: 1045 HugePages_Rsvd: 1045 HugePages_Surp: 0 Hugepagesize: 2048 kB
Notice that not all pages are actually in use yet, I only just started the database. Don’t worry though, 100% of the SGA are allocated in large pages as per the alert.log. Over time you will notice more and more pages being in use.
Now you can of course force the database to touch all these pages, but it’s another question whether that is a good idea. You probably don’t want to do so if you have a large SGA, the startup time can be very long. For the sake of completeness I added this here though to show you the effect in /proc/meminfo. I set pre_page_sga = true and bounced the instance:
[root@ol64 ~]# grep -i page /proc/meminfo AnonPages: 370632 kB PageTables: 15272 kB AnonHugePages: 206848 kB HugePages_Total: 1257 HugePages_Free: 3 HugePages_Rsvd: 3 HugePages_Surp: 0 Hugepagesize: 2048 kB [root@ol64 ~]#
Now all pages are allocated straight after instance start. If you want to follow the example, I suggest you use the watch command as shown here:
watch grep -i page /proc/meminfo
Summary and a bit of warning
I personally wouldn’t rely on use_large_pages = auto in an environment I care about. It’s simply too unpredictable that you get the large pages requested, and you might fall back into 4k page mode. Planning is better than hoping-calculate the number of large pages beforehand, add them to /etc/sysctl.conf in vm.nr_hugepages and you should be almost guaranteed to have them allocated. Large pages need enough contiguous memory or otherwise the allocation may (partially) fail.
Also – large pages cannot be swapped out during memory pressure. Don’t forget you still need enough space for the PGAs and operating system! If the system starts swapping although “free” shows a lot of free memory then most likely you are using up all the 4k pages in memory.
Outlook
There is even more to be said about large pages on systems with more than 2 sockets, especially when it comes to allocating large pages per NUMA node. I’ll leave that for a future post.
Oh and yes, the large page information in /proc is only a legacy, it’s now all in SYSFS in /sys/kernel/mm/hugepages/hugepages-2048kB. Intel x86-64 supports three different page sizes: 1GB, 2048kb and 4kb.
Reference
Dominic Brooks published a note recently about some very nasty SQL – originally thinking that it was displaying a run-time problem due to the extreme number of copies of the lnnvl() function the optimizer had produced. In fact it turned out to be a parse-time problem rather than a run-time problem, but when I first read Dominic’s note I was sufficiently surprised that I decided to try modelling the query.
Unfortunately the query had more than 1,000 predicates, (OR’ed together) and some of them included in-lists. Clearly, writing this up by hand wasn’t going to be a good idea, so I wrote a script to generate both the data, and the query, as follows – first a table to query:
create table t1
as
with generator as (
select --+ materialize
rownum id
from dual
connect by
level <= 1e4
)
select
rownum id1,
rownum id2,
rownum id,
lpad(rownum,10) v1,
rpad('x',100) padding
from
generator v1,
generator v2
where
rownum <= 1e5
;
create index t1_i1 on t1(id1, id2);
begin
dbms_stats.gather_table_stats(
ownname => user,
tabname =>'T1',
method_opt => 'for all columns size 1'
);
end;
/
Then a piece of code to write a nasty query:
set pagesize 0
set feedback off
set termout off
spool temp1.sql
prompt select * from t1 where 1 = 2
select
'or (id1 = ' || rownum || ' and id2 = ' || (rownum + 1) || ')'
from
t1
where
rownum <= 750
union all
select
'or ( id1 = ' || (rownum + 1000) || ' and id2 in (' || rownum || ',' || (rownum+1) || '))'
from
t1
where
rownum <= 250
;
prompt /
spool off
Here’s an example of the text generated by the code – with the parameters set to 5 and 3 respectively (and notice how I’ve rigged the query so that it doesn’t return any data, whatever the optimizer thinks):
select * from t1 where 1 = 2 or (id1 = 1 and id2 = 2) or (id1 = 2 and id2 = 3) or (id1 = 3 and id2 = 4) or (id1 = 4 and id2 = 5) or (id1 = 5 and id2 = 6) or ( id1 = 1001 and id2 in (1,2)) or ( id1 = 1002 and id2 in (2,3)) or ( id1 = 1003 and id2 in (3,4)) /
So here’s the plan from the above query:
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8 | 1008 | 16 (0)| 00:00:01 |
| 1 | CONCATENATION | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID | T1 | 1 | 126 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | T1_I1 | 1 | | 2 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID | T1 | 1 | 126 | 3 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | T1_I1 | 1 | | 2 (0)| 00:00:01 |
| 6 | TABLE ACCESS BY INDEX ROWID | T1 | 1 | 126 | 3 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | T1_I1 | 1 | | 2 (0)| 00:00:01 |
| 8 | INLIST ITERATOR | | | | | |
| 9 | TABLE ACCESS BY INDEX ROWID| T1 | 5 | 630 | 7 (0)| 00:00:01 |
|* 10 | INDEX RANGE SCAN | T1_I1 | 5 | | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("ID1"=1003)
filter("ID2"=3 OR "ID2"=4)
5 - access("ID1"=1002)
filter((LNNVL("ID1"=1003) OR LNNVL("ID2"=3) AND LNNVL("ID2"=4)) AND
("ID2"=2 OR "ID2"=3))
7 - access("ID1"=1001)
filter((LNNVL("ID1"=1002) OR LNNVL("ID2"=2) AND LNNVL("ID2"=3)) AND
(LNNVL("ID1"=1003) OR LNNVL("ID2"=3) AND LNNVL("ID2"=4)) AND ("ID2"=1 OR
"ID2"=2))
10 - access(("ID1"=1 AND "ID2"=2 OR "ID1"=2 AND "ID2"=3 OR "ID1"=3 AND
"ID2"=4 OR "ID1"=4 AND "ID2"=5 OR "ID1"=5 AND "ID2"=6))
filter((LNNVL("ID1"=1001) OR LNNVL("ID2"=1) AND LNNVL("ID2"=2)) AND
(LNNVL("ID1"=1002) OR LNNVL("ID2"=2) AND LNNVL("ID2"=3)) AND
(LNNVL("ID1"=1003) OR LNNVL("ID2"=3) AND LNNVL("ID2"=4)))
As you can see, the first five predicates end up in line 10 of the plan with 10 repetitions (5 * 2) of the lnnvl() function. The last three predicates show up in lines 3, 5, and 7 – and the on each line we see two more lnnvl() calls than on the previous – just imagine, then, how many lnnvl() calls the optimizer will have added to the query plan by the time we have 750 occurrences in the inlist iterator (line 8) and 250 occurrences of the slightly complex predicate. Here are the relevant CPU stats (from v$session_stats) from running the generated script on 11.1.0.7, on Windows 32-bit, 2.8GHz CPU:
Name Value ---- ----- recursive cpu usage 1,848 CPU used when call started 1,854 CPU used by this session 1,854 DB time 1,870 parse time cpu 1,847 parse time elapsed 1,862
Clearly the parse time is extreme – though not as dramatic as in Dominic’s example; but having set up the first draft of the sample code it’s easy enough to change the number of occurrences of each type of predicate, and it’s pretty easy to make longer in-lists in the more complex of the two types of predicate. It’s not too difficult to get an execution plan that mimics Dominic’s in length and time to parse.
It’s not just the parse times that are interesting when you start doing this, by the way – it’s worth playing around to see what happens. It’s probably best to run the query to pull the plans from memory if you want to see the plans, though – if you try using “explain plan” then you start using memory in the SGA for some of the work: in one of my examples I had to abort the instance after a few minutes.
I’d like to dedicate this posting to fellow Oak Table member Richard Foote, for reasons that the readers we have in common will immediately recognise: http://www.youtube.com/watch?v=KaOC9danxNo
The singer is Canadian astronaut Commander Chris Hadfield who has been tweeting and posting pictures from space – be careful, you may get hooked: https://twitter.com/Cmdr_Hadfield/status/332819772989378560/photo/1
When I posted the link to the video it had received 1.5M views; less than 24 hours later it’s up to roughly 7M. (And they weren’t all Richard Foote). Clearly the images have caught the imagination of a lot of people. If you have looked at the twitter stream it’s equally inspiring – and not just for the pictures.
I’ve got an existing package called DEMO as below
SQL> create or replace package demo is 2 -- used to pass list of numbers 3 TYPE t_num_list IS TABLE OF NUMBER INDEX BY BINARY_INTEGER; 4 5 PROCEDURE p( 6 p_list1 IN t_num_list 7 ); 8 end; 9 / Package created.
but what I’d like to do is add another collection parameter to it, whilst keeping backward compatibility
I could use a overloaded version, or I can default that second parameter to null. Lets explore the second option:
SQL> create or replace package demo is 2 -- used to pass list of numbers 3 TYPE t_num_list IS TABLE OF NUMBER INDEX BY BINARY_INTEGER; 4 5 PROCEDURE p( 6 p_list1 IN t_num_list, 7 p_list2 IN t_num_list default null 8 ); 9 end; 10 / Warning: Package created with compilation errors. SQL> sho err Errors for PACKAGE DEMO: LINE/COL ERROR -------- --------------------------------------------------------------- 5/4 PL/SQL: Declaration ignored 7/54 PLS-00382: expression is of wrong type
Well that’s no good. What we need to do is ensure that the types are consistent, like this:
SQL> create or replace package demo is 2 -- used to pass list of numbers 3 TYPE t_num_list IS TABLE OF NUMBER INDEX BY BINARY_INTEGER; 4 5 PROCEDURE p( 6 p_list1 IN t_num_list, 7 p_list2 IN t_num_list default cast(null as t_num_list) 8 ); 9 end; 10 / Package created.
But just remember, in doing so, when you reference the parameter inside your package body, it is no longer null in the ‘conventional’ sense, as the following little example demonstrates
SQL> create or replace package body demo is
2
3 PROCEDURE p(
4 p_list1 IN t_num_list,
5 p_list2 IN t_num_list default cast(null as t_num_list)
6 ) is
7 begin
8 if p_list2 is null then
9 dbms_output.put_line('I am null');
10 end if;
11
12 if p_list2.count = 0 then
13 dbms_output.put_line('I am empty');
14 end if;
15 end;
16 end;
17 /
Package body created.
SQL> set serverout on
SQL> declare
2 x demo.t_num_list;
3 begin
4 demo.p(x,x);
5 end;
6 /
I am empty
PL/SQL procedure successfully completed.
On Thursday I’ll be flying out to Bulgaria for BGOUG Spring 2013. It’s been about 18 months since I’ve visited the people over there, so I’m really looking forward to getting stuck in.
This will be my first conference of the year, so I’m feeling a little nervous at the moment. I’m sure the adrenalin rush will kick in and get me through.
I’m signed up for the southern leg of the OTN Tour of Latin America (Chile, Peru, Uruguay, Argentina, Brazil), but it will be a while before I get any confirmation, so there are no guarantees yet.
Fun, fun, fun…
Cheers
Tim…
I wrote a couple of days ago about replacing my MacBook Pro hard drive with SSD. At the same time I bought a little SSD to use as the system drive for my desktop. I fitted that this morning, installed a fresh copy of Fedora 18 and mounted the original 1TB hard drive as a data drive.
Like the MacBook Pro, my desktop is a few years old, but still has plenty of grunt (Quad Core and 8G RAM) for what I need it for. I do run the odd VM on it, but any heavy stuff is run on my server, so there is no incentive to go out an buy the latest kit for what is essentially just a client PC.
The addition of the SSD means the start up time is a much better and it just feels a lot more responsive. Most apps start up almost instantly. Even GIMP, which used to take an age to start, is mega quick. I’ve put a couple of VMs on it and not surprisingly, they are fast to start up too. Overall I’m really pleased with the outcome.
The funny thing is, I never noticed how noisy spinning rust was until I switched to these SSDs. The Mac is silent and runs for a lot longer before the fan kicks in. The desktop is also silent, until I pull something from the data disk, at which point I hear that slight grinding noise.
I don’t think I would invest in large capacity SSDs for home until the prices drop considerably, but having witnessed the before and after results on these two old machines, I can’t imagine ever running without an SSD system disk again.
Cheers
Tim…
PS. While I was reconfiguring my desktop I tried out Dnsmasq. Much simpler than BIND.
Update: I worked through some of the suggestions here to enable TRIM support and reduce wear.
A couple of weeks ago I started a competition to win 2 copies of Oracle E-Business Suite R12 Integration and OA Framework Development and Extension Cookbook by Andy Penver. Thanks to Packt for donating the prizes. The competition closed yesterday and the lucky winners are:
I’ve sent your email addresses to my contact at Packt, who will contact you to deliver your e-book.
Cheers
Tim…
The above is so cool.
The graphic shows the latency heatmap of “log file sync”. I was running a swingbench load and at the same time throttling I/O such that latencies started off good then got worse and then back to normal.
All I did was type
sqlplus / as sysdba @OraLatencyMap_event 3 "log file sync"
This was created by Luca Canali , see http://externaltable.blogspot.com/2013/05/latency-heat-map-in-sqlplus-with.htm
Now if we combine this monitoring, with the I/O throttling documeted by Frits Hoogland here https://fritshoogland.wordpress.com/2012/12/15/throttling-io-with-linux/ , we can really have some fun and even draw latency words:
Run an auto refresh color coded heatmap on “log file sync” in sqlplus by typing
sqlplus / as sysdba @OraLatencyMap_event 3 "log file sync"
where OraLatencyMap_event.sql and OraLatencyMap_internal.sql are your current directory or sqlpath
Now to play with LGWR latency with cgroup throttles see
https://fritshoogland.wordpress.com/2012/12/15/throttling-io-with-linux/
# install cgroups on 2.6.24 LINUX or higher yum intall cgroup # setup /cgroup/blkio grep blkio /proc/mounts || mkdir -p /cgroup/blkio ; mount -t cgroup -o blkio none /cgroup/blkio cgcreate -g blkio:/iothrottle # find the device you want df -k # my Oracle log file divice was ls -l /dev/mapper/vg_source-lv_home lrwxrwxrwx. 1 root root 7 May 1 21:42 /dev/mapper/vg_source-lv_home -> ../dm-2 # my device points to /dev/dm-2 ls -l /dev/dm-2 brw-rw----. 1 root disk 253, 2 May 1 21:42 /dev/dm-2 # my device major and minor numbers are "253, 2" # create a write throtte on this device (for read just replace "write" with "read" # this limits it to 10 writers per second cgset -r blkio.throttle.write_iops_device="253:2 10" iothrottle # look for lgwr ps -ef | grep lgwr oracle 23165 1 0 13:35 ? 00:00:19 ora_lgwr_o1123 # put lgwr pid into throttle group echo 23165 > /cgroup/blkio/iothrottle/tasks # now play with different throttles cgset -r blkio.throttle.write_iops_device="253:2 1" iothrottle cgset -r blkio.throttle.write_iops_device="253:2 10" iothrottle cgset -r blkio.throttle.write_iops_device="253:2 100" iothrottle cgset -r blkio.throttle.write_iops_device="253:2 1000" iothrottle # if you are finished then delete the throttle control group cgdelete blkio:/iothrottle
Recent comments
16 weeks 5 days ago
26 weeks 3 days ago
28 weeks 1 day ago
31 weeks 3 days ago
33 weeks 5 days ago
43 weeks 1 day ago
44 weeks 5 days ago
45 weeks 5 days ago
45 weeks 6 days ago
48 weeks 4 days ago