Thursday, June 13, 2013

Today I had to perform batch partition creation. Unfortunately for me the partitions are range based, and they want me to create partitions that fall below the existing high bound. This means I couldn't simply feed a range to a stored procedure, because it would throw ORA-14080; each partition has to be carved out with SPLIT PARTITION instead. So I created a shell script that did all of the work for me:


#!/bin/sh
# Base settings (adjust to your own starting values): DAY2 is the julian day
# of the existing partition, INDEX counts the iterations.
INDEX=1
DAY2=160

> split.sql
while [ $INDEX -le 160 ]
do
    DAY2=`expr $DAY2 - 1`
    DAY=`expr $DAY2 - 1`
    INDEX=`expr $INDEX + 1`

    PAD=`printf "%03d" $DAY`
    PAD2=`printf "%03d" $DAY2`

    echo "ALTER TABLE BAF_CDRS_LERG" >> split.sql
    echo "SPLIT PARTITION Y2012_D$PAD2\_S2"  >> split.sql
    echo "AT (TO_DATE('2012$PAD', 'YYYYDDD'), 3)"  >> split.sql
    echo "INTO (PARTITION Y2012_D$PAD\_S2, PARTITION Y2012_D$PAD2\_S2);"  >> split.sql
    echo '' >> split.sql
done

sed 's/\\//g' split.sql > splitfinal.sql


Note that the DAY and DAY2 variables hold the base day values. Since we use julian days, the PAD and PAD2 variables apply the zero padding via printf, and INDEX counts the iterations. I had to use \ as an escape character so that the shell would read $PAD rather than treating $PAD_S2 as one variable name (underscores are legal in variable names). The sed command then strips the escape character from the final script; it's only there because of the echo.
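The quoting trick in isolation: a tiny runnable demo of the printf padding and the backslash/sed dance (the names here are just for illustration):

```shell
# printf zero-pads the julian day to three digits
PAD=`printf "%03d" 36`
# The backslash stops the shell from reading $PAD_S2 as one variable name;
# sed strips it back out afterwards
echo "PARTITION Y2012_D$PAD\_S2" > demo.sql
sed 's/\\//g' demo.sql
# -> PARTITION Y2012_D036_S2
```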

Monday, October 8, 2012

OCS Listener Start

Oracle Cluster Services is complex for no reason. I discovered while troubleshooting our RAC that the server wasn't starting up automatically because OCS wasn't being allowed to start the listener. To manually start the listener via OCS I used srvctl, something along the lines of:

srvctl start listener -n <node_name>

I also ran a command that permanently added the listener start to the OCS startup but I can't remember what it was for the life of me. If anyone knows please comment!

RAC listener config

I discovered something important regarding the configuration of a RAC listener. Each node must have its own VIP address listed for the listener. In this case we had 2 regular host entries for the servers and then 2 VIP addresses for the nodes. All of this was configured in the /etc/hosts files and then mirrored into listener.ora and tnsnames.ora. It still doesn't resolve my intermittent ORA-600 loader drop, but it at least gets things running as they should be.

        (ADDRESS = (PROTOCOL = TCP)(HOST = NODE1-vip)(PORT = 1521)(IP = FIRST))
        (ADDRESS = (PROTOCOL = TCP)(HOST = NODE2-vip)(PORT = 1521)(IP = FIRST))

Tuesday, September 25, 2012

Using Enterprise Manager I was able to determine how to manipulate the scheduler via the command line. In this case I stretched a maintenance window out to 420 minutes with DBMS_SCHEDULER.SET_ATTRIBUTE; the window name below is a placeholder, use your own:

BEGIN
  DBMS_SCHEDULER.SET_ATTRIBUTE(
        name      => 'WEEKNIGHT_WINDOW',
        attribute => 'DURATION',
        value     => numtodsinterval(420, 'minute'));
END;
/


It's been far too long! So here's a little somethin': a fatal NI disconnect due to timeout on a beq connection. I suspect that the server is extremely busy and that the connection simply times out due to lack of server resources. The solution was easier than it used to be:

[oracle@slinc-nams] /home/oracle> lsnrctl

LSNRCTL for Linux: Version - Production on 25-SEP-2012 10:34:52

Copyright (c) 1991, 2007, Oracle.  All rights reserved.

Welcome to LSNRCTL, type "help" for information.

LSNRCTL>  set inbound_connect_timeout 0
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
LISTENER parameter "inbound_connect_timeout" set to 0
The command completed successfully
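Note that a value set with lsnrctl only lasts until the listener is restarted (unless you follow it with save_config). To make it permanent the same setting goes in listener.ora (assuming the listener is actually named LISTENER):

```
INBOUND_CONNECT_TIMEOUT_LISTENER = 0
```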

Wednesday, April 18, 2012

Reclaiming SYSAUX

SYSAUX was growing out of control. First, check how long optimizer stats history is being retained:

SQL> select dbms_stats.get_stats_history_retention from dual;

Take a full backup of the database before going any further.

SQL> alter table WRI$_OPTSTAT_TAB_HISTORY move;
SQL> alter table WRI$_OPTSTAT_OPR move;
SQL> alter table WRI$_OPTSTAT_IND_HISTORY move;
SQL> alter table WRI$_OPTSTAT_AUX_HISTORY move;
SQL> select index_name from dba_indexes where status='UNUSABLE';

SQL> set long 10000
SQL> set pagesize 10000
SQL> select dbms_metadata.get_ddl('INDEX','I_WRI$_OPTSTAT_IND_OBJ#_ST','SYS') from dual;
Use the extracted DDL to rebuild any UNUSABLE indexes, then throw the portion below into an SQL script. DEALLOCATE UNUSED only releases space above each segment's high-water mark, so it may free up very little.
alter table WRI$_OPTSTAT_TAB_HISTORY deallocate unused;
alter table WRI$_OPTSTAT_OPR deallocate unused;
alter table WRI$_OPTSTAT_IND_HISTORY deallocate unused;
alter table WRI$_OPTSTAT_HISTHEAD_HISTORY deallocate unused;
alter table WRI$_OPTSTAT_HISTGRM_HISTORY deallocate unused;
alter table WRI$_OPTSTAT_AUX_HISTORY deallocate unused; 

alter index I_WRI$_OPTSTAT_TAB_OBJ#_ST deallocate unused;
alter index I_WRI$_OPTSTAT_OPR_STIME deallocate unused;
alter index I_WRI$_OPTSTAT_IND_ST deallocate unused;
alter index I_WRI$_OPTSTAT_H_ST deallocate unused;
alter index I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST deallocate unused;
alter index I_WRI$_OPTSTAT_HH_ST deallocate unused;
alter index I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST deallocate unused;
alter index I_WRI$_OPTSTAT_IND_OBJ#_ST deallocate unused;
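Deallocating only treats the symptom. The lasting fix is shrinking the stats history retention itself and purging what falls outside it; a sketch, with 14 days as an arbitrary example value:

```sql
-- lower optimizer stats history retention to 14 days, then purge older history
exec dbms_stats.alter_stats_history_retention(14);
exec dbms_stats.purge_stats(SYSDATE - 14);
```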

Thursday, March 1, 2012

WARNING: Subscription for node down event still pending

If you're seeing the above warning in the listener log of a single instance database, then edit the listener.ora and add the following parameter, substituting your own listener name:

SUBSCRIBE_FOR_NODE_DOWN_EVENT_<listener_name> = OFF

Then reload or restart the listener.
By default, the listener subscribes to the Oracle Notification Service (ONS) node down event on startup, if ONS is available. This subscription enables the listener to remove the affected service when it receives node down event notification from ONS. The listener uses asynchronous subscription for the event notification. This is not needed for single instance databases.

Wednesday, February 22, 2012

Archivelog Alternate Destination

If the archivelog destination of an Oracle database fills to 100%, then the database halts until space becomes available. If you run a high availability database then it's probably a good idea to give the database a secondary location to write to in the event of a full disk.


In this example I will change the LOG_ARCHIVE_DEST_n and LOG_ARCHIVE_DEST_STATE_n parameters to enable a secondary location. Note that I had to specify "LOCATION=". If you aren't writing to disk then obviously these would look a little bit different.

Also note that at the end of the DEST_1 parameter value that "ALTERNATE" is specified. This alternate value dictates which destination will be written to in case of failure. You can string multiple locations together like this.

Finally you will also notice that the STATE_n parameters are set to ENABLE and ALTERNATE (respectively). If you have multiple alternate locations then each of them is marked ALTERNATE. Please note that the MANDATORY flag is needed on the primary destination so that archiving there must succeed before the online log can be reused.
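A sketch of what this ends up looking like in the spfile; the paths and destination numbers here are examples, not the originals:

```sql
ALTER SYSTEM SET LOG_ARCHIVE_DEST_1='LOCATION=/u01/arch MANDATORY ALTERNATE=LOG_ARCHIVE_DEST_2' SCOPE=BOTH;
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='LOCATION=/u02/arch' SCOPE=BOTH;
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_1=ENABLE SCOPE=BOTH;
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ALTERNATE SCOPE=BOTH;
```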

Thursday, February 16, 2012

Shutting Down RDBMS, ASM, EM, Listener

sqlplus "/ as sysdba"
shutdown immediate (shutdown abort if you're hung)

sqlplus "/ as sysASM"
shutdown immediate

$ORACLE_HOME/bin/emctl stop dbconsole
$ORACLE_HOME/bin/lsnrctl stop

su - root
$ORACLE_HOME/bin/crsctl stop crs

SQL Tuning Advisor

While you may be familiar with Enterprise Manager's graphical user interface, it is not always available; for instance, some remote connections may not allow you to reach the Enterprise Manager ports. There are several reasons you may not be able to use Enterprise Manager to tune your SQL with a few mouse clicks, so here is how it's done manually in a SQL*Plus session.

First off this information can be found within the Oracle Knowledge Base
NOTE: 271196.1 - Automatic SQL Tuning (SQL Profiles)
NOTE: 262687.1 - How to use the SQL Tuning Advisor

Obviously we're tuning specific SQL statements that are problematic. I usually uncover the offending SQL using ASH, AWR and ADDM reports; ASH will probably be enough for most basic needs. You can find the ASH report script at $ORACLE_HOME/rdbms/admin/ashrpt.sql, which is also where AWR and ADDM (among others) reside. I recommend using the HTML output as it makes it easier to read. Download the report, find your most problematic tunable SQL, and write down its SQL_ID (or hash value).

Now that we know the SQL_ID that we would like to run through the advisor we should grant the ability to use the advisor to our application user:

sqlplus "/ as sysdba"
grant advisor, select_catalog_role to <app_user>;
grant execute on DBMS_SQLTUNE to <app_user>;

sqlplus <app_user>/<password>

DECLARE
  my_task_name VARCHAR2(30);
  my_sqlid     VARCHAR2(30);
BEGIN
  my_sqlid := '&sql_id';
  my_task_name := DBMS_SQLTUNE.CREATE_TUNING_TASK(
        sql_id      => my_sqlid,
        scope       => 'COMPREHENSIVE',
        time_limit  => 600,
        task_name   => 'my_task_1',
        description => 'Task to tune a query');
END;
/

Once you execute the above statement you will be prompted for the SQL_ID. Enter it now. At this point you've now created the task that will examine the SQL and run it through the advisor. Time to execute it.

EXEC DBMS_SQLTUNE.EXECUTE_TUNING_TASK( task_name => 'my_task_1');

Now we've successfully run the SQL statement through the advisor and generated a result. The report is very long and takes a little formatting to be readable: set long, linesize, pagesize, etc. in order to read it clearly. It should contain a recommendation for improved performance and the steps involved to implement the changes.
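Pulling the report itself looks like this (the formatting values are just a starting point):

```sql
SET LONG 1000000 LONGCHUNKSIZE 1000000 LINESIZE 200 PAGESIZE 1000
SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK('my_task_1') FROM dual;
```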

If a new profile needs to be created then here is an example of that command:

DECLARE
  my_sqlprofile_name VARCHAR2(30);
BEGIN
  my_sqlprofile_name := DBMS_SQLTUNE.ACCEPT_SQL_PROFILE(
        task_name => 'my_task_1',
        name      => 'my_sql_profile_1',
        category  => 'DEFAULT');
END;
/

Wednesday, February 15, 2012

Resolving Stale Statistics

I recently had an 11g database performing very poorly. It was fresh out of the box, so I went ahead and performed the usual routine of I/O tuning and session reporting. When I ran what looked like problematic SQL through the tuning advisor I got a surprise: it told me that my statistics were stale. I allow Oracle to gather statistics automatically every night within the default window.

First let's make sure that stats are setup to run:

select client_name, status from dba_autotask_client;
show parameter statistics_level

Now let's go ahead and check that the windows are enabled:

select w.window_name, w.repeat_interval, w.duration, w.enabled
from dba_autotask_window_clients c, dba_scheduler_windows w
where c.window_name = w.window_name
and c.optimizer_stats = 'ENABLED';

OK, so it does look like everything is set up to run. You can check for successful runs and investigate your scheduling more closely, but it doesn't look like that's the problem; for now we'll assume the stats gathering job is working as expected. Now let's see if one of our indexes or tables has gone stale:

set pages 100
set lines 100
col time format a20
col systime format a20

select to_char(SYSDATE, 'mm/dd/yy hh24:mi:ss') systime,
table_name, stale_stats, to_char(LAST_ANALYZED, 'mm/dd/yy hh24:mi:ss') time
from user_tab_statistics
where table_name not like ('BIN$%') AND STALE_STATS = 'YES';

select to_char(SYSDATE, 'mm/dd/yy hh24:mi:ss') systime,
index_name, stale_stats, to_char(LAST_ANALYZED, 'mm/dd/yy hh24:mi:ss') time
from user_ind_statistics
where table_name not like ('BIN$%') AND STALE_STATS = 'YES';

SYSTIME              INDEX_NAME                     STA TIME
-------------------- ------------------------------ --- --------------------
02/06/12 09:25:09    SHAGGYDAD_LINKS_IDX            YES 01/26/12 18:25:36
02/06/12 09:25:09    SHAGGYDAD_LINKS_2_IDX          YES 01/26/12 18:26:01

And there we have it! The offending indexes are listed. We still have a problem though: the indexes are massive. Luckily they are also partitioned, so instead of waiting who knows how long to gather statistics on an entire index, we can find out which partitions are having issues:

set lines 2000
select * from user_ind_partitions
where index_name in ('SHAGGYDAD_LINKS_2_IDX','SS7_TDR_BAF_CDR_LINKS_IDX');

select * from user_ind_subpartitions
where index_name in ('SHAGGYDAD_LINKS_2_IDX','SS7_TDR_BAF_CDR_LINKS_IDX');

select * from user_ind_statistics
where table_name not like ('BIN$%') AND STALE_STATS = 'YES';

The output here can be pretty confusing. From the output you will hopefully see a small list of partitions and/or subpartitions on the indexes that are stale. If that's the case then just collect stats on those partitions:

BEGIN
  dbms_stats.gather_table_stats(
    ownname          => 'SHAGGYDAD',
    tabname          => 'SHAGGYDAD_LINKS',
    partname         => 'Y2012_D036_S3',
    estimate_percent => 10,
    granularity      => 'ALL',
    cascade          => TRUE);
END;
/

Gather stats for every partition listed and you should see everything return to normal. If the list is too large, the object is unpartitioned, or the stats are completely stale, then just gather statistics on the entire object; it will take a while. Also keep in mind that the default STALE_PERCENT is 10%, meaning statistics aren't flagged stale until 10% of a table's rows have changed. For instance, if I want statistics considered stale at 2% change instead of 10%, I would issue this command:

exec dbms_stats.set_table_prefs(null,'TABLE_NAME','STALE_PERCENT',2 );

Wednesday, January 11, 2012


This information is garnered DIRECTLY from Oracle Documentation. I did not write this and I am not claiming credit for it.


What is a Checkpoint?
A Checkpoint is a database event which synchronizes the modified data blocks in memory with the datafiles on disk. A checkpoint has two purposes: (1) to establish data consistency, and (2) to enable faster database recovery. The checkpoint must ensure that all the modified buffers in the cache are really written to the corresponding datafiles to avoid the loss of data which may occur with a crash (instance or disk failure).
Checkpoints and Performance
Checkpoints present a tuning dilemma for the Database Administrator.  Frequent
checkpoints will enable faster recovery, but can cause performance
degradation. How then should the DBA address this? Depending on the number of datafiles in a database, a checkpoint can be a
highly resource intensive operation, since all datafile headers are frozen
during the checkpoint.  There is a performance trade-off regarding frequency
of checkpoints.  More frequent checkpoints enable faster database recovery
after a crash.  This is why some customer sites which have a very low
tolerance for unscheduled system downtime will often choose this option.
However, the performance degradation of frequent checkpoints may not justify
this philosophy in many cases. Let's assume the database is up and running 95%
of the time, and unavailable 5% of the time from infrequent instance crashes
or hardware failures requiring database recovery.  For most customer sites, it
makes more sense to tune for the 95% case rather than the rare 5% downtime.
This bulletin assumes that performance is your number one priority and so
recommendations are made accordingly. Therefore, your goal is to minimize the frequency
of checkpoints through tuning.

Tuning checkpoints involves four key initialization parameters.
Note: Log file switches will always override checkpoints caused by the following parameters.

    Since Oracle 9i the FAST_START_MTTR_TARGET parameter is the preferred method
    of tuning the incremental checkpoint target. FAST_START_MTTR_TARGET enables you
    to specify the number of seconds the database takes to perform crash recovery
    of a single instance. Based on internal statistics, incremental checkpointing
    automatically adjusts the checkpoint target to meet the requirement of
    FAST_START_MTTR_TARGET.
    V$INSTANCE_RECOVERY.ESTIMATED_MTTR shows the current estimated mean time to
    recover (MTTR) in seconds. This value is shown even if FAST_START_MTTR_TARGET
    is not specified.
    V$INSTANCE_RECOVERY.TARGET_MTTR shows the effective MTTR target in seconds
    enforced by the system.
    V$MTTR_TARGET_ADVICE shows the number of I/Os resulting from the current workload
    under the current MTTR setting, and the estimated number of I/Os that would result
    from the current workload under other MTTR settings. This view helps
    the user to assess the trade-off between runtime performance and setting
    FAST_START_MTTR_TARGET to achieve better recovery time.
    The LOG_CHECKPOINT_INTERVAL parameter specifies the maximum number of redo blocks
    by which the incremental checkpoint target should lag the current log tail. If
    FAST_START_MTTR_TARGET is specified, LOG_CHECKPOINT_INTERVAL should not
    be set, or should be set to 0.
    On most Unix systems the operating system block size is 512 bytes.
    This means that setting LOG_CHECKPOINT_INTERVAL to a value of 10,000 would
    mean the incremental checkpoint target should not lag the current log tail
    by more than 5,120,000 bytes (5MB). If the size of your redo log is 20MB, you are taking 4
    checkpoints for each log.
    LOG_CHECKPOINT_INTERVAL influences when a checkpoint occurs, which means
    careful attention should be given to the setting of this parameter, keeping it
    updated as the size of the redo log files is changed.  The checkpoint
    frequency is one of the factors which impacts the time required for the
    database to recover from an unexpected failure.  Longer intervals between
    checkpoints mean that if the system crashes, more time will be needed for the
    database to recover.  Shorter checkpoint intervals mean that the database will
    recover more quickly, at the expense of increased resource utilization during
    the checkpoint operation.
    This parameter also impacts the time required to complete a database recovery
    operation during the roll forward phase of recovery.  The actual recovery time
    is dependent upon this time, and other factors, such as the type of failure
    (instance or system crash, media failure, etc.), and the number of archived
    redo logs which need to be applied.
    The LOG_CHECKPOINT_TIMEOUT parameter specifies the maximum number of seconds
    the incremental checkpoint target should lag the current log tail.
    In other words, it specifies how long a dirty buffer in the buffer cache can
    remain dirty.

    Checkpoint frequency impacts the time required for the
    database to recover from an unexpected failure.  Longer intervals between
    checkpoints mean that more time will be required during database recovery.
    Oracle recommends using LOG_CHECKPOINT_INTERVAL to control the checkpoint
    interval rather than LOG_CHECKPOINT_TIMEOUT, which will initiate a checkpoint
    every "n" seconds, regardless of the transaction frequency.  This can cause
    unnecessary checkpoints in cases where transaction volumes vary.  Unnecessary
    checkpoints must be avoided whenever possible for optimal performance.
    It is a misconception that setting LOG_CHECKPOINT_TIMEOUT to a given
    value will initiate a log switch at that interval, enabling a recovery
    window used for a stand-by database configuration. Log switches cause a checkpoint, but a checkpoint does not cause a log switch. The only way to cause a log switch is manually with
    ALTER SYSTEM SWITCH LOGFILE or resizing the redo logs to cause
    more frequent switches.  This is controlled by operating system
    blocks, not a timed interval.
    Sizing of the online redo logs is critical for performance and recovery.
    See additional sections below on redo logs and checkpoints.
    LOG_CHECKPOINTS_TO_ALERT lets you log your checkpoints to the alert file.
    Doing so is useful for determining whether checkpoints are occurring at
    the desired frequency.
    Prior to Oracle9i this parameter was STATIC.
    Oracle generally advises this be set to TRUE as the overhead is
    negligible but the information in the alert log may be useful.

    See Note:76713.1 to have more detail on How those instance parameters can influence the checkpoint.
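To tie the parameter descriptions together, here is roughly how you would set an MTTR target and watch what the instance does with it (300 seconds is an illustrative value, not a recommendation):

```sql
ALTER SYSTEM SET FAST_START_MTTR_TARGET = 300 SCOPE=BOTH;

-- effective target vs. current estimate, both in seconds
SELECT TARGET_MTTR, ESTIMATED_MTTR FROM V$INSTANCE_RECOVERY;
```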

Redo logs and Checkpoint
A checkpoint occurs at every log switch.  If a previous checkpoint is already
in progress, the checkpoint forced by the log switch will override the current
checkpoint. This necessitates well-sized redo logs to avoid unnecessary checkpoints
as a result of frequent log switches.
The lag between the incremental checkpoint target and the log tail is
also limited by 90% of the smallest online log file size. This makes sure
that in most cases log switch would not need to wait for checkpoint.
Because of this, log file sizes should be configured large enough.
A good rule of thumb is to switch logs at most every twenty minutes.
Having your log files too small can increase checkpoint activity and reduce performance.
Oracle recommends the user to set all online log files to be the same size,
and have at least two log groups per thread.  The alert log is a valuable tool for
monitoring the rate at which log switches occur, and subsequently, checkpoints.
The following is an example of quick log switches from the alert log:
Fri May 16 17:15:43 1997
Thread 1 advanced to log sequence 1272
  Current log# 3 seq# 1272 mem# 0: /prod1/oradata/logs/redologs03.log
Thread 1 advanced to log sequence 1273
  Current log# 1 seq# 1273 mem# 0: /prod1/oradata/logs/redologs01.log
Fri May 16 17:17:25 1997
Thread 1 advanced to log sequence 1274
  Current log# 2 seq# 1274 mem# 0: /prod1/oradata/logs/redologs02.log
Thread 1 advanced to log sequence 1275
  Current log# 3 seq# 1275 mem# 0: /prod1/oradata/logs/redologs03.log
Fri May 16 17:20:51 1997
Thread 1 advanced to log sequence 1276
  Current log# 1 seq# 1276 mem# 0: /prod1/oradata/logs/redologs01.log
Here the redo logs are switching every two to three minutes; at that rate you will
see performance degradation. This indicates the redo logs are not sized large enough
to efficiently handle the transaction load.
See Note:1038851.6 for further detail on how to estimate an adequate size of the redolog files. See Note:1035935.6 Example of How To Resize the Online Redo Logfiles

Understanding Checkpoint Error Messages (Cannot allocate new log / Checkpoint not complete)
Sometimes you can see the following corresponding messages in your alert.log file:

  Thread 1 advanced to log sequence 248
    Current log# 2 seq# 248 mem# 0: /prod1/oradata/logs/redologs02.log
  Thread 1 cannot allocate new log, sequence 249
  Checkpoint not complete
This message indicates that Oracle wants to reuse a redo log file, but
the current checkpoint position is still in that log. In this case, Oracle must
wait until the checkpoint position passes that log. Because the
incremental checkpoint target never lags the current log tail by more than 90%
of the smallest log file size, this situation may be encountered if DBWR writes
too slowly, or if a log switch happens before the log is completely full,
or if log file sizes are too small.
When the database waits on checkpoints, redo generation is stopped until the
log switch is done.

Using Statspack to determine Checkpointing problems

Statspack snapshots can be taken every 15 minutes or so. These reports gather useful
information about the number of checkpoints started and completed, and the number
of database buffers written during checkpointing in that window of time. They also
contain statistics about redo activity. Gathering and comparing these snapshot
reports gives you a complete picture of checkpointing performance at different
periods of time. Another important thing to watch in a statspack report is the
following wait events; they can be a good indication of problems with redo log
throughput and checkpointing:

log file switch (checkpoint incomplete)
log file switch (archiving needed)
log file switch/archive
log file switch (clearing log file)
log file switch completion
log switch/archive
log file sync

If one or more of the above wait events shows up frequently
with considerable values, then you need to take action: add more
online redo log files, increase their sizes, and/or adjust the checkpointing parameters.
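Outside of a statspack window you can spot-check the same events straight from the instance:

```sql
-- cumulative waits since instance startup
SELECT event, total_waits, time_waited
FROM   v$system_event
WHERE  event LIKE 'log file switch%'
   OR  event = 'log file sync'
ORDER  BY time_waited DESC;
```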

Thursday, January 5, 2012

Optimizer Index Cost

After a year of being the sole DBA here I learned something incredibly useful that I wish I had understood long ago. Almost all of my posts that involve me fighting performance issues are resolved by this one little parameter:

OPTIMIZER_INDEX_COST_ADJ

Many years ago when I first started exploring parameters I was told not to mess with this one because it was already set up exactly as needed. Many of our parameters are like that. Set them once and deploy them like that 100x without any problems. This little bastard though...

What does this do? It's a parameter that controls the optimizer's preference of Index vs Full Table Scan. It has a range of 1-10000 and defaults to 100. Apparently setting it to 100 causes the optimizer to think that both explain plans would have equal fetch times... I'm not sure which engineer decided that should be the default but I am sure that he was hitting himself in the head with a hammer when he considered it a good idea.

We have 1-10 indexes for each data set that we import and nearly an infinite number of data sets. After setting this parameter to 50 my optimizer started using the indexes exclusively and now my longest running SQL isn't even considered problematic anymore. Yipee!
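For reference, flipping it is a one-liner (50 is what worked for us; tune it against your own workload):

```sql
ALTER SYSTEM SET OPTIMIZER_INDEX_COST_ADJ = 50 SCOPE=BOTH;
```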

Apparently 1-10000 is the 'cost' differential. When the optimizer compares the explain plans of a full table scan versus an index scan, each has a cost represented as a numeric value in that range, and this parameter is the baseline cost for index access. The previous DBA here used to call this 'Oracle Dollars'. Actually he was Asian so it was more like 'Oracle Doroos'. In any case I finally understand what he meant, as every SQL statement that gets run through the optimizer has this 'cost' associated with it.

Playing with the optimizer is not something I'm very familiar with but that's my understanding as of right now :)

It's the Oracle Cost Based Optimizer that applies to any SQL running in the database. The hand tuning would be to rewrite the SQL to duplicate the most effective explain plan. I've never heard of disabling the CBO but I'm sure it can be done.

Wednesday, October 5, 2011

Database Upgrade

When major patch sets or database releases are put into production we completely ignore them. The only time we consider patches or new releases important is when there is a direct need for them. For instance I have a system with a corrupt tablespace. The fix for this specific issue was to upgrade the database to the next major revision. No big deal.

We have to perform silent installations where I work because we only have SSH access to the systems. Please note that we're running on Linux (always). So here are the steps involved with a silent installation and the cause and fix for the following errors:
ORA-00704: bootstrap process failure
ORA-39700: database must be opened with UPGRADE option

Set up your environment ORACLE_SID, ORACLE_HOME, ORACLE_BASE, the usual shit. You know what, if you don't have this part figured out by now then don't bother with the upgrade. Next!

Next you unzip your patch file and jump into the Disk1/Install/response directory. You should see one (or more) response files. Pick the one that is pertinent and edit it. This part should be pretty straightforward. Note that if you have an existing name for your Oracle Home then they must match. If you don't know what it is, the installer will tell you when it's run (it errors out with: Oracle home name XXXX already exists at the Oracle Home you specified). Why they did not automate this I have no clue. Once you're satisfied with all of the entries, move on.

Now we need to shut down EVERYTHING that's related to that home: dbconsole, CRS, ASM, listener and RDBMS. Details:

$ORACLE_HOME/bin/emctl stop dbconsole

sqlplus "/ as sysdba"
shutdown immediate

sqlplus "/ as sysASM"
shutdown immediate

$ORACLE_HOME/bin/crsctl stop crs (as ROOT)



Now everything is shut down and you have your patch unzipped and the response file ready. Change directory to the runInstaller script and kick it off silently:

./runInstaller -silent -responseFile /u01/Disk1/response/patchset.rsp

Once that completes you ARE NOT FINISHED. I made the mistake of thinking that I was. There are still two very important steps. If you attempt to start your database you get bootstrap (ORA-00704) and forced disconnection errors. The traces generated say that you have to open the database in UPGRADE mode (ORA-39700). What they don't mention is that you skipped a step.

Start everything up (dbconsole, CRS, ASM, Listener and RDBMS) and then:

sqlplus "/ as sysdba"
startup UPGRADE
shutdown immediate
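Depending on the patch set, the readme usually also wants the catalog upgrade script run while the database is open in UPGRADE mode, so check your patch set's post-install notes. The fuller sequence would look something like:

```sql
-- from sqlplus "/ as sysdba"
startup upgrade
@?/rdbms/admin/catupgrd.sql   -- catalog upgrade, per the patch set readme
shutdown immediate
startup
@?/rdbms/admin/utlrp.sql      -- recompile invalidated objects
```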

And voila! You are finished. Well done you crazy DBA you! I only say that because reading how to upgrade a database from me is terrifying. Go flog yourself.

Monday, October 3, 2011

Oracle Enterprise Manager

I deploy my databases without Enterprise Manager installed. To create a cookie cutter quick and easy tarball deployment with a configured EM/dbconsole would take me far longer than it's worth. So instead I've streamlined the steps involved with uninstalling and reinstalling EM/dbconsole. I have only tested the following steps on 10/11g.

Let's do the preliminary repository drop first:
sqlplus "/ as sysdba"
alter user sys identified by oracle;
alter user system identified by oracle;
drop user sysman cascade;
drop role MGMT_USER;
drop user MGMT_VIEW cascade;
drop public synonym MGMT_TARGET_BLACKOUTS;
drop public synonym SETEMVIEWUSERCONTEXT;

Now we go about creating the password file:
sqlplus "/ as sysdba"
alter system set remote_login_passwordfile=exclusive scope=spfile;
!orapwd file=$ORACLE_HOME/dbs/orapw$ORACLE_SID password=oracle entries=5 

Okay, now to do some more dropping:
$ORACLE_HOME/bin/emca -deconfig dbcontrol db
$ORACLE_HOME/bin/emca -deconfig dbcontrol db -repos drop
rm -r $ORACLE_HOME/oc4j/j2ee/OC4J_DBConsole_$HOSTNAME\_nams

And finally time to create:
$ORACLE_HOME/bin/emca -config dbcontrol db -repos create 

Now let's start it up!
$ORACLE_HOME/bin/emctl start dbconsole
check /etc/ for outgoing server.

Thursday, September 29, 2011

Linux Kernel Parameters


SHMMAX - kernel parameter controlling
maximum size of one shared memory segment
SHMMNI - kernel parameter controlling
maximum number of shared memory segments
in the system
SHMSEG - kernel parameter controlling
maximum number of shared memory segments
a process can attach
SEMMNS - kernel parameter controlling
maximum number of semaphores in the system
SEMMNI - kernel parameter controlling
maximum number of semaphore sets.
Semaphores in Unix are allocated in sets of
1 to SEMMSL.
SEMMSL - kernel parameter controlling
maximum number of semaphores in a semaphore set.
SHMLBA - kernel parameter controlling
alignment of shared memory segments; all
segments must be attached at multiples of
this value. Typically, non-tunable.
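On a modern Linux box most of these map onto sysctl keys. A hypothetical /etc/sysctl.conf fragment for an Oracle host; the values are examples only, size them to your RAM and instance count:

```
kernel.shmmax = 4294967296          # largest single shared memory segment, in bytes
kernel.shmmni = 4096                # max shared memory segments system-wide
kernel.sem = 250 32000 100 128      # semmsl semmns semopm semmni
```

(SHMSEG and SHMLBA have no Linux sysctl equivalent.)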

Tuesday, August 9, 2011

Oracle RMAN Tuning

So I'm still in combat with the RMAN backups. I've increased performance with several methods mentioned in older posts. Now it's time to cover channels and maybe a little about backup sets. First off, Oracle defines an RMAN channel as a direct connection between the database and the device the backup is being written to. Each channel represents a level of parallelism and shows up as a unique process on the operating system.

The default number of channels is 1, meaning a single connection between the database and the disks the backup is writing to. If you're on a single hard drive, or writing to one tape drive, that is probably fine. On large systems with 16 processors, 16GB of RAM and 32 physical disks we can obviously bump up the number of channels. Remember this should only be done if you are NOT CPU bound by backups, because spawning a server process doing intense work per channel can make your system load skyrocket if you aren't careful.

I am I/O bound and I'm sure that's the case for many of you out there reading this. It seems like I/O on very large systems is the main offender of most performance issues. There are clever ways around it but none of them are simple or even obvious. I will be doing a great deal of channel testing over the next couple of days.

For those of you truly interested there is a lot of math involved in getting everything correct. Here is a bit of documentation: RMAN Documentation

I set my channels to 10 by changing the RMAN disk parallelism; the command was along these lines:

RMAN> CONFIGURE DEVICE TYPE DISK PARALLELISM 10 BACKUP TYPE TO BACKUPSET;

old RMAN configuration parameters:
new RMAN configuration parameters:
new RMAN configuration parameters are successfully stored

In my case I have 477 datafiles and a parallelism of 10. 477/10 rounded up is 48 files per backup set; RMAN compares 48 to its 64-file cap and chooses 48 as the filesperset value. 477/48 = 10 (rounded up), so I'll have 10 backup sets, each containing up to 48 files and hopefully producing a single piece. You can also force the basic and advanced compression algorithms by changing these parameters, so be careful. I'll post more as I learn it.

So now it looks a little something like this:

RMAN> show all;

using target database control file instead of recovery catalog
RMAN configuration parameters for database with db_unique_name #### are:
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/111/dbs/snapcf_nams.f'; # default

NOTE: This caused my full level 0 backup to take 2-3 hours instead of 12.

Tuesday, August 2, 2011

Block Change Tracking (RMAN)

As we all know I combat RMAN regularly. It's a vicious cycle of never-ending hate and confusing parameters. Well, I've discovered a way to speed up the backups, specifically incremental RMAN backups.

First off you need to know what your I/O looks like. As root, use iostat to see your reads versus your writes and how high the numbers are for your backup partitions. One thing you should notice is that your reads are drastically higher than your writes. This is due to Oracle scanning every block of the database to detect changes and then write them to the backup. You can completely eliminate this preliminary read by implementing a block change tracking file. It's very easy and quick.

Now we're at the point of altering the database which will force it to create a binary change file and update block changes to it:

SQL> alter database enable block change tracking using file '/u01/oradata/nams/change_tracking.f';
SQL> alter database enable block change tracking using file '/u01/oradata/nams/change_tracking.f' reuse;

Alternately, if you have the DB_CREATE_FILE_DEST parameter set for Oracle Managed Files, then you just issue an enable command:

SQL> alter database enable block change tracking;

In this case Oracle will create the change file in whatever directory you specified for Oracle Managed Files (OMF) to read/write to. If the DB_CREATE_FILE_DEST parameter is not set you will receive an error saying that you must specify the file name.

Now that you have enabled block change tracking it's time to make sure that it is working as intended. Information about the block change tracking file is kept in a view called: v$block_change_tracking. This view always contains a single record. If the status is ENABLED then the FILENAME contains the name of the file being used for tracking and BYTES contains the size of the file (in bytes obviously).

SQL> desc v$block_change_tracking

 Name                 Type
 ------------------   ---------------------
 STATUS               VARCHAR2(10)
 FILENAME             VARCHAR2(513)
 BYTES                NUMBER

SQL> select status,filename from v$block_change_tracking;

STATUS     FILENAME
---------- --------------------------------------------
ENABLED    /u01/oradata/nams/change_tracking.f

The v$backup_datafile view contains a column named USED_CHANGE_TRACKING. A value of YES in this column means that RMAN used block change tracking to speed up incremental backups. This will help identify how effective block change tracking was in improving backup duration and I/O.
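As a rough sizing check, Oracle's documentation puts the tracking file at about 1/30,000 of the total database size (with a 10 MB minimum); a quick estimate in shell, where the 300 GB figure is just an example database size:

```shell
#!/bin/sh
# Rough block change tracking file size estimate: ~1/30,000 of database size.
# DB_BYTES below is a hypothetical 300 GB database.
DB_BYTES=$((300 * 1024 * 1024 * 1024))
BCT_BYTES=$((DB_BYTES / 30000))
BCT_MB=$((BCT_BYTES / 1024 / 1024))
echo "estimated tracking file: ${BCT_MB} MB"   # prints: estimated tracking file: 10 MB
```

In other words, even on a fairly large database the file is tiny, so the disk cost of enabling tracking is negligible compared to the read savings.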

Monday, August 1, 2011

Oracle Support

As anyone that has ever had the pleasure of dealing with Oracle Support can attest, their services leave a lot to be desired. First off, it's impossible to talk to someone right away; Service Requests must be opened online. On top of that, if you select the incorrect department or have a question that oversteps their knowledge by the merest fraction, then a new SR is created in a different department and you have to start all over. Even high severity Service Requests will sometimes go over a week without a response. Well, I'm fed up with it and I've discovered some tips and tricks that help navigate the maze.

  • First off, and one of the most important factors in quick problem resolution: use OCM (Oracle Configuration Manager) to collect your hardware/software information. This gathers all sorts of info about the environment, what is installed and how it is used. Tickets submitted with an OCM configuration are resolved much faster and are given priority handling.

  • You also need to call when you aren't happy with response times. A lot of the time (75% or more) you are assigned to an engineer that is on a different continent. That's all fine and dandy until you need more than one update every 24 hours, especially when most updates are asking you to perform redundant tasks or to gather information. Just call and ask for an engineer in your time zone.

  • If response time does not improve ask for manager oversight. This causes some lazy bastard to get a swift kick in the ass from his manager. It doesn't always work and if it doesn't then you escalate to Severity 1 if it isn't already. Manager oversight also means that the manager SHOULD oversee the progress on the ticket rather than just ask for an update from his engineer.

  • If the SR you're working with isn't exactly pertinent to the group handling it then you will get passed off. This happens A LOT if you have any questions at all or if you don't select departments and descriptions perfectly. When this happens make sure to ask the engineer to perform a WARM HANDOFF of the case and to link the 2 SRs to each other. This will save the new engineer all of the trouble of asking you 100 questions that you have already answered.

Besides that, everything else should be self-explanatory. Sometimes getting angry helps while most of the time it's pointless. Remember, if you're truly pissed then call your sales manager and let them know what's going on. You can also keep escalating the SR until you reach an Oracle Master/Professional that can usually make a system dance within a very short window of time. I've only had to do this once and the support I received from the Oracle Master was nothing short of amazing. So don't get frustrated when it seems like progress has stalled, instead get clever and start working their own system.

Wednesday, July 27, 2011

Moving and/or Renaming Datafiles

I have abandoned ASM for all systems with the exception of RAC environments. Because of this I occasionally have a system analyst name datafiles incorrectly when adding datafiles to a tablespace, or, even more urgent, an operating system partition that is nearly full.

You cannot simply move the datafiles at the OS level. Instead you have to move/rename them at both the OS and database levels. This is very simple if you're not using ASM. Just follow the 5 easy steps below:

1. The first thing you do is shut down the database:

sqlplus "/ as sysdba"
SQL> shutdown immediate

2. Then you move the files at the OS level:
# mv /u01/oradata/$SID/datafilename.dbf /u02/oradata/$SID/datafilename.dbf

3. Now we mount the database:
sqlplus "/ as sysdba"
startup mount

4. Now we update the data dictionary. Make sure that you use full paths:
alter database rename file '/u01/oradata/$SID/datafilename.dbf' to '/u02/oradata/$SID/datafilename.dbf';

5. Finally we open the database:
sqlplus "/ as sysdba"
alter database open;
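When a lot of datafiles move at once, step 4 gets tedious; a loop can generate the rename statements for you. The SID and file names below are just example values:

```shell
#!/bin/sh
# Generate ALTER DATABASE RENAME FILE statements for every datafile
# moved from /u01 to /u02. SID and the file list are example values.
SID=nams
STMTS=$(for F in users01.dbf users02.dbf system01.dbf; do
    echo "alter database rename file '/u01/oradata/$SID/$F' to '/u02/oradata/$SID/$F';"
done)
echo "$STMTS"
```

Spool that into a .sql file and run it while the database is mounted, then open as in step 5.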

That's all there is to it. ASM can do similar things, but moves like this are rarely needed there thanks to ASM's flexibility and power. Too bad that comes at the cost of complexity. I prefer to have user managed storage for several reasons, the most important of which is quick and easy database deployment: simply create a tar archive of your database and you can re-deploy it anywhere very easily.

Tuesday, July 26, 2011

Gaming Keyboard G510 Review

The Logitech Gaming Keyboard G510 is fun to play with. Like the Cooler Master mouse that I reviewed earlier in the year, it has a very sleek and 'cool' design that is eye-catching. Standard features include:
  • Dynamic color adjustment of key backlighting
  • A slew of extra macro keys
  • A volume wheel and controls
  • Programmable (software included)
  • LCD screen
  • Windows key lock
  • Wrist rest
  • USB
  • Green

It's important to keep in mind that to use this keyboard to the full extent you have to have a bit of knowledge with macros and modifier keys (alt+key, ctrl+key, ctrl+alt+key, etc). The software is confusing once you start to dig into it. The most popular features such as backlighting and key assignments are straightforward but anything beyond that requires you to spend some time with it. It also requires updates to the software after you download and install it. Shouldn't we just download the latest release in the first place? Logitech...

My favorite feature by far is the LCD screen. It is very usable rather than a gimmick. There are custom modules for all of the popular video games and several downloadable modules for use with it. Let's not forget to mention you can write your own. After using the LCD screen on here I feel like I'm missing something if it's not there.

The macro recorder seemed a bit buggy and difficult to get just right. At one point I thought I had it doing exactly what I wanted but when I tried to execute the macro it simply did not have the intended effect. So instead of turning all of my shiny new keys into complex macros that may or may not function exactly how I wanted I just set them all up with their modifiers. The 18 keys on the side are each set with complex modifiers that allow me to bind them in-game. In World of Warcraft it worked great, in any FPS it was terrible. Most fast paced video games don't expect you to be using a lot of modifiers that may not be recognized/saveable.

My biggest complaint by far is the volume control wheel at the top left. The buttons and the wheel all function flawlessly. The actual problem comes into play when you notice that your volume keeps resetting to default. I play with my Windows volume maxed out (100/100) and then I adjust my actual speaker volume dial as needed. This ensures that I'm not missing sounds/noises/voiceovers in my video games because of Windows. It is rather frustrating to notice halfway through a Call of Duty match that you haven't heard a call out in a while.

The Windows key lock was completely worthless for me as I use the Windows key a LOT (try using Windows Key + TAB if you run Vista or 7). Both in and out of game I find myself using the shortcuts enough to completely ignore this feature.
One of the nice things was profile switching on the fly using the M1, M2 and M3 buttons above the macro keys. I would set up colors, macros and custom functions for each profile: M1=MMO, M2=FPS, M3=Standard. This was actually a very welcome feature as I use the keyboard differently for each game type and I assume others do as well.

If you see this go on sale and you want something that looks really nice then by all means grab it. I would strongly recommend purchasing it for specific features rather than just because it's a gaming keyboard. If you don't need macro keys and you couldn't care less about an LCD screen then go get a different keyboard for your "Ohhhhh"s and "Ahhhhh"s.

I'm considering nominating this for the turd in a box award.

EDIT: I would like to add that I returned this keyboard under warranty and Logitech's return procedures are, like most, horrendous. I think it took something like 2-3 weeks to get the replacement. They also temporarily charge a credit card until you return the old one. If I remember correctly, shipping and handling had to be paid too.

Monday, July 25, 2011


Two posts in one day, aren't you lucky! I've got something for you people hitting ORA-4030 errors to try. We all know 4030 is caused by 100 different things and it takes some time, as well as probably trial and error, to fully resolve them. It can be anything from SGA sizing to kernel parameters to user limits, and today I introduce inactive sessions to the list of known causes.

Let's see how much memory INACTIVE sessions are taking:
SQL> select sum(PGA_ALLOC_MEM) from v$process p, v$session s where p.addr = s.paddr and s.status = 'INACTIVE';


These little bastards can really cause a headache after a while. Here's a little information about why it's important that I find and destroy these lazy turds. Each session in the database takes up memory. This is usually no big deal but if you aren't closing out sessions it will eventually become one. Most sessions will do a clean close and there's no issue. The problem for us comes into play when our end users run a report and either exit out of the application or it just simply never finishes. I've always been told that bouncing the report server will kill these spinning reports in the database: WRONG!

First off we have to find them by querying the v$session view:

SQL> set linesize 150

SQL> set pagesize 10000

SQL> select SID, SERIAL#, USERNAME, PROCESS, LOGON_TIME from v$session where status = 'INACTIVE' order by LOGON_TIME;

       SID    SERIAL# USERNAME                       PROCESS      LOGON_TIME
---------- ---------- ------------------------------ ------------ ----------
       867      40145 MRNAMS                         18617        27-JUN-11
       617      36394 MRNAMS                         29566        25-JUL-11
       714      50687 RMAXWELL                       20605        25-JUL-11
       727      59474 MRNAMS                         5540         25-JUL-11
       922       4096 MRNAMS                         5542         25-JUL-11
      1046      14840 NOC                            4116         25-JUL-11
       938      58646 MTOMA                          11678        25-JUL-11

MRNAMS is the name of our administrative user. Most of the MRNAMS sessions that are inactive are legitimate. Then we see our end user accounts (RMAXWELL, NOC, MTOMA). These accounts are running reports from our application. While none of these are having problems we will sometimes see one of these sessions stay inactive indefinitely.

We discovered this by correlating these sessions to the OS Process ID. A simple grep of the process shows us this:
ps -ef | grep 11678
oraoas   11678  2725  0 06:39 ?        00:00:00 f60webm webfile=5,33,PID2725

The process f60webm is our forms server. Either MTOMA is running a report or he is logged into the application and has been idle. To resolve this we simply kill the long running sessions manually and then enact a profile resource limit for inactive time on the owning profile. Let's just pretend that MTOMA is a long running session:

SQL> alter system kill session '938,58646' immediate;

Then, to keep future sessions from piling up, set an idle limit on the profile that owns those accounts (the profile name below is a placeholder; IDLE_TIME is in minutes, so 36 hours = 2160) and make sure resource limits are actually enforced:

SQL> alter profile end_user_profile limit idle_time 2160;
SQL> alter system set resource_limit = true;
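Killing sessions one at a time gets old, so a small loop can generate the statements from the sid/serial pairs you pulled out of v$session. The pairs below come from the example query output above:

```shell
#!/bin/sh
# Generate ALTER SYSTEM KILL SESSION statements from "sid serial#" pairs
# taken from the v$session query above.
KILLS=$(printf '%s\n' "938 58646" "714 50687" | while read SID SERIAL; do
    echo "alter system kill session '$SID,$SERIAL' immediate;"
done)
echo "$KILLS"
```

Spool the output into a .sql file and run it as SYSDBA once you've double-checked the list.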
There ya go, now you have identified all of the inactive sessions, killed the pertinent ones and enabled a limit within the profile that will kill all FUTURE inactive sessions (that's why we had to manually kill the existing ones, right?) once they have been running for 36 hours.

Weblogic and Portal, Forms & Reports

As a Senior Oracle DBA you're pretty much tasked with anything that involves the word Oracle or that deals with the databases. As an Oracle house, the company I work for has been using Oracle Developer 6i (no, you didn't misread that) for the last 10 years (you didn't misread that either).

So now it's my job to upgrade 50+ systems from 6i to 11g. Suffice it to say this is not 'just another upgrade' or patch. This is a genuine bitch to get working and I'm determined to make it work. The biggest problem I'm facing right now is that Oracle no longer uses Apache as their native HTTP server. Instead they use WebLogic (which is also a prerequisite for the new Forms and Reports). Now on top of that, I can only do silent installations with the Universal Installer using a response file because I do not have desktop access.

UPDATE: Wow, you cannot imagine how difficult this is. WebLogic is installing the 32-bit JDK/JRE no matter how I install it. I got with Oracle Support and tried installing it using a 64-bit Java '.jar' file and it still installs the 32-bit versions. Usually this wouldn't be a big deal, but Forms and Reports forces you to have matching bit lengths between WebLogic, Forms & Reports, Java and the operating system (which means hardware as well, to an extent). Le sigh.

Let the fun begin!

Wednesday, July 20, 2011

JDK1.6/OCM Installation Script

I completely forgot about this blog. Eat that, non-existent readers! Anyways, my Oracle knowledge has grown exponentially since my last update. I'm now much more knowledgeable about kernel parameters (fighting those pesky 4030 bugs) as well as Oracle's Configuration Manager (OCM). I actually wrote a script that installs Java and OCM on the systems. I then called it a patch and asked all of the analysts to apply it to the 50+ databases out there. I don't know about you but I thought it was a damn clever way to hide my laziness.

File 1 (This installs Java Development Kit 1.6.blah.blah then passes the rest of the work off to file2)


# Verify that this script is being executed as root.
if [ -z "$LOGNAME" ]; then
   echo "The environmental variable LOGNAME is undefined."
   exit 1
fi

if [ "$LOGNAME" != "root" ]; then
   echo "This install script must be executed as root!"
   exit 1
fi

# Create Java directory if it does not exist
if [ ! -d "/usr/java" ]; then
        mkdir /usr/java
fi

# Install Java BIN
cp jdk-6u21-linux-i586.bin /usr/java/jdk-6u21-linux-i586.bin
cd /usr/java
chmod +x jdk-6u21-linux-i586.bin
./jdk-6u21-linux-i586.bin
cd -

if [ -e /usr/bin/java ]; then
        /bin/rm /usr/bin/java
fi

ln -s /usr/java/jdk1.6.0_21/bin/java /usr/bin

# Change ORACLE profile
sed -i 's/JAVA_HOME=\/usr\/local\/java/JAVA_HOME=\/usr\/java\/jdk1.6.0_21/g' /home/oracle/.bash_profile 2>/dev/null
sed -i 's/JAVA_HOME=\/usr\/java/JAVA_HOME=\/usr\/java\/jdk1.6.0_21/g' /home/oracle/.profile 2>/dev/null

# Copy and execute second installation file
cp *.zip /tmp
cp install2 /tmp
cp v4.0.66.sql /tmp
cp passchange.sql /tmp
chown oracle:dba /tmp/*.zip
chown oracle:dba /tmp/install2
su - oracle -c "/tmp/install2"

sqlplus standarduser/hispassword@nams-cs @v4.0.66.sql
#  Provide patch installation feedback and exit.
echo "Patch v4.0.66 has been installed."
echo "Please check all patch logs for errors."
exit 0                           
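The sed substitutions in the middle of File 1 rewrite JAVA_HOME in place, so it's worth sanity-checking the escaped pattern against a throwaway file before touching the real profiles:

```shell
#!/bin/sh
# Dry-run the JAVA_HOME substitution from File 1 against a temp file
# before letting it loose on the real .bash_profile/.profile.
TMP=$(mktemp)
echo 'export JAVA_HOME=/usr/local/java' > "$TMP"
sed -i 's/JAVA_HOME=\/usr\/local\/java/JAVA_HOME=\/usr\/java\/jdk1.6.0_21/g' "$TMP"
RESULT=$(cat "$TMP")
echo "$RESULT"   # prints: export JAVA_HOME=/usr/java/jdk1.6.0_21
rm -f "$TMP"
```

Using a different delimiter, such as s|old|new|g, would avoid all of those escaped slashes in the first place.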

FILE 2 (This actually installs, collects and uploads OCM and its reports)

env | grep SHELL | grep ksh > /dev/null 2>&1
if [ "$?" -eq "0" ]; then
        . /home/oracle/.profile
else
        . /home/oracle/.bash_profile
fi

# Change SYS password to ORACLE
sqlplus "/ as sysdba" @/tmp/passchange.sql

# Unzip OCM installation files
unzip /tmp/

# Install OCM
cd $ORACLE_HOME/ccr/bin
./setupCCR -s ######## US

# Collect and send OCM config
cd $ORACLE_HOME/ccr/admin/scripts
./collectconfig -s $ORACLE_SID -r SYS

Thursday, April 28, 2011