ChannelDB2

On Log Sizing and Placement with DB2 for z/OS

Today, on DB2-L, someone once again asked the classic questions: how many active logs should you have, and how big should they be?  I've written ad-hoc responses to this on DB2-L several times, and I think it's time I shared them in permanent form with the wider community.

 

-----Original Message-----
From: AAAAAAA
Sent: Tuesday, August 16, 2011 4:02 PM
To: 'DB2-L@lists.idug.org'
Subject: [DB2-L] - Log Archival

 

We recently experienced a problem with our tape library and came very close to running out of active log space.  Since we are on V9 we could not add active log space dynamically.  This has sparked a discussion about increasing the size and number of active logs versus archiving logs to disk rather than tape.  Having large active logs (24-48 hours of capacity) seems like an efficient approach.  Archiving to disk avoids tape altogether, but at a higher cost.  What are all of you doing?

 

**************************************************************************************************************

 

AAAAAAA,

 

Size your active logs to hold at least several hours' worth of log data, configure five or six of them, and set your monitor to scream at you when two active logs are waiting for archive.  That'll clue you in that either (1) some kind of extremely heavy updating is going on or (2) the archiving process is busted, and it'll give you some time to respond and cancel threads or stop DB2 if needed.  If you're a really high-update shop, with millions of log records per hour, you could even go to a dozen active logs.  Good volume estimates are needed for this, along with careful, restricted use of SQL operations that can generate heavy logging.  (ALTER TABLE ... ALTER PARTITION ... ROTATE FIRST TO LAST is such an operation, for those who use time-partitioned tables.)
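On V9, adding active logs means stopping DB2 and running the DSNJU003 change log inventory utility against the BSDS.  Here's a rough sketch; all dataset names, the load library, and the sizing are illustrative placeholders for your shop's own, and the new log datasets must be defined first with IDCAMS, sized like your existing ones.

```
//ADDLOG   JOB (ACCT),'ADD ACTIVE LOG',CLASS=A,MSGCLASS=X
//* Register two new active log datasets (one per log copy) in the
//* BSDS.  DB2 must be STOPPED before running DSNJU003.
//* Dataset names below are examples -- substitute your own.
//STEP1    EXEC PGM=DSNJU003
//STEPLIB  DD DISP=SHR,DSN=DSN910.SDSNLOAD
//SYSUT1   DD DISP=OLD,DSN=DSNCAT.BSDS01
//SYSUT2   DD DISP=OLD,DSN=DSNCAT.BSDS02
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  NEWLOG DSNAME=DSNCAT.LOGCOPY1.DS05,COPY1
  NEWLOG DSNAME=DSNCAT.LOGCOPY2.DS05,COPY2
/*
```

(V10 and later can add active logs dynamically with -SET LOG NEWLOG, which is one good reason to get there.)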

 

Don’t, please don’t, write your archive logs directly to physical tape.  (VTL or DASD with DFSMS would be good choices.)  Your system only needs to crash _once_, and then at restart start rolling back archived units of recovery on _one_ physical tape, reading backwards, to put everyone in *very hot water* with IT management.  The problem with physical tape units like the old 3480 is that they *don’t have backward-read ability*; they mimic it by advancing to the last record, then rewinding and advancing to the next-to-last record, and so forth.

 

I’ve been there.  It’s ugly: the system won’t release key application resources (whatever was in the monster unit of work you’re rolling back in this scenario) until it’s done rolling them back, and your processing can’t start up against those resources, so _all your customers_ of the rolling-back application are now waiting for your physical tape to do backward reads.  THIS CAN TAKE HOURS.

 

The crash-recovery workaround with archive on tape, if you don’t have the option of archiving to disk, is to copy the archive logs you’ll need from tape to disk with IEBGENER or DFSORT, preserving blocksizes and other attributes.  (You can estimate which ones you need by comparing how long the long-running unit of recovery had been running, if that’s your problem, against the list of archive timestamps and tapes in the BSDS.)  _THEN_ bring up your DB2 system and it’ll roll back against disk files, saving your team hours of ass-chewing by senior executives.  If you don’t have a positive time for when your trouble started, be conservative and move as many archives as you possibly can to DASD, working in reverse chronological order, or use some "we know it wasn’t a problem THIS long ago" estimation.  Again, this is a _workaround_; it’s much better to be able to HRECALL the datasets or bring a VTL dataset back into the pool.
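The tape-to-disk copy step above might look something like this.  This is a sketch, not a recipe: the dataset name, volser, and space figures are made up, and the real values come from a DSNJU004 (print log map) listing of your BSDS.

```
//CPYARC   JOB (ACCT),'ARCHIVE TO DASD',CLASS=A,MSGCLASS=X
//* Copy one archive log from tape to DASD before restarting DB2.
//* DCB=*.SYSUT1 preserves the input blocksize and record format.
//* If the tape copy is cataloged, uncatalog it first so the DASD
//* copy can be cataloged under the original name DB2 will allocate.
//COPY1    EXEC PGM=IEBGENER
//SYSPRINT DD SYSOUT=*
//SYSUT1   DD DSN=DSNCAT.ARCHLOG1.A0001234,DISP=OLD,
//            UNIT=TAPE,VOL=SER=T00123,LABEL=(1,SL)
//SYSUT2   DD DSN=DSNCAT.ARCHLOG1.A0001234,
//            DISP=(NEW,CATLG,DELETE),UNIT=SYSDA,
//            SPACE=(CYL,(700,100),RLSE),
//            DCB=*.SYSUT1
//SYSIN    DD DUMMY
/*
```

Repeat per archive log you expect restart to need, working backwards from the most recent.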

 

The best scenario, with DFSMS or VTL, is that you recall your tape datasets to the active volumes (make sure the SMS pool has plenty of free space to do this!!!), and a backward read of a dataset can then be done in minutes without any emergency copies.  This can mean the difference between losing your job (if the recovery takes hours) and keeping it, when the rollback is high-visibility and high-impact.
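The HRECALL route can be driven from a batch TSO step so you can recall everything ahead of restart.  Again a sketch; the dataset names are examples, and WAIT makes each recall synchronous so the job doesn't end before the data is back on DASD.

```
//RECALL   JOB (ACCT),'HSM RECALL',CLASS=A,MSGCLASS=X
//* Recall migrated archive log datasets so DB2's backward reads
//* at restart hit DASD instead of tape.  Names are illustrative.
//TSO      EXEC PGM=IKJEFT01
//SYSTSPRT DD SYSOUT=*
//SYSTSIN  DD *
  HRECALL 'DSNCAT.ARCHLOG1.A0001233' WAIT
  HRECALL 'DSNCAT.ARCHLOG1.A0001234' WAIT
/*
```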

 

Hope this helps.

 

--Phil Sevetson

Tags: Archive, Backward-Recovery, DB2, DB2-z/OS, Logging, Recovery
