ChannelDB2

Log Sizing and Best Practices for Logging and Log Monitoring

Folks,

 

Yesterday's discussion on DB2-L about logging raised some valid concerns and led me to address logging in somewhat greater detail.  My thread of the conversation is shared below.  If you find it to be of value, please comment and link to it for your colleagues.

 

--Phil Sevetson

 

**************************************************************************************************************** 

From: XXXXXXX

Sent: Wednesday, August 17, 2011 2:45 PM

To: db2-l@lists.idug.org

Subject: [DB2-L] - RE: Log Archival Nightmare Scenario

 

You guys have me worrying about my own situation now.  We produce about 1300 tracks of log data each minute.  We have 6 active logs defined and they switch about every 3 minutes, so that means we only have about 18 minutes worth on active logs.  We archive to DASD and let HSM migrate those twice a day to virtual tape.  Our system is not really that busy, so you guys with the busy systems and hours worth of data on active logs must have huge active log datasets, or huge numbers of them.  Is that the case?

 

Thanks,

 

XXXXXXX

 

**************************************************************************************************************** 

 

XXXXXXX,

 

You’re describing active logs of about 250 cylinders each.  I’m used to seeing logs more like 500-2000 cylinders, with most of them at the top end of that range.  A 3-minute log switch also means you’re kicking out a really large number of archive logs.  That’s a pretty busy system in any case: you’re writing about 2,500 cylinders every half hour, or roughly 120,000 cylinders per day (unless there are quiet periods and you were describing peak load, of course).
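For anyone who wants to sanity-check these numbers, here’s a quick back-of-the-envelope calculation.  This is a sketch only: it assumes 3390 DASD geometry (15 tracks per cylinder), and the function name is mine, not anything DB2 provides.

```python
# Back-of-the-envelope active log sizing check.
# Assumes 3390 geometry: 15 tracks per cylinder.

TRACKS_PER_CYL = 15

def log_coverage_minutes(num_logs, log_size_cyl, write_rate_tracks_per_min):
    """Minutes of logging the active log ring can hold before wrapping."""
    total_tracks = num_logs * log_size_cyl * TRACKS_PER_CYL
    return total_tracks / write_rate_tracks_per_min

# The situation described above: ~1300 tracks/min, 6 logs of ~250 cylinders.
minutes = log_coverage_minutes(6, 250, 1300)
print(f"Active log coverage: {minutes:.0f} minutes")   # ~17 minutes

# Daily log volume at that write rate:
cyl_per_day = 1300 * 60 * 24 / TRACKS_PER_CYL
print(f"Daily log volume: {cyl_per_day:,.0f} cylinders")  # ~124,800
```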

 

In your situation, I’d recommend moving to at least twelve logs of 2,500 cylinders each (times two for the duplexed copies!).  That gives you about six hours of active log, which leaves plenty of time to react to alerts about your logs filling up.  You should also, if possible, set an alert to notify you as soon as two or more logs are waiting for the archive process, as that indicates archiving is falling behind logging.
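To turn that recommendation around — “how many logs do I need for a given cushion?” — here is a sketch of the inverse calculation, again assuming 15 tracks per cylinder (3390 geometry); the helper name is illustrative.

```python
import math

TRACKS_PER_CYL = 15  # 3390 DASD geometry

def logs_needed(target_hours, log_size_cyl, write_rate_tracks_per_min):
    """How many active logs of a given size cover a target time window."""
    tracks_needed = write_rate_tracks_per_min * target_hours * 60
    tracks_per_log = log_size_cyl * TRACKS_PER_CYL
    return math.ceil(tracks_needed / tracks_per_log)

# A full six hours at ~1300 tracks/min with 2,500-cylinder logs:
print(logs_needed(6, 2500, 1300))  # 13

# Twelve such logs land just under that target:
hours = 12 * 2500 * TRACKS_PER_CYL / 1300 / 60
print(f"{hours:.1f} hours")  # ~5.8 hours, i.e. "about six hours"
```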

 

If management calls on you to justify asking for larger logs, here are the kinds of questions you need to know the answers to.

 

Do your customers do updates of large amounts of data, using dynamic SQL?  Dynamic updating is by nature less predictable than static code, and customers will sometimes throw large updates into production with no warning (this has actually happened to me and indirectly caused a major outage).

 

How long does your longest transaction run?  Your average transaction?  Is a ROLLBACK or a bad SQLCODE near the end of your longest transaction going to force you to read from archive logs?  (Archive logs are not directly addressable and are therefore more expensive to read from than active logs.)

 

How many rollbacks do you experience per day?  How many of them currently require archive tapes, and how many tapes do they need?  How long do rollbacks take (average and longest)?  Is this an acceptable level of disruption to your business?  Are there real costs associated with rollbacks taking that long, and how much would faster rollbacks reduce those costs?

 

Do you allow dynamic SQL in production?  What is the largest possible update which you could encounter using dynamic SQL?  Again, what are the costs?  Remember that dynamic SQL can cause lock escalation unless you have explicitly disallowed escalation on key resources.

 

How tightly are your logs monitored?  If your logs stopped archiving because of a system hiccup, or were overwhelmed by a heavy update statement, how long would it take someone to notice, and would they know what to do (SET LOG SUSPEND, ARCHIVE LOG, QUIESCE SYSTEM, STOP DB2)?  Remember that if all active logs are full, DB2 updates stop cold until a log is archived.
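For reference, these are the kinds of console commands in play here (DB2 for z/OS; exact syntax and options vary by version, so check your Command Reference before putting any of this in a runbook):

```
-DISPLAY LOG                  show active log status, checkpoint, and offload activity
-ARCHIVE LOG                  force a log switch and start offload of the current log
-ARCHIVE LOG CANCEL OFFLOAD   cancel and restart a hung offload task
-SET LOG SUSPEND              temporarily halt update logging (resume promptly!)
-SET LOG RESUME               resume logging after a suspend
-STOP DB2                     last resort: bring the subsystem down cleanly
```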

 

What automatic logging, update, and unit-of-work-length alerts exist and what are your procedures for handling archiving storms or failures?
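As a sketch of what such an alert might look like, here is an illustrative rule in Python.  The data structure is a stand-in for whatever your monitoring tool actually collects (for example, parsed -DISPLAY LOG output); the dictionary keys and data set names are made up, though NOTREUSABLE is the status DB2 reports for an active log that has not yet been offloaded.

```python
# Illustrative alert rule: "archiving is falling behind logging."
# The input is a stand-in for parsed monitoring data; field names are invented.

def archive_backlog_alert(active_logs, threshold=2):
    """Return an alert string if `threshold` or more logs await offload, else None."""
    waiting = [log for log in active_logs if log["status"] == "NOTREUSABLE"]
    if len(waiting) >= threshold:
        names = ", ".join(log["dsname"] for log in waiting)
        return f"ALERT: {len(waiting)} active logs awaiting archive: {names}"
    return None

# Example: two of three logs still waiting on offload trips the alert.
logs = [
    {"dsname": "DSNC.LOGCOPY1.DS01", "status": "REUSABLE"},
    {"dsname": "DSNC.LOGCOPY1.DS02", "status": "NOTREUSABLE"},
    {"dsname": "DSNC.LOGCOPY1.DS03", "status": "NOTREUSABLE"},
]
print(archive_backlog_alert(logs))
```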

 

You DON’T HAVE TO HAVE GOOD ANSWERS TO ALL THESE QUESTIONS RIGHT AWAY.  If your system runs smoothly, investigate this stuff _quietly_ and don’t immediately panic – management hates the sound of panic, especially if they can’t smell the smoke.

 

That’s all I have off the top of my head.  Hope it helps.

 

--Phil Sevetson

 


Tags: Archive, Best-Practices, DB2, DB2-z/OS, Log, Useful-Questions
