ChannelDB2

A Fast Way to Get Started with Cache Tables (Distributed Caching)

I often get asked if it's easy to set up IBM cache tables and whether they can be set up with scripts instead of a GUI  The answer I give is yes to both questions.  Cache tables, a form of distributed caching, are very easy to set up and you can do it with scripts. The challenges are that (1) IBM's documentation could be better and (2) a good sample is needed. I'm hoping a few pictures, a little text, and a small sample are all that's needed to improve the situation :)


So, What's a Cache Table?


I find this easier to answer by taking a quick look at data replication. You'll see why.


The most basic replication scenario is where you build and maintain a copy of one system's data on another system. The copy is kept current by (1) capturing changed data from a log and (2) applying the changes to a target. People often use this scenario when they want to move query workloads, usually reporting, to another system and eliminate the impact (CPU usage, locking, whatever) on critical production work.

 


Inevitably, some soon find they get a requirement to allow changes to the replicated data and then have those changes reflected back in the original source. One solution is to establish two way replication between the source and other systems. You'll see these scenarios described with terms like bidirectional, peer-to-peer, and update-anywhere replication.


However, what if you expect only infrequent changes from a secondary system? Or, maybe you prefer to keep updates centralized on the original source and avoid the conflict detection and resolution that may be required with multi-directional replication. That's where cache tables come in.


A cache table is a type of replicated table that satisfies queries using replicated data, but redirects inserts, updates, and deletes back to the original source. The 'redirection' is transparent to applications.


What Makes This Work?


Cache tables combine three database functions:

 

- Nicknames (part of IBM's data federation/virtualization technology)

- Materialized query tables (MQTs)

- Unidirectional (one-way) data replication

 

A nickname is a 'virtual' table in one database that points to a physical table in another database. SQL can reference the nickname anywhere a table can be referenced. For example, you can use them in inserts, updates and deletes where you specify which table to make changes to. Under the covers, the database redirects the SQL request to the physical table in the other database.

 

I have a harder time defining MQTs :) However, for the needs of this post, we'll limit the definition to - an MQT is a target table for data replication and it is associated with a nickname. Note that, in the scenario, you issue SQL against the nickname, not the MQT.

 

Replication provides the initial load of data for the MQT and keeps it current with changed data. When a query is issued against the nickname, it is satisfied using the contents of the MQT. When an insert, update or delete is issued against the nickname, the statement is routed to the source system. Replication will copy the change back to the MQT. This allows queries to be offloaded from the source while centralizing changes.


 

That's it.


A Good Sample

 

You can find a good sample in the IBM developerWorks exchange. I call it 'good' because it's simple and small :)

 

It uses a DB2 sample database and another that you create. The second will be on the same system as your sample database (same system for ease of learning). It has one script to create a nickname and MQT. It has another to set up SQL Replication between your databases. It also shows how to start the replication programs and add more cache tables.

 

To see it work, connect to the second, 'cache', database after starting the replication programs. Issue an update statement against the nickname, changing the value of one column. Once the update is complete, issue "select *" against the nickname. In all likelyhood, you will not see your updated value in the result set. This is because there will be likely be replication latency between the source table and target (MQT cache) table when you use SQL Replication. If you need very low (e.g., subsecond) latency between the source and target tables, use Q Replication... But, that's a future topic...


 

--

Views: 963

Tags: cache tables, data replication, data virtualization, distributed caching, eii, fast start, federation

Comment

You need to be a member of ChannelDB2 to add comments!

Join ChannelDB2

Try BLU Acceleration on Cloud

© 2014   Created by channeldb2.

Badges  |  Report an Issue  |  Terms of Service