ChannelDB2

Where I work we are contemplating using MQTs (Materialized Query Tables) for aggregating data on a monthly basis. We have had a debate as to whether this technology would be a good fit or whether we should be creating tables and writing a COBOL program to load the data once a month. Details:

1. We'll be loading millions of rows on a monthly basis.
2. The previous month's data will be deleted when the new month's data is loaded.
3. The data can be static during the month or stated differently, the data only needs to change once a month.
4. There is a debate that the load process, which could use a load utility could be more expensive if MQTs are used. I don't believe this is true, but am open to comments.
5. I have used MQTs on LUW for one project, but not on z/OS. They worked fine, but it seemed to me that we could have just defined tables and written load scripts to load the data in the tables to achieve the same result.

I don't know whether this is true, but it appears to me that DB2's optimizer can choose to retrieve data from an MQT (for instance, aggregated data) when the SQL is written to select summarized data from the base table. If this is true then it sounds like MQTs can increase performance of existing SQL. SQL which did not have to be rewritten to access the MQT directly?

The safe thing to do is to just define tables and load them every month. However, I'm wondering whether there are benefits that I cold be missing out on by using this approach. It seems to me that the load process would be less expensive with an MQT since a program would not be required to access the database to load the data. It seems to me that the refresh script would run and teh data load would occur on the database side.

Thoughts and experiences are appreciated.

Views: 1924

Reply to This

Replies to This Discussion

Hi Brian

From my experience an MQT will have no impact at all on the process of updating the underlying table. You have the choice between a system maintained or user maintained MQT, the difference being that with a system maintained MQT you refresh the contents by using the REFRESH TABLE statement, but with a user maintained MQT it's up to you how you populate the MQT.

You are correct that the beauty of an MQT is that it's completely invisible to the user/application. The optimizer looks for a similarity between the full select specified as the basis for the MQT and the SQL statement at hand and then follows certain rules to determine if the MQT will help. It's really a very clever and elegant approach. There is a temptation to write queries directly against the MQT, but that completely misses the point in my opinion. The application should construct such SQL statements as it needs to fulfill its functional requirements, and appropriate MQTs (with their own indexes) can then be created as needed to improve performance.

The key benefit to an MQT is that all queries are written against a single physical model; a single set of base tables. If pre-aggregation or filtering of data would help performance you introduce them as MQTs without then having to point your application at a different set of tables. New MQTs can be added as time passes and you uncover new opportunities for optimization but the application never has to change to take advantage of them. You can indeed define an MQT on an existing table and see an improvement in existing SQL without doing anything more than rebinding.

If you opt to have MAINTAINED BY USER MQTs then you can populate them using load utility, SQL statements, programs, or anything you like. There's no difference between populating a maintained by user mqt and any other type of table.

Hope this helps!

Kevin
Kevin:

Do you know whether there are limitations to using MQTs on z/OS similar to the following? Please understand that I am asking, not making factual statements. A DBA where I work has expressed concern that z/OS MQTs will experience trouble or fail due to consitions similar to these.

1. Can MQTs aggregate data from base tables wherein the base tables contain indexes?
2. Can MQTs aggregate data from base tables wherein one or more of the base tables are partitioned?

Thanks.
Brian

Absolutely MQTs may be based on tables having indexes and absolutely MQTs can source data from partitioned tables.

I'm a customer, not an IBM developer, but I believe that MQTs are leveraged during the optimization/query re-write phase. DB2 examines the catalog of existing MQT definitions to see if one of them is based upon SQL that is sufficiently similar to the current query to be worth using; if so then the MQT is read rather than the underlying tables. The MQT needs to carry all the columns being requested in the query, and the group by clauses need to be compatible. The underlying physical characteristics of the tablespace really don't enter into it.

MQTs are materialized as tables that are indistinguishable from any user created table. The MQT tablespace itself can be partitioned, it can have indexes. The only difference is that DB2 "knows" that the table representing the MQT is related to the base tables by way of an SQL statement, and can be used as a shortcut to trawling through the base tables if the same level of aggregation is requested.

Hope this helps!

Kevin
Hi Kevin,

We are in some what similar situation. we have a 'transaction data base' and we need a requirement to build an 'operational data mart'. We are planning to have 'stage database' where we can write our complex queries/aggregations etc so we want to bring 'transactional' data into 'staging' database.

In order to 'transfer' the data into 'Staging' we are thinking of using 'MQT'. MQT would be one-on-one for the underlying source (transactional) data table. I have following specific questions:

(1) can an MQT be manually/systematically 'refreshed' while transactions (insert/update/delete) happening on the 'source data table' ? Does MQT lock the 'source table' during the refresh?

(2) can an MQT be located in 'different database' (physically)?

(3) we want to refresh 'MQT' every hour in order to get the latest up-to-date data. Does MQT fit for this type of requirement?

I really appreciate your inputs on this. I am referring to z/OS 9.1.

Thanks, Kiran
Kirin:

(1) MQTs can be refreshed IMMEDIATE or DEFERRED. Refresh IMMEDIATE will update the MQT when the data in the base table is updated. Refresh DEFERRED will update the MQT only when the REFRESH TABLE command is issued. I am not clear on the locking situation with the base tables when the MQTs are refreshed.

(2) An MQT can be located in a different database than the base tables. I am not sure whether you’d need to use Data Federation or Information Integrator to have an MQT located on a different subsystem than the base tables though.

(3) There would be some considerations if you want to refresh the data every hour. Obviously, you could have it REFRESH IMMEDIATE and the data would be up to date all of the time. With REFRESH DEFERRED and if the base tables are large or if the MQT SQL takes a while to complete your MQT tables will be unavailable during that period of time every hour. As I stated before I am not certain how the base table locking behaves while the MQT is being updated. I’m pretty sure the base tables are not exclusively locked, but there’s probably page or row level locking of some kind.

Brian
Thank you Brian. I will explore if 'REFRESH' causes any lock to 'source' table.

Logically it should NOT 'LOCK' source table, when we create 'STAGING' table (to track the changes to BASE table) and when REFRESH happens, it should only read 'STAGING' tables and not the 'BASE' tables.

Kiran

Hi Guyz... please use with UR option in the Create MQT command .... this will ensure the REFRESH always used this isolation level.

The Select part in the CREATE MQT command needs to use UNCOMMITTED READ.

RSS

Featured Downloads

Try BLU Acceleration on Cloud

© 2019   Created by channeldb2.   Powered by

Badges  |  Report an Issue  |  Terms of Service