Wednesday, March 19, 2014

Improve SQL grouping performance with Db2 EVIs

In many SQL queries the "GROUP BY" clause is used to aggregate some values. For example a manager wants to see sales for a specific year grouped by region. The SQL for this query looks like this:

SELECT REGION, SUM(AMOUNT) FROM INVOICES WHERE YEAR=2013 GROUP BY REGION

The Db2  optimizer implements the query with a table scan and a temporary hash table.


The runtime of the query is on our 4 Core box with SSDs and 64 GB main memory 620 ms, which is pretty slow for such a simple query. Fortunately to improve the performance we can create an EVI (Encoded Vector Index) with precalculated aggregate values in the symbol table. This is a new feature in V7R1. The Command to create this index is

CREATE ENCODED VECTOR INDEX INVOICESEVI ON INVOICES (YEAR  ASC, REGION ASC) INCLUDE(SUM(AMOUNT)) ;

After creation of the index the query optimizer decides to use the EVI and re implement the access plan with  an EVI Symbol Table Probe and a temporary hash table. This is much more efficient then the table scan before. For example while the table scans returns over 200.000 rows for which a Hash table has to be built the EVI Symbol Table Probe returns only 34 rows. The query with the EVI runs on our machine in under 10 ms. This is quite an improvement compared to the 620 ms without an EVI.
So if you use grouping in your query you should have a look at the new include aggregates functionality in EVI's in IBM i V7R1.

Reference:

Index grouping implementation in the IBM i Infocenter
Encoded vector indexes

No comments:

Post a Comment

ad