DB2 REORG - from the view of application developer
DB2 REORG is not a very common topic for a classic application developer. But if you are developing your application that runs on top of DB2, and unfortunately you have to write your own database migration scripts, eventually you will encounter the situation in which some data tables of your application need a REORG. Now I want to introduce some simple common sense about DB2 REORG that can help developing database migration scripts.
What is REORG in DB2
To keep it simple: REORG is a database operation that is executed in DB2 to reorganize the table spaces or index spaces. The purpose of the reorganization is data clustering. This operation reconstructs the data in the table spaces or index spaces, so that the data is unfragmented and physically continuous. The benefit we gain is that the performance of database stays tuned.
There are two types of REORG: Classic REORG and In Place REORG. Basically classic REORG creates a temporary copy of the table / index, then replaces the original one with the organized copy. During the classic REORG we do not have access to the table. On the other side, the in place REORG allows full access to the table, because the operation incrementally reorganizes the table.
When to run a REORG in DB2
When REORGCHK indicates the need of REORG
Normally after a large amount of INSERT, UPDATE, DELETE activities on data tables, we generally need a REORG. Why? First of all a table space in DB2 is pretty much like the file system on your hard disk. If we insert many records, there might not be enough free space to keep the records in the clustered sequence. If we delete many records, the table still occupies the disk space of the deleted records. Then the table space gets more and more fragmented. Over time such small fragments will affect the performance of the database, unless a REORG is executed. But this is normally the job of DBAs who are constantly monitoring the status of the database.
You can use the following command in DB2 to find out which table needs a REORG. In the REORGCHECK DB2 checks whether the calculated results are still within the set bounds of corresponding formulas. For more information you can read the DB2 manual for REORGCHK.
db2"reorgchk current statistics on table all"
When REORG_PENDING shows up
Nevertheless, there are certain operations in DB2 which lead to the REORG_PENDING state on the changed tables. REORG_PENDING state indicates that we either must or should reorganize the data table, because DB2 will limit the access to the tables in that state. Dropping a column in table for example is one of the REORG_PENDING operations. Since we are application developers and we migrate our databases using Flyway scripts, there will be plenty of statements that can trigger the REORG_PENDING. Then we must add a REORG after such statements. In this case, it is our (in my opinion, developers) responsibility to add a REORG statement into the scripts, when we check the scripts and find out that they lead to REORG_PENDING state.
There is a simple SQL query that can tell us whether a table is in REORG_PENDING state.
If you just drop a column in the table ‘MYTABLE’, then run the SQL query from above, you will see the following result. The value of column REORG_PENDING is ‘Y’. It means that you should be running a REORG on this table.
After a table space reorganization the fragments are moved around and clustered, however the database still utilizes the old statistic to query the data. So it is recommended to run a RUNSTATS on the table. This command updates the statistic information. Besides it makes sure that the DB2 optimizer takes advantage of the refreshed statistic information.
As I mentioned above a developer does not need to constantly supervise the Database (DB2 in my case), because it is not his job to maintain the health of the Database in terms of data reorganization. But a developer who is using Flyway should know whether his migration scripts are going to cause a ‘REORG’ pending on some data tables. Of course to make sure your scripts are always REORG free or to identify the REORG needing scripts will cost extra time and more efforts in the development. In my experience the cost of not taking care of such issue in the development stage is higher. Obviously it is more difficult to fix something on high level testing environment than on the local development environment.