
Chapter 8 Indexing for Table Joins

When all the high-water marks have been refreshed, the average FETCH may require 1000 sequential touches to the first index but only one touch, normally random, to the second index. Thus, the QUBE for a transaction with 20 FETCHes would be:

20 × 10 ms + 20,000 × 0.01 ms + 20 × 0.1 ms = 0.4 s
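Term by term, using the standard QUBE unit costs (10 ms per random touch, 0.01 ms per sequential touch, 0.1 ms per FETCH call):

    20 random touches         × 10 ms   = 200 ms
    20,000 sequential touches × 0.01 ms = 200 ms
    20 FETCH calls            × 0.1 ms  =   2 ms
                                  Total = 402 ms ≈ 0.4 s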

Cost of Denormalization

The biggest performance concern is normally the I/O time required to update the redundant data added to the table and to one of its indexes. With downward denormalization, a large number of index rows may have to be moved, which may make a simple UPDATE very slow. Upward denormalization is not as likely to cause I/O bursts due to a single update, but many INSERTs, UPDATEs, and DELETEs may cause a few extra disk I/Os to the parent table and to one of its indexes. In extreme cases, say more than 10 INSERTs or UPDATEs per second, the disk drive load created by these I/Os can be an issue.
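To see the effect, consider a hypothetical downward denormalization in which the customer name CNAME has been copied from the CUST table into each ORDERS row and carried in an index such as (CNAME, ONO); the table, column, and host-variable names here are illustrative only:

    -- Renaming one customer is a single logical change, but it updates
    -- every order row for that customer, and each affected entry in the
    -- (CNAME, ONO) index must be deleted and reinserted at its new position.
    UPDATE orders
    SET    cname = :new_name
    WHERE  cno   = :cno;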

Nested-Loop Join and MS/HJ Versus Denormalization

It is understandable that many database specialists are reluctant to add redundant columns to operational tables. Denormalization is not only a trade-off between retrieval speed and update speed; it is also, to some extent, a trade-off between performance and data integrity, even if triggers are used to maintain the redundant data. But when nested loop causes too many random reads and MS/HJ consumes too much CPU time, denormalization may appear to be the only option available. Nevertheless, before taking this drastic step, it is of great importance to ensure that everything that could be done to avoid it has been done. This means being absolutely sure that the best fat indexes for NLJ or MS/HJ have been considered, and making any troublesome indexes resident in memory; purchasing additional memory to keep more indexes resident, or new disks with fast sequential read, might push the break-even point for denormalization just that little bit further away. This is without doubt a difficult balancing act. We do not want to give the impression that non-BJQ queries are always easy to fix with superfast sequential scans. The figures in our case study might suggest this, but real-life tables are often much larger, frequently containing over 100 million rows; with high transaction rates, CPU time remains a very important consideration.
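A fat index in this sense is one that contains every column the query needs from the table, so that the nested-loop probes (or the MS/HJ scan) are satisfied by index-only touches. A hypothetical example, with invented column names:

    -- Fat index for the inner table of a nested-loop join: the join column
    -- CNO leads, and the remaining columns cover the SELECT list, so the
    -- join never has to touch the table itself.
    CREATE INDEX orders_fat_ix
        ON orders (cno, odate, ono, total);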

Unconscious Table Design

From the performance point of view, it is more difficult to understand why so many databases have tables that have a 1:1 or 1:C (C = conditional; 0 or 1) relationship, as shown in Figure 8.23. Why create four tables instead of one CUST table? Flexibility is not an issue when the relationship can never become 1:M. In this example a customer
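The one-table alternative simply folds the conditional attributes into nullable columns; a minimal sketch, with invented column names rather than those of Figure 8.23:

    -- One CUST table instead of four: the columns of the 1:1 and 1:C
    -- tables become columns of CUST, nullable where the piece is optional.
    CREATE TABLE cust (
        cno           INTEGER       NOT NULL PRIMARY KEY,
        cname         VARCHAR(40)   NOT NULL,
        city          VARCHAR(30),
        credit_limit  DECIMAL(11,2),  -- NULL when the customer has none
        discount_pct  DECIMAL(5,2)    -- NULL when no discount applies
    );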

