Tukey's Bulging Rule for Straightening Data
Bruce Ratner, Ph.D.

"A very effective and simple technique for straightening data is re-expressing the variables, which uses Tukey’s Ladder of Powers and the Bulging Rule. Before presenting the details of the technique, it is worth discussing the importance of straight-line relationships or straight data."
- Ratner, B., Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data, CRC Press, Boca Raton, 2006. The following is an excerpt from Chapter 3, pages 39-41.

3.5.2 Bulging Rule

The Bulging Rule states the following:

    1. If the data have a shape similar to that shown in the first quadrant, then the data analyst tries re-expressing by going up-ladder for X, Y, or both.
    2. If the data have a shape similar to that shown in the second quadrant, then the data analyst tries re-expressing by going down-ladder for X, and/or up-ladder for Y.
    3. If the data have a shape similar to that shown in the third quadrant, then the data analyst tries re-expressing by going down-ladder for X, Y, or both.
    4. If the data have a shape similar to that shown in the fourth quadrant, then the data analyst tries re-expressing by going up-ladder for X, and/or down-ladder for Y.
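
The four rules above can be sketched as a small lookup table plus a search over the ladder. This is only an illustration under stated assumptions, not the book's procedure: the rungs in LADDER, the function names, and the use of the absolute Pearson correlation as a measure of straightness are all hypothetical choices made for this sketch. The toy data follow y = sqrt(x), which is assumed to match the second-quadrant shape (bulging toward the upper left) in the usual Bulging Rule diagram.

```python
import math

# Hypothetical ladder of powers, ordered from down-ladder to up-ladder.
# Power 0 stands for the logarithm; power 1 is the raw data.
LADDER = [-2, -1, -0.5, 0, 0.5, 1, 2, 3]

# Bulging Rule: quadrant -> (direction for X, direction for Y).
BULGING_RULE = {
    1: ("up", "up"),
    2: ("down", "up"),
    3: ("down", "down"),
    4: ("up", "down"),
}

def reexpress(values, power):
    """Apply one rung of the ladder; assumes strictly positive values."""
    if power == 0:
        return [math.log(v) for v in values]
    return [v ** power for v in values]

def candidate_powers(direction):
    """Rungs from the raw data (power 1) onward in the given direction."""
    if direction == "up":
        return [p for p in LADDER if p >= 1]
    return [p for p in LADDER if p <= 1]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def straighten(xs, ys, quadrant):
    """Return (|r|, power for X, power for Y) with the largest |r|."""
    x_dir, y_dir = BULGING_RULE[quadrant]
    best = None
    for px in candidate_powers(x_dir):
        for py in candidate_powers(y_dir):
            r = abs(pearson_r(reexpress(xs, px), reexpress(ys, py)))
            if best is None or r > best[0]:
                best = (r, px, py)
    return best

# Toy data: y = sqrt(x), assumed to be a second-quadrant shape.
xs = [1, 4, 9, 16, 25, 36]
ys = [math.sqrt(x) for x in xs]
r, px, py = straighten(xs, ys, quadrant=2)
print(f"best |r| = {r:.6f} with X power {px} and Y power {py}")
```

For this toy data, either going down-ladder for X (square root) or going up-ladder for Y (square) straightens the relationship, so the search only confirms that the rule points in a useful direction. A plot of the re-expressed data remains the real check.
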

Re-expressing is an important yet fallible part of EDA detective work. While it will typically straighten the data, it can also degrade the information the data carry. Here is why: going too far down the ladder can squeeze the data so much that its values become indistinguishable, resulting in a loss of information. Going too far up the ladder can pull the data apart so much that the new, far-apart values lie within an artificial range, resulting in a spurious gain of information. ... An excellent real-case illustration follows (pages 41-50 in the book).
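
The squeezing and pulling-apart effects can be seen on a handful of made-up numbers. The values, the powers -2 and 3, and the three-decimal rounding (an arbitrary stand-in for limited precision) are assumptions chosen only for illustration.

```python
# Made-up, strictly positive values spanning several orders of magnitude.
values = [10, 100, 1000, 10000]

def spread(vals):
    """Range (max minus min) of a list of numbers."""
    return max(vals) - min(vals)

down_too_far = [v ** -2 for v in values]  # far down-ladder: reciprocal square
up_too_far = [v ** 3 for v in values]     # far up-ladder: cube

# Count how many distinct values survive rounding to three decimals.
squeezed_distinct = len({round(v, 3) for v in down_too_far})

print("distinct raw values:", len(set(values)))
print("distinct values after power -2, rounded:", squeezed_distinct)
print("spread of raw values:", spread(values))
print("spread after power 3:", spread(up_too_far))
```

After the reciprocal-square re-expression, the three largest original values all round to zero, so most of the distinctions in the data are lost. After cubing, the values are stretched over a range far wider than the original one.
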

For more information about this article, call Bruce Ratner at 516.791.3544 or 1 800 DM STAT-1; or e-mail at br@dmstat1.com.