Tukey's Bulging Rule for Straightening Data
Bruce Ratner, Ph.D.
"A very effective and simple technique for straightening data is re-expressing the variables, which uses Tukey’s Ladder of Powers and the Bulging Rule. Before presenting the details of the technique, it is worth discussing the importance of straight-line relationships or straight data." - Ratner, B., Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data, CRC Press, Boca Raton, 2006. The following is an excerpt from Chapter 3, pages 39-41.
3.5.2 Bulging Rule
The Bulging Rule states the following:
- If the data have a shape similar to that shown in the first quadrant, then the data analyst tries re-expressing by going up-ladder for X, Y, or both.
- If the data have a shape similar to that shown in the second quadrant, then the data analyst tries re-expressing by going down-ladder for X, and/or up-ladder for Y.
- If the data have a shape similar to that shown in the third quadrant, then the data analyst tries re-expressing by going down-ladder for X, Y, or both.
- If the data have a shape similar to that shown in the fourth quadrant, then the data analyst tries re-expressing by going up-ladder for X, and/or down-ladder for Y.
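The four cases above can be sketched as a small lookup plus a power re-expression. This is a minimal illustration and not code from the book: the rungs in `LADDER`, the names `BULGE_DIRECTIONS`, `candidate_powers`, and `reexpress`, the sign flip for negative powers, and the sample data are all assumptions made for the example.

```python
import math

# Assumed ladder of powers, ordered from down-ladder to up-ladder.
# Power 1 is the raw data; power 0 stands for the logarithm.
LADDER = [-2, -1, -0.5, 0, 0.5, 1, 2, 3]

# Directions suggested by the Bulging Rule per quadrant: (X direction, Y direction).
BULGE_DIRECTIONS = {
    1: ("up", "up"),
    2: ("down", "up"),
    3: ("down", "down"),
    4: ("up", "down"),
}


def candidate_powers(direction):
    """Rungs to try, nearest first, when moving away from the raw data (power 1)."""
    if direction == "up":
        return [p for p in LADDER if p > 1]
    return [p for p in reversed(LADDER) if p < 1]


def reexpress(values, power):
    """Re-express strictly positive values using one rung of the ladder."""
    if any(v <= 0 for v in values):
        raise ValueError("this sketch assumes strictly positive values")
    if power == 0:
        return [math.log(v) for v in values]
    # Assumption: negate results for negative powers so the ordering is preserved.
    sign = 1 if power > 0 else -1
    return [sign * v ** power for v in values]


# Hypothetical data, assumed to resemble the fourth-quadrant shape.
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]
x_dir, y_dir = BULGE_DIRECTIONS[4]
for p in candidate_powers(y_dir):
    print(p, reexpress(y, p))
```

Each printed row is one candidate re-expression of Y. The analyst would still plot each candidate against X and judge straightness by eye or by a fit statistic, since the rule only suggests a direction and not a specific rung.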
Re-expressing is an important, yet fallible, part of EDA detective work. While it will typically result in straightening the data, it might result in a deterioration of information. Here is why: re-expression (going down too far) has the potential to squeeze the data so much that its values become indistinguishable, resulting in a loss of information. Expansion (going up too far) can potentially pull apart the data so much that the new far-apart values lie within an artificial range, resulting in a spurious gain of information. ... An excellent real-case illustration follows (pages 41-50 in the book).
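The squeezing and pulling apart described above can be seen on a few numbers. The sketch below is only an illustration under stated assumptions: the three raw values are hypothetical, and rounding to three decimal places is an arbitrary stand-in for "indistinguishable", not a criterion taken from the book.

```python
# Hypothetical raw values.
values = [100.0, 200.0, 400.0]

# Going far down-ladder (reciprocal square) squeezes the values together.
squeezed = [v ** -2 for v in values]
# Assumed precision of three decimals: the re-expressed values now coincide.
squeezed_rounded = [round(s, 3) for s in squeezed]

# Going far up-ladder (cube) pulls the values apart.
stretched = [v ** 3 for v in values]
raw_range = max(values) - min(values)
stretched_range = max(stretched) - min(stretched)

print("squeezed, rounded:", squeezed_rounded)
print("raw range:", raw_range, "stretched range:", stretched_range)
```

Comparing the rounded down-ladder values and the two ranges gives a quick check on whether a chosen rung has gone too far in either direction.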
For more information about this article, call Bruce Ratner at 516.791.3544 or 1 800 DM STAT-1; or e-mail at br@dmstat1.com.