Noise bottleneck, by nassim taleb by Mario Alfredo Grajales Leal

The Noise Bottleneck or How Noise Explodes Faster than Data
(Very Brief Note for the Signal Noise Section in Antifragile)
Nassim N. Taleb
August 25, 2013

The paradox is that an increase in sample size magnifies the role of noise (or luck).

Keywords: Big Data, Fooled by Randomness, Noise/Signal

PRELIMINARY DRAFT

Introduction

It has always been absolutely silly to be exposed to the news. Things are worse today thanks to the web. We are getting more information, but with constant "consciousness", "desk space", or "visibility". Google News, Bloomberg News, etc. have space for, say, <100 items at any point in time. But there are millions of events every day. As the world is more connected, with the global dominating over the local, the number of sources of news is multiplying. But your consciousness remains limited. So we are experiencing a winner-take-all effect in information: like a large movie theatre with a small door.

Likewise we are getting more data. The size of the door is remaining constant, the theater is getting larger. The winner-take-all effects in information space correspond to more noise, less signal. In other words the spurious dominates.

Similarity with the Fooled by Randomness Bottleneck

This is similar to my idea that the more spurious returns dominate finance as the number of players gets large, and swamp the more solid ones. Start with the idea (see Taleb 2001) that as a population of operators in a profession marked by a high degree of randomness increases, the number of stellar results, and stellar for completely random reasons, gets larger. The "spurious tail" is therefore the number of persons who rise to the top for no reasons other than mere luck, with subsequent rationalizations, analyses, explanations, and attributions. The performance in the "spurious tail" is only a matter of number of participants, the base population of those who tried. Assuming a symmetric market, if one has for base population 1 million persons with zero skills and ability to predict starting Year 1, there should be 500K spurious winners Year 2, 250K Year 3, 125K Year 4, etc. One can easily see that the size of the winning population in, say, Year 10 depends on the size of the base population Year 1; doubling the initial population would double the straight winners.
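The halving arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, assuming a symmetric market in which each zero-skill operator wins with probability 1/2 in any given year; the function name `spurious_winners` is hypothetical and not part of the note.

```python
def spurious_winners(base_population, years):
    """Expected number of zero-skill operators with an unbroken winning record.

    Assumption: symmetric market, so half of the remaining winners
    drop out each year purely by chance.
    Index 0 corresponds to Year 1 (the base population).
    """
    return [base_population / 2 ** t for t in range(years)]


base = spurious_winners(1_000_000, 10)
doubled = spurious_winners(2_000_000, 10)

# Years 1 to 4 for a base population of 1 million.
print(base[:4])

# Year 10: the count scales linearly with the base population.
print(base[9], doubled[9])
```

The Year 10 figure depends only on the size of the starting population, which is the point of the "spurious tail".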
Injecting skills in the form of better-than-random abilities to predict does not change the story by much. (Note that this idea has been severely plagiarized by someone, about which a bit more soon.) Because of scalability, the top, say 300, managers get the bulk of the allocations, with the lion's share going to the top 30. So it is obvious that the winner-take-all effect causes distortions: say there are m initial participants and the "top" k managers selected, the result will be $k/m$ managers in play. As the base population gets larger, that is, N increases linearly, we push into the tail probabilities.

Here read skills for information, noise for spurious performance, and translate the problem into information and news.

The paradox: This is quite paradoxical as we are accustomed to the opposite effect, namely that a large increase in sample size reduces the effect of sampling error; here the narrowness of M puts sampling error on steroids.
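A small simulation may help make the bottleneck concrete. The sketch below assumes, purely for illustration, that all m participants have zero skill and unit-scale Gaussian performance, and that k = 30 of them are selected; `mean_of_top_k` is a hypothetical helper, and the fixed seed is only there for reproducibility.

```python
import random
import statistics


def mean_of_top_k(m, k, rng):
    """Draw m pure-noise (zero-skill) performances and average the k best."""
    draws = [rng.gauss(0.0, 1.0) for _ in range(m)]
    return statistics.fmean(sorted(draws, reverse=True)[:k])


rng = random.Random(0)  # fixed seed, for reproducibility only
k = 30                  # size of the "door": constant number of selected managers

results = {m: mean_of_top_k(m, k, rng) for m in (1_000, 10_000, 100_000)}
for m, top in results.items():
    print(f"m={m:>7}  k/m={k / m:.5f}  mean of top {k}: {top:.2f}")
```

With k held fixed, the average performance of the selected group should rise as m grows, even though true skill is zero throughout: selection pushes further into the tail of the noise.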

Noise Bottleneck.nb


Derivations

Let $Z \equiv \left(z_i^j\right)_{1<j<m,\ 1\le i<n}$ be an $(n \times m)$ sized population of variations, $m$ population series and $n$ data points per distribution, with $i, j \in \mathbb{N}$; assume "noise" or scale of the distribution $\sigma \in \mathbb{R}^+$, signal $\mu \ge 0$. Clearly $\sigma$ can accommodate distributions with infinite variance, but we need the expectation to be finite. Assume i.i.d. for a start.

Cross Sectional (n = 1)

Special case n = 1: we are just considering news/data without historical attributes. Let $F^{\leftarrow}$ be the generalized inverse distribution, or the quantile,

$$F^{\leftarrow}(w) = \inf\{t \in \mathbb{R} : F(t) \ge w\},$$

for all nondecreasing distribution functions $F(x) \equiv P(X < x)$. For distributions without compact support, $w \in (0,1)$; otherwise $w \in [0,1]$. In the case of continuous and increasing distributions, we can write $F^{-1}$ instead.

The signal is in the expectation, so $E(z)$ is the signal, and $\sigma$, the scale of the distribution, determines the noise (which for a Gaussian corresponds to the standard deviation). Assume for now that all noises are drawn from the same distribution.

Set the probability "threshold" at $\zeta = \frac{k}{m}$, where $k$ is the size of the window of the arrival. Since we assume that $k$ is constant, it matters greatly that the quantile covered shrinks with $m$.

Gaussian Noise

When we set $\zeta$ as the reachable noise, the quantile becomes:

$$F^{-1}(w) = -\sqrt{2}\,\sigma\,\operatorname{erfc}^{-1}(2w) + \mu,$$

where $\operatorname{erfc}^{-1}$ is the inverse complementary error function.

Of more concern is the survival function, $\Phi \equiv \bar{F}(x) \equiv P(X > x)$, and its inverse $\Phi^{-1}$:

$$\Phi^{-1}_{\sigma,\mu}(\zeta) = \sqrt{2}\,\sigma\,\operatorname{erfc}^{-1}\!\left(2\,\frac{k}{m}\right) + \mu \qquad (1)$$

Note that $\sigma$ (noise) is multiplicative, whereas $\mu$ (signal) is additive. As information increases, $\zeta$ becomes smaller, and $\Phi^{-1}$ moves away in standard deviations. But this is nothing yet by comparison with fat tails.
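As a numerical illustration of how the Gaussian inverse survival function moves out as $\zeta = k/m$ shrinks, here is a minimal sketch using only the Python standard library. The window size k = 100 and the values of m are assumptions chosen for illustration, and the helper names are hypothetical. The inverse is computed with `statistics.NormalDist.inv_cdf` and cross-checked against the survival probability written with `math.erfc`.

```python
import math
from statistics import NormalDist


def gaussian_inverse_survival(zeta, sigma=1.0, mu=0.0):
    """Return x such that P(X > x) = zeta for X ~ Normal(mu, sigma)."""
    # inv_cdf(1 - zeta) is the standard normal level exceeded with probability zeta.
    return mu + sigma * NormalDist().inv_cdf(1.0 - zeta)


def gaussian_survival(x, sigma=1.0, mu=0.0):
    """P(X > x), written with the complementary error function."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))


k = 100  # assumed window size, held constant
for m in (10**3, 10**4, 10**5, 10**6):
    zeta = k / m
    # One level per noise scale sigma = 1, 2, 3, 4 (signal mu = 0).
    levels = [gaussian_inverse_survival(zeta, sigma=s) for s in (1, 2, 3, 4)]
    print(f"m={m:>8}  zeta={zeta:.0e}  " + "  ".join(f"{x:6.2f}" for x in levels))
```

Each row scales linearly with the noise parameter, while a nonzero signal would only shift every entry by the same constant.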


Figure 1: Gaussian, $\sigma = \{1, 2, 3, 4\}$ (plot of $F^{\leftarrow}$; horizontal axis from 0.02 to 0.10).

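Fat Tailed Noise

Now we take a Student T Distribution as a substitute for the Gaussian.

$$f(x) \equiv \frac{\left(\dfrac{\alpha}{\alpha + \frac{(x-\mu)^2}{\sigma^2}}\right)^{\frac{\alpha+1}{2}}}{\sqrt{\alpha}\,\sigma\,B\!\left(\frac{\alpha}{2}, \frac{1}{2}\right)} \qquad (2)$$

From this we can get the inverse survival function.

$$\gamma^{-1}_{\sigma,\mu}(\zeta) = \mu + \sqrt{\alpha}\,\sigma\,\operatorname{sgn}(1 - 2\zeta)\sqrt{\frac{1}{I^{-1}_{(1,\,(2\zeta-1)\operatorname{sgn}(1-2\zeta))}\!\left(\frac{\alpha}{2}, \frac{1}{2}\right)} - 1} \qquad (3)$$

Where $I$ is the generalized regularized incomplete Beta function $I_{(z_0,z_1)}(a,b) = \frac{B_{(z_0,z_1)}(a,b)}{B(a,b)}$, and $B_z(a,b)$ the incomplete Beta function $B_z(a,b) = \int_0^z t^{a-1}(1-t)^{b-1}\,dt$. $B(a,b)$ is the Euler Beta function $B(a,b) = \Gamma(a)\,\Gamma(b)/\Gamma(a+b) = \int_0^1 t^{a-1}(1-t)^{b-1}\,dt$.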
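To get a feel for the orders of magnitude, the sketch below evaluates the Student T inverse survival function in the special case $\alpha = 2$, an assumption made here only because the regularized incomplete Beta function then has the closed form $I_z(1, \tfrac{1}{2}) = 1 - \sqrt{1 - z}$, so the expression reduces to $\mu + \sigma\,(1 - 2\zeta)/\sqrt{2\zeta(1-\zeta)}$ and needs nothing beyond the Python standard library. The function names are hypothetical, and the Gaussian helper is included only for comparison.

```python
import math
from statistics import NormalDist


def student_t2_inverse_survival(zeta, sigma=1.0, mu=0.0):
    """Inverse survival function of a Student T with tail exponent alpha = 2.

    Assumption: alpha = 2, for which the general expression collapses to
    mu + sigma * (1 - 2*zeta) / sqrt(2*zeta*(1 - zeta)).
    """
    return mu + sigma * (1.0 - 2.0 * zeta) / math.sqrt(2.0 * zeta * (1.0 - zeta))


def student_t2_survival(x, sigma=1.0, mu=0.0):
    """P(X > x) for the same distribution, used as a round-trip check."""
    u = (x - mu) / sigma
    return 0.5 - u / (2.0 * math.sqrt(2.0 + u * u))


def gaussian_inverse_survival(zeta, sigma=1.0, mu=0.0):
    """Gaussian counterpart, for comparison."""
    return mu + sigma * NormalDist().inv_cdf(1.0 - zeta)


for zeta in (1e-2, 1e-4, 1e-6):
    fat = student_t2_inverse_survival(zeta)
    thin = gaussian_inverse_survival(zeta)
    print(f"zeta={zeta:.0e}  Student T (alpha=2): {fat:9.2f}   Gaussian: {thin:5.2f}")
```

As $\zeta$ shrinks, the Gaussian level grows slowly while the fat-tailed level grows roughly like $1/\sqrt{2\zeta}$ under this assumed tail exponent.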


g¬ 10 000

8000 1 2

6000

4000

2000 2. µ 10-7

4. µ 10-7

6. µ 10-7

8. µ 10-7

Figure 2: Power Law, s={1,2,3,4}

As we can see in Figure 2, the explosion in the tails of noise, and noise only. Part 2 of the discussion to come soon.

z 1. µ 10-6