On 16/12/2007 the author devised a statistical data prediction algorithm. The results were updated each day and are shown below. The algorithm was applied to the page counts of the author's web pages, excluding the author's visits to his own pages.

Graph legend (the same for every graph): x=1 is the start date of the period; y=1: Day; y=2: Week; y=3: Month; y=4: Year. Green: projection correct; Red: projection incorrect.

Period starting at x=1 | Daily Prediction Rate for the period |

17/12/2007 | 21/30~70% |

17/1/2008 (half month) | 10/16~62.5% |

1/2/2008 | 16/29~55.1% |

1/3/2008 | 15/31~48.3% |

1/4/2008 | 17/30~56.6% |

1/5/2008 | 21/31~67.7% |

1/6/2008 | 19/30~63.3% |

1/7/2008 | 15/31~48.3% |

1/8/2008 | 22/31~70.9% |

1/9/2008 | 21/30~70% |

1/10/2008 | 17/31~54.8% |

1/11/2008 | 19/30~63.3% |

1/12/2008 | 8/17~47% |

The Algorithm in Pseudocode

The algorithm is simple. If L[n] is a list of data, denote by sign(n+1,n) the sign of the variance (that is, the change) L[n+1]-L[n]. So if the next datum, L[n+1], is greater than L[n], the sign is positive (+). If it is less, the sign is negative (-).

Now count the total variance over the entire length of your list (the data period), that is, the total number of ups minus the total number of downs, as in (1) of Web Site Performance and Signal.

Next calculate the mean m and the standard deviation sd of your list of data. If |L[n]-m|>sd, then the trend of your data will be sgn(L[n],m), the sign of m-L[n]. Otherwise, it will follow the variance history of the list L. In other words, if L[n] falls outside the strip m+/-sd, then its trend is the sign of m-L[n]. Otherwise (if it falls inside the strip m+/-sd), the data trend is precisely what is dictated by the sign of the total variance of your data, that is, the total number of ups minus the total number of downs.

The reason we distinguish two cases (|L[n]-m|>sd vs |L[n]-m|<sd) has to
do with the fact that random data tends to congregate around the mean m, with maximum
dispersion sd. When a datum falls outside this range, the next datum has the
tendency to return and fall within the strip m+/-sd. When the datum is *already*
within the strip, its tendency is that of the sign of the total variance.
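
The rule described above can be sketched in Python as follows. This is a minimal illustration, not the author's Maple worksheet: the function names sign, total_sign_variance and predict_trend are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which form of sd is meant.

```python
from statistics import mean, pstdev


def sign(x):
    """Return +1, -1 or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def total_sign_variance(data):
    """Total number of ups minus total number of downs in the list."""
    return sum(sign(b - a) for a, b in zip(data, data[1:]))


def predict_trend(data):
    """Projected direction of the next datum: +1 (up), -1 (down), 0 (tie)."""
    m = mean(data)
    sd = pstdev(data)  # assumption: population standard deviation
    last = data[-1]
    if abs(last - m) > sd:
        # Outside the strip m +/- sd: expect a move back towards the mean.
        return sign(m - last)
    # Inside the strip: follow the sign of the total variance.
    return sign(total_sign_variance(data))


# The data list DL quoted in this article.
DL = [1, 2, 5, 2, 4, 3, 7, 5, 4, 5, 3, 7, 6, 10, 8, 9, 5, 3, 7, 4, 6, 2, 4, 6, 8, 9]
print(total_sign_variance(DL), predict_trend(DL))
```

A result of 0 means the sketch makes no call, which happens when the last datum lies inside the strip and the ups and downs cancel exactly.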

Here are some results for random data:

Projection for data DL:=[1,2,5,2,4,3,7,5,4,5,3,7,6,10,8,9,5,3,7,4,6,2,4,6,8,9];

(The projection fails at data points where the total sign variance is 0 and the last datum is very close to the mean.)

The prediction rate is (number of data points where the actual sign (+/-) coincides with the projected trend)/(size of data).
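
One way to measure such a prediction rate is a walk-forward test: project the direction of each datum from the data before it, then compare with what actually happened. The sketch below is an assumption-laden illustration, not the author's code. The names predict_trend and backtest are hypothetical, the sliding window of 8 points is borrowed from the daily footnote at the end of this article, the population standard deviation is assumed, and the rate is divided by the number of projections made rather than by the full size of the data.

```python
from statistics import mean, pstdev


def sign(x):
    """Return +1, -1 or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def predict_trend(history):
    """Projected direction of the next datum, following the rule in the text."""
    m = mean(history)
    sd = pstdev(history)  # assumption: population standard deviation
    last = history[-1]
    if abs(last - m) > sd:
        return sign(m - last)
    ups_minus_downs = sum(sign(b - a) for a, b in zip(history, history[1:]))
    return sign(ups_minus_downs)


def backtest(data, window=8):
    """Return (hits, total) for projections made from a sliding window."""
    hits = total = 0
    for k in range(window, len(data)):
        predicted = predict_trend(data[k - window:k])
        actual = sign(data[k] - data[k - 1])
        # A projection of 0 (no call) only counts if the data did not move.
        hits += int(predicted == actual)
        total += 1
    return hits, total


DL = [1, 2, 5, 2, 4, 3, 7, 5, 4, 5, 3, 7, 6, 10, 8, 9, 5, 3, 7, 4, 6, 2, 4, 6, 8, 9]
hits, total = backtest(DL)
print(f"{hits}/{total} projections matched the actual direction")
```

Changing window, or passing a different list, gives a quick way to try the rule on other data.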

Projection for large set of random data (100 points)

Download a Maple 18 worksheet for this code to test on your data, here.

Results for the webpage Projector:

The experiment is over. The results are as follows:

Average Daily Prediction Rate^{[1]} | 221/367~60.21% |

Average Weekly Prediction Rate^{[2]} | 22/52~42.3% |

Average Monthly Prediction Rate^{[3]} | 9/12~75% |
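
The daily total in this table can be checked by adding up the per-period daily tallies reported earlier on this page. The short Python sketch below does only that arithmetic; the variable names are illustrative.

```python
# (correct projections, days) for each period, in the order reported above.
daily_tallies = [
    (21, 30), (10, 16), (16, 29), (15, 31), (17, 30), (21, 31), (19, 30),
    (15, 31), (22, 31), (21, 30), (17, 31), (19, 30), (8, 17),
]

correct = sum(c for c, _ in daily_tallies)
days = sum(d for _, d in daily_tallies)
rate = correct / days

print(f"{correct}/{days} = {100 * rate:.2f}%")
```

The sums come to 221 correct projections over 367 days, matching the table.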

The author's algorithm predicted the future trend of the author's web pages with the
total success rates shown in the table above for the dates 17/12/2007-17/12/2008. A web
page's statistics are essentially data that vary much as a stock's value does. These
web pages are the author's stock. Hence, if the author's
algorithm is applied to the Stock Market, it is expected to deliver daily, weekly and
monthly returns that can be estimated from the table above. This
table implies that an investor who plays the Stock Market using the author's
algorithm can expect, *on average*, the following *long term*
predictions:

Daily Stock Value Prediction (+/-) | ~3/5 |

Weekly Stock Value Prediction (+/-) | ~2/5 |

Monthly Stock Value Prediction (+/-) | ~3/4 |

The last table means that if an investor plays the Stock Market every day,
he/she can predict the variation of the stock's value with an *approximate*
success rate of 3/5. If an investor plays every week, the *approximate*
success rate is 2/5. If an investor plays every month, the *approximate*
success rate is 3/4.

Just how does *prediction rate* affect an investor's
decision? In the most obvious way. If the algorithm predicts that a stock moves up, the
investor sells. If the algorithm predicts that a stock drops, the investor buys or
waits.

Now, if you are a Stock Market investor, think about why you would be interested in the author's algorithm.

Hint: The *total confidence* TC of such a predictive algorithm is:

TC(pd,pw,pm,py)=(365*pd+52*pw+12*pm+1*py)/430;

The author's algorithm above, then, has a total confidence rate of:

TC(221/367,22/52,9/12,0)~0.58324

Note that since TC > 1/2, the algorithm *makes* money!
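
The total confidence figure can be reproduced directly from the formula. The Python sketch below is a plain transcription of TC using exact fractions; the function name total_confidence is illustrative.

```python
from fractions import Fraction


def total_confidence(pd, pw, pm, py):
    """Weighted average of the daily, weekly, monthly and yearly rates."""
    # The weights 365, 52, 12 and 1 add up to the divisor 430.
    return (365 * pd + 52 * pw + 12 * pm + 1 * py) / 430


tc = total_confidence(Fraction(221, 367), Fraction(22, 52), Fraction(9, 12), 0)
print(float(tc))  # approximately 0.5832
```

Because the daily rate carries a weight of 365 out of 430, TC stays close to the daily prediction rate.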

- [1] Line y=1 on the graphs, based on data of size n=8 (days/week+1).
- [2] Line y=2 on the graphs, based on data of size n=5 (weeks/month+1).
- [3] Line y=3 on the graphs, based on data of size n=13 (months/year+1).