About quantitative methods to measure the accuracy of predictions
Two metrics are mainly used to evaluate the accuracy of a prediction:
the DS (Directional Symmetry) and the RMSE (Root Mean Squared Error).
Both describe the best and the worst cases well (where the prediction
is very close to the target or very far from it), but neither is
appropriate for intermediate cases, which are by far the most common.
In the following figures the target curve, that is, the actual
evolution of the ups and downs of the market index direction, is shown
in black. The red and yellow curves are two "toy" predictions, built to
show how generally accepted metrics fail to describe the accuracy of
forecasts.
[First figure: target curve (black) with two toy predictions (red and yellow)]
The DS is simply the ratio of correct directional hits to the total
number of movements, expressed as a percentage. It is commonly accepted
that a prediction with a DS over 50% is good and one under 50% is bad,
and that the higher the ratio, the better the prediction. However, as
depicted in the first figure, if the target evolution of ups and downs
is the black curve, the red prediction looks far better to the naked
eye than the yellow one. Here (picture above) the DS fails to recognize
a good prediction: the DS for the red curve is 33.33%, although the
curve captures the main features of the target very well. It shows
erratic behavior first, then an ascending trend, the turning point, the
downturn, and erratic behavior again. The DS for the yellow curve,
which fails to describe any characteristic of the target, is 42.85%.
One could easily argue that this example was built on purpose, and it
was. Even so, whenever the DS is used to evaluate accuracy in an
"intermediate" case, it will be affected by this defect to a greater or
lesser degree.
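The DS described above can be sketched in a few lines of Python. This is only a minimal illustration: the function name, the encoding of movements as +1 (up) and -1 (down), and the sample data are assumptions made for the example, and the data are not the curves from the figure.

```python
def directional_symmetry(target_moves, predicted_moves):
    """Percentage of movements whose direction was predicted correctly.

    Both arguments are equal-length lists of +1 (up) and -1 (down).
    This encoding is an assumption made for the example.
    """
    if len(target_moves) != len(predicted_moves):
        raise ValueError("both sequences must have the same length")
    if len(target_moves) == 0:
        raise ValueError("sequences must not be empty")
    # A "hit" is a position where both sequences move in the same direction.
    hits = sum(1 for t, p in zip(target_moves, predicted_moves) if t == p)
    return 100.0 * hits / len(target_moves)


# Hypothetical data: 8 movements, 6 of them predicted correctly.
target = [1, 1, 1, -1, -1, -1, 1, -1]
prediction = [1, 1, -1, -1, -1, 1, 1, -1]
print(directional_symmetry(target, prediction))  # 75.0
```

Because the comparison is made position by position, a prediction with the right overall shape but slightly displaced in time can score poorly.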
[Second figure: target curve (black) with two toy predictions (red and yellow)]
The same occurs for the RMSE. The RMSE is the square root of the mean
of the squared differences between the target values and the
predictions, the mean being taken over the N data points. A good
prediction has an RMSE as low as possible. However, as depicted in the
second figure (above), if the target evolution of ups and downs is
again the black curve, the red prediction looks far better to the naked
eye than the yellow one. Here the RMSE fails to recognize a good
prediction: the RMSE for the red curve is 2.0, although the curve
captures the main features of the target very well. It agrees well with
the target curve (in black) in locating the turning points and trends.
The RMSE for the yellow curve, which fails to describe any
characteristic of the target, is 1.9518. From this analysis it can be
inferred that neither the DS nor the RMSE is suited to deciding which
prediction is better in intermediate cases, where the target is neither
very close to nor very far from the solution under scrutiny. In those
cases a better approach is to evaluate the Euclidean distance from the
turning points of the target to the turning points of the solution.
Unfortunately, deciding which points are turning points is subjective,
so this metric is not well defined, or at least it depends on some
parameters, whereas both the RMSE and the DS are parameter free. In the
end, a good way to evaluate predictions is to compare the patterns by
eye.
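The RMSE discussed above can be sketched as follows. This is a minimal illustration using the standard definition (square root of the mean squared difference); the function name and the sample series are assumptions made for the example and are not the curves from the figure.

```python
import math


def rmse(target, prediction):
    """Root mean squared error between two equal-length numeric sequences."""
    if len(target) != len(prediction):
        raise ValueError("both sequences must have the same length")
    n = len(target)
    if n == 0:
        raise ValueError("sequences must not be empty")
    # Sum of squared differences, averaged over N, then square-rooted.
    squared_error = sum((t - p) ** 2 for t, p in zip(target, prediction))
    return math.sqrt(squared_error / n)


# Hypothetical data: one prediction follows the shape of the target with a
# constant offset of 2; the other is a flat line that ignores the trend.
target = [1.0, 2.0, 3.0, 4.0]
same_shape = [3.0, 4.0, 5.0, 6.0]
flat = [2.5, 2.5, 2.5, 2.5]
print(rmse(target, same_shape))  # 2.0
print(rmse(target, flat))        # about 1.118
```

In this toy case the flat line gets the lower RMSE even though it carries no information about the trend, which is the kind of failure described above.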
When comparing graphs of forecasts in order to express their accuracy as a number, the main purpose of the forecast itself must be kept in mind. The predictions shown on this site try to forecast the direction of the market, with special attention to locating the turning points. In this sense we do not try to replicate the exact pattern of daily ups and downs, but the general tendency: upward, downward, or with no defined direction.
If, for example, the market shows no preferred direction, with a pattern such as up,down,up,down,up,down,up,down,up,down,... and the prediction is down,up,down,up,down,up,down,up,down,up,..., the prediction has a DS equal to zero, yet our specific task is fully accomplished: the predictor captured the general behavior of the market, which was very well described. This is why the exact order of the predicted movements does not matter, and to compare relative accuracy on market direction after 20 trading days we decided to use the DS regardless of order.
That is, we compute a number useful for comparing different predictions, given by:
{Min(ForecastedUps, ActualUps) + Min(ForecastedDowns, ActualDowns)} / 20,
expressed as a percentage. This measure is the same as the DS but disregards the order in which the ups and downs appear, and it indicates how close the prediction is to the market after 20 days.
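The order-free measure above can be sketched in Python as follows. The function name and the +1/-1 encoding of movements are assumptions made for the example, and every day is assumed to be either an up or a down. The divisor is the number of movements, which is 20 in the formula above.

```python
def order_free_ds(forecast_moves, actual_moves):
    """DS that ignores the order in which ups and downs appear.

    Both arguments are equal-length lists of +1 (up) and -1 (down).
    """
    if len(forecast_moves) != len(actual_moves):
        raise ValueError("both sequences must have the same length")
    n = len(actual_moves)
    if n == 0:
        raise ValueError("sequences must not be empty")
    forecast_ups = sum(1 for m in forecast_moves if m > 0)
    actual_ups = sum(1 for m in actual_moves if m > 0)
    forecast_downs = n - forecast_ups
    actual_downs = n - actual_ups
    # Only the counts of ups and downs are matched, not their positions.
    matched = min(forecast_ups, actual_ups) + min(forecast_downs, actual_downs)
    return 100.0 * matched / n


# The alternating example from the text, over 20 trading days:
actual = [1, -1] * 10      # up, down, up, down, ...
forecast = [-1, 1] * 10    # down, up, down, up, ...
print(order_free_ds(forecast, actual))  # 100.0, while the plain DS is 0

# Hypothetical forecast with 15 ups and 5 downs against 10 ups and 10 downs:
print(order_free_ds([1] * 15 + [-1] * 5, actual))  # 75.0
```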
All this analysis is also carried out for the intraday forecasts.