About quantitative methods to measure the accuracy of predictions.

Two metrics are commonly used to evaluate the accuracy of a prediction: the DS (Directional Symmetry) and the RMSE (Root Mean Squared Error). Both describe the best and the worst cases well (when the prediction is very close to the target, or very far from it), but neither is appropriate for intermediate cases, which are by far the most common.

In the figures below, the black curve is the target: the actual sequence of ups and downs of the market index. The red and yellow curves are two "toy" predictions, built to show how generally accepted metrics fail to describe the accuracy of forecasts.

Directional Symmetry

     
     

 

 
The DS is the number of correctly predicted movements divided by the total number of movements, expressed as a percentage. It is commonly accepted that a prediction with a DS over 50% is good and one under 50% is bad, and that the higher the ratio, the better the prediction. However, if the target evolution of ups and downs is the black curve in the first figure (above), the red prediction looks, to the naked eye, far better than the yellow one. Here the DS fails to recognize a good prediction: the DS for the red curve is 33.33%, although it captures the main features of the target very well. It shows erratic behavior first, then an ascending trend, the turning point, the downturn, and erratic behavior again. The DS for the yellow curve, which fails to describe any characteristic of the target, is 42.85%. One could easily argue that this example was built on purpose, and it was. Even so, whenever the DS is used to evaluate accuracy in an "intermediate" case, it will suffer from this defect to a greater or lesser degree.
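
The DS computation described above can be sketched in a few lines. This is only an illustration: it assumes each series is given as a list of daily moves coded +1 (up) and -1 (down), and the function name `directional_symmetry` and the sample data are hypothetical, not the data behind the figures.

```python
def directional_symmetry(actual_moves, predicted_moves):
    """Percentage of movements whose predicted direction matches the actual one."""
    if len(actual_moves) != len(predicted_moves):
        raise ValueError("both series must have the same length")
    if not actual_moves:
        raise ValueError("series must not be empty")
    # A "hit" is a day on which prediction and market moved the same way.
    hits = sum(1 for a, p in zip(actual_moves, predicted_moves) if a == p)
    return 100.0 * hits / len(actual_moves)


# Hypothetical data: +1 = up, -1 = down.
actual = [1, -1, 1, 1, 1, -1, -1]
predicted = [-1, 1, 1, 1, -1, -1, 1]
ds = directional_symmetry(actual, predicted)  # 3 hits out of 7 movements
```

Because only day-by-day agreement is counted, a prediction that follows the right trend one day late can score lower than one that matches a few isolated days by chance.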

Root Mean Square Error

     
     
The same occurs for the RMSE. The RMSE is the square root of the mean of the squared differences between the target values and the predictions, taken over the N data points. A good prediction has an RMSE as low as possible. However, if the target evolution of ups and downs is again the black curve in the second figure (above), the red prediction looks, to the naked eye, far better than the yellow one. Here the RMSE fails to recognize a good prediction: the RMSE for the red curve is 2.0, although it captures the main features of the target very well. It agrees well with the target curve in locating the turning points and the trends. The RMSE for the yellow curve, which fails to describe any characteristic of the target, is 1.9518.
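
As a minimal sketch of the standard RMSE computation, assuming both series are plain lists of numbers of equal length (the function name `rmse` and the sample values are hypothetical, not the data behind the figure):

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long series."""
    if len(actual) != len(predicted):
        raise ValueError("both series must have the same length")
    if not actual:
        raise ValueError("series must not be empty")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical data: the prediction has exactly the shape of the target,
# but is shifted upward by a constant 2.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [3.0, 4.0, 5.0, 6.0]
value = rmse(actual, predicted)
```

In this toy case the prediction reproduces every trend of the target, yet its RMSE equals the size of the offset, so a flatter curve that stays nearer the target on average could score better.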

From this analysis it can be inferred that neither the DS nor the RMSE is a good way to decide which prediction is better in intermediate cases, where the prediction under scrutiny is neither very close to the target nor very far from it. In those cases a better approach is to evaluate the Euclidean distance from the turning points of the target to the turning points of the prediction. Unfortunately, deciding which points are turning points is subjective, so this metric is not very well defined, or at least it depends on some parameters, while both the RMSE and the DS are parameter free. In the end, a good way to evaluate predictions is to compare the patterns by eye.

When comparing graphs of forecasts in order to express their accuracy as a number, the main purpose of the forecast itself must be kept in mind. The predictions shown on this site try to forecast the direction of the market, with special attention to locating the turning points. In this sense we do not try to replicate the exact pattern of daily ups and downs, but the general tendency: upward, downward, or with no defined direction. If, for example, the market does not show a preferred direction, with a pattern such as up,down,up,down,up,down,up,down,up,down,... and the prediction is down,up,down,up,down,up,down,up,down,up,..., the prediction has a DS equal to zero, but our specific task is fully accomplished: the predictor showed the general behavior of the market, which was very well described.

This is why the exact order of the ups and downs in the prediction does not matter, and why, to compare relative accuracy on market direction after 20 trading days, we decided to use the DS regardless of the order. That means that we compute a number useful for comparing different predictions, given by: {Min(ForecastedUps,ActualUps)+Min(ForecastedDowns,ActualDowns)} / 20, expressed as a percentage. This measure is the same as the DS, disregarding the order of appearance of the ups and downs, and represents how far the prediction is from the market after 20 days. The same analysis is also carried out for the intraday forecasts.
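
The order-free measure given by the formula above can be sketched as follows. The sketch assumes moves are coded +1 (up) and -1 (down) and divides by the length of the window rather than a fixed 20; the function name `order_free_ds` is hypothetical.

```python
def order_free_ds(actual_moves, predicted_moves):
    """DS that ignores the order in which ups and downs appear, in percent."""
    n = len(actual_moves)
    if n == 0 or n != len(predicted_moves):
        raise ValueError("series must be non-empty and of equal length")
    actual_ups = sum(1 for m in actual_moves if m > 0)
    actual_downs = sum(1 for m in actual_moves if m < 0)
    forecast_ups = sum(1 for m in predicted_moves if m > 0)
    forecast_downs = sum(1 for m in predicted_moves if m < 0)
    matched = min(forecast_ups, actual_ups) + min(forecast_downs, actual_downs)
    return 100.0 * matched / n


# The alternating example from the text, over 20 trading days.
actual = [1, -1] * 10      # up, down, up, down, ...
predicted = [-1, 1] * 10   # down, up, down, up, ...
score = order_free_ds(actual, predicted)

# Day-by-day hits, as the ordinary DS would count them.
ordered_hits = sum(1 for a, p in zip(actual, predicted) if a == p)
```

For the alternating example there are no day-by-day hits, so the ordinary DS is zero, while the order-free measure counts 10 matched ups and 10 matched downs out of 20.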

 
