An attempt to rank systems

JulesEllis · March 4, 2006, 1:24am

An attempt to rank systems on return, risk and slippage-sensitivity

In several places in this forum it has been suggested that one should look not only at the annualized return of a system, but also at its risk and slippage. But how should one do that? Here I will discuss some ideas for doing so on the basis of the statistics that C2 provides. There are some limitations that I must point out first:

1. I do this in my spare time, so there are no guarantees with respect to validity and accuracy.

2. I’m interested in stocks only. You can do the same things for other systems yourself.

3. I worked only one evening on this. Feel free to suggest improvements or make them yourself.

The problem with choosing systems is that they vary on multiple dimensions, and that it is generally impossible to rank multidimensional data objectively. A system that is the best on one dimension (e.g. profit) is often the worst on another dimension (e.g. risk). You can of course assign weights to these dimensions and compute some kind of composite, but someone else may prefer other weights.

So I borrowed some ideas from Data Envelopment Analysis (DEA). This kind of analysis is often used to compare companies of the same kind on efficiency. (However, what I present below is not a full DEA.) Applied to the present situation, the idea is that there should be many rank orders, depending on the weight that one assigns to the various dimensions.

For example, suppose that one has two statistics for each system, say profit and safety (I’ll use this word as the opposite of risk; suggest a better word if you want). These two dimensions tend to be negatively correlated: a system that is high on profit will usually be low on safety. However, there are also cases where one system dominates the other completely, i.e. system A has higher profits AND higher safety than system B. Obviously, all other things being equal, one would then prefer system A.

The method is now as follows: Make a scatter plot of the two statistics, where profit is on the vertical axis, safety on the horizontal axis, and each system is represented by a point in the plot. Ideally, a system should lie in the upper right part of the figure, but as a result of the negative relation this area will be empty. Now find all points that are not completely dominated by any other point. These points can be connected by a (curved) decreasing line such that all other points lie below and to the left of this line.

All systems that are not on this line are completely dominated (i.e. outperformed on both dimensions) by some system on the line. Therefore, you can confine the choice to the systems that lie on the line (or are very close to it, since the statistics of course have some unreliability). The systems on the line are all best on some weighted composite of the dimensions, and do not completely dominate each other. That is, if systems A and B are both on the line, then A must have more profit but less safety than system B, or conversely. So the further choice between these systems is a matter of taste and of other features that one wishes to consider.
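As a rough illustration of finding the "best line", here is a minimal Python sketch. The `System` type, the function names and the five data points are made up for the example and are not C2 statistics. It also assumes the usual convention that a system is dominated when another one is at least as good on both dimensions and strictly better on one, which is slightly looser than "outperformed on both".

```python
from typing import NamedTuple


class System(NamedTuple):
    name: str
    safety: float  # higher is better (horizontal axis)
    profit: float  # higher is better (vertical axis)


def dominates(a: System, b: System) -> bool:
    """True if a is at least as good as b on both dimensions and better on one."""
    return (a.safety >= b.safety and a.profit >= b.profit
            and (a.safety > b.safety or a.profit > b.profit))


def best_line(systems: list[System]) -> list[System]:
    """Return the systems not dominated by any other, sorted by increasing safety."""
    frontier = [s for s in systems
                if not any(dominates(other, s) for other in systems)]
    return sorted(frontier, key=lambda s: s.safety)


# Made-up illustrative points (hypothetical, not taken from C2).
systems = [
    System("A", safety=3.0, profit=50.0),
    System("B", safety=2.0, profit=120.0),
    System("C", safety=1.5, profit=100.0),  # dominated by B
    System("D", safety=0.8, profit=400.0),
    System("E", safety=0.5, profit=90.0),   # dominated by B and by D
]

for s in best_line(systems):
    print(s.name, s.safety, s.profit)
```

Reading the result from left to right, profit falls as safety rises, which is the decreasing line described above. Systems that are merely close to the line would need an extra tolerance, which this sketch does not attempt.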

I have done this for stock systems on C2, with the restriction that they have made more than 50 trades and have a positive return. I considered the following dimensions. (Better choices are possible here, but I used these because they are easily computed from the C2 overview pages.)

Annualized return: I use this as a measure of profit. Elsewhere I’ve argued that this statistic must have a bias against young systems, but nevertheless I use it here, since any correction that I create would probably not be trusted by many of you.

Sharpe: I use this as a measure of safety (although it involves profit too, but I’m too lazy to copy the Drawdown of 149 systems one by one).

%Win: I use this as a measure of safety too, based on the idea that a drawdown due to a long series of losses is unlikely when the win probability is high.

Return/trades: I use this as a crude measure of slippage insensitivity. The idea is that 1 trade of $1000 is better than 10 trades of $100 in terms of slippage and commission. A better estimate would also involve volatility, liquidity and profit per share, but these are not available to me.

Now I made two plots: One for Annualized return versus Sharpe, and the other for %Win versus Return/trades. Obviously, there are other plots possible too.

Both plots exhibit the pattern that the upper right area is empty. I identified the best line visually from these plots. The systems on it are:

Annualized return versus Sharpe:

System                        %Win    Sharpe  Annualized return  Return/trade
TA Swing Trader               65.00%  3.621   65.00%             0.08%
Compounded Money              83.00%  1.865   97.31%             0.09%
Upbeat Trading System Stocks  78.00%  1.639   300.00%            1.06%
CT Global Hedge Fund          59.00%  0.844   788.14%            4.78%

Also close to the best line (in my subjective judgment) are: Tango and MBN-1.

%Win versus Return/trades:

System                        %Win    Sharpe  Annualized return  Return/trade
CT Global Hedge Fund          59.00%  0.844   788.14%            4.78%
Dave’s Goofiz ATM             80.00%  0.799   633.06%            1.53%
Trade Fury                    82.00%  0.197   43.31%             1.06%
Tango                         85.00%  1.513   105.93%            0.19%
The Compounding Daytrader     91.00%  -0.3    21.84%             0.02%

Also close to the best line (in my subjective judgment) are: Upbeat Trading System Stocks, Work And Trade, and Compounded Money.

Please note that neither table is a ranking; rather, each lists the number-one systems according to different rankings. None of the systems within a table dominates any other system in that table on both dimensions. Thus, in the first table the Annualized return increases from top to bottom, but at the same time the Sharpe decreases. You cannot have both in the same system (but you can miss both in the same system; such systems are not listed here). Similarly, in the second table the %Win increases while the Return/trade decreases.
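The claim that no system in a table dominates another can be checked mechanically. The sketch below copies the relevant two columns from each table. The variable and function names are my own, and "dominates" is taken here to mean strictly better on both dimensions.

```python
# Rows copied from the first table: (system, Sharpe, annualized return in %).
return_vs_sharpe = [
    ("TA Swing Trader", 3.621, 65.00),
    ("Compounded Money", 1.865, 97.31),
    ("Upbeat Trading System Stocks", 1.639, 300.00),
    ("CT Global Hedge Fund", 0.844, 788.14),
]

# Rows copied from the second table: (system, %Win, return/trade in %).
win_vs_return_per_trade = [
    ("CT Global Hedge Fund", 59.00, 4.78),
    ("Dave's Goofiz ATM", 80.00, 1.53),
    ("Trade Fury", 82.00, 1.06),
    ("Tango", 85.00, 0.19),
    ("The Compounding Daytrader", 91.00, 0.02),
]


def dominated_pairs(rows):
    """List (loser, winner) pairs where the winner is better on both dimensions."""
    return [(b[0], a[0]) for a in rows for b in rows
            if a[1] > b[1] and a[2] > b[2]]


# Both lists come out empty: within each table, no system beats another on both.
print(dominated_pairs(return_vs_sharpe))
print(dominated_pairs(win_vs_return_per_trade))
```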

There is one system that lies on the best line in both plots: CT Global Hedge Fund. Some other systems lie on the best line of one plot and close to the best line of the other plot: Compounded Money, Upbeat Trading System Stocks, and Tango. Such systems are high on at least two dimensions, e.g. CT Global Hedge Fund is good at Annualized return and Return/trade. However, that a system is on both lists doesn’t mean that it is necessarily good for you. If Sharpe is important to you, then TA Swing Trader may still be a better choice.

If you look at the wild equity curves of Trade Fury (max Drawdown 25.1%) and CT Global Hedge Fund (42.65%), it will be clear that much more can be said about the matter. In particular, it would be interesting to redo this analysis with Drawdown instead of Sharpe and %Win, and I think that better measures of slippage sensitivity are possible. Everyone can do that for himself with some effort. It may be interesting to see where your system lies in the plots!

Jules

Pete · March 4, 2006, 9:36am

Great work Jules, thank you for sharing your work.

PalAnand · March 4, 2006, 11:44am

"%Win: I use this as a measure of safety too, based on the idea that a drawdown due to a long series of losses is unlikely when the win probability is high."

A valiant attempt in general. But I must point out that the above is a fallacy: the % wins, whatever the number, is irrelevant, because it is the size of the average win versus the average loss that matters, i.e., the profit factor (win/loss ratio) or the expectancy, which is highly correlated with it.

This fallacy arises from confusing the frequency of wins with their magnitude, and it could also be the reason behind the prejudice against trading options, which I once had. 90% of options expire worthless, i.e., 10% winners and 90% losers, yet many have made fortunes trading options.

ps: Profit factor (W:L ratio) expresses the worth of a system as a ratio, and Expectancy expresses it in dollars. Expectancy Score is a much better measure for ranking systems. That is what I use to price the systems I develop.
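To make the frequency-versus-magnitude point concrete, here is a small Python sketch. It assumes the common textbook definitions (expectancy = win rate × average win − loss rate × average loss; profit factor = gross profit / gross loss), which may differ in detail from what the poster or C2 uses. The two systems and all their numbers are hypothetical.

```python
def expectancy(win_rate: float, avg_win: float, avg_loss: float) -> float:
    """Expected profit per trade in dollars; avg_loss is given as a positive amount."""
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss


def profit_factor(win_rate: float, avg_win: float, avg_loss: float) -> float:
    """Gross profit divided by gross loss over a long run of trades."""
    return (win_rate * avg_win) / ((1.0 - win_rate) * avg_loss)


# Hypothetical systems: one wins often but small, the other rarely but big.
high_win = dict(win_rate=0.90, avg_win=100.0, avg_loss=1000.0)
low_win = dict(win_rate=0.10, avg_win=2000.0, avg_loss=100.0)

for label, params in (("90% winners", high_win), ("10% winners", low_win)):
    print(f"{label}: expectancy ${expectancy(**params):.2f} per trade, "
          f"profit factor {profit_factor(**params):.2f}")
```

With these made-up numbers the 90%-winner system loses about $10 per trade on average, while the 10%-winner system makes about $110, so %Win alone says little about profitability. Whether it still says something about the length of losing streaks is a separate question.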

JulesEllis · March 4, 2006, 2:19pm

That seems a good point to me. I’ll await other suggestions and then come back with some improvements.

Jules

Peter · March 6, 2006, 3:30pm

Great effort Jules, and I love the low dd, but some were still pretty untradable due to very low return after commissions - notably the stock systems which relied on 2-3 cents per trade…

Peter · March 6, 2006, 5:48pm

Another obvious method would be to rank the systems based on subscriber reviews over, say, the last month. Highly star-rated systems should be easily accessible…

One system I like at the moment is the Black dog - nice market entries, few trades, very liquid market, no scalping or adding to losing positions, and decent returns - just a shame about the overnight holds. No system appears perfect, unfortunately…

JulesEllis · March 6, 2006, 6:30pm

It may be better to use the profit/unit as a measure of slippage insensitivity - thus punishing systems that rely on a few cents per trade. I didn’t use this because these numbers are not given on the overview pages and I didn’t want to copy them one by one for 149 systems. (But for a final analysis this might be worthwhile)

My impression is that, in general, systems with high return/trade do not rely on a few cents. A high percentage return per trade with a few cents difference is only possible if the price per share is low, e.g. increasing from $1.00 to $1.04. Most systems do not use such low priced shares, but some do. However, I’m not sure that slippage is such a big issue in these cases. I can imagine that with these low prices, the price is much less volatile if you express it in cents. E.g. with a price around $80 you can expect changes of 3 cents or more within a few seconds, but will this happen as easily with prices around $1? I’ve little experience with this, but my impression is that the answer is ‘no’.

So I might replace ‘Annualized return / trades’ by ‘profit per unit’ with some effort, but I’m not sure that it is wise to do so.

You say that some systems are still pretty untradable. Well, we are talking about rankings here, so this is always relative to other systems. That a system is a ‘number 1’ in a certain ranking doesn’t mean that it is ‘good’, it may just mean that all similar systems are ‘worse’. So I cannot change that within this concept.

It may be desirable to change the concept such that it is not a (relative) ranking but rather some kind of (absolute) tradability measure. However, given the discussions about the Realism factor, and my limited insight into this so far, I don’t see this happening soon.

HansHansen · March 6, 2006, 7:12pm

I think you’ll find “profit/unit” to be a bit unreliable, as this figure has a different meaning in C2 depending upon whether you’re doing stocks, forex, or futures. And I personally think there are some inaccuracies in some systems’ stats for this metric.

RandyMay · March 6, 2006, 7:23pm

Jules … I didn’t read every detail of your original post, but what jumped out at me was that CT Global Hedge Fund could pop up as anything but bad under any criteria for rating a system. That system trades huge numbers of shares of penny stocks (sometimes more than a million !!) and may look good with C2’s bucket shop fills, but could not be traded in a real account even starting with $10K. I’m sure this is a major cause of the extremely low RF, but it must be one of the most untradeable systems on C2.

JulesEllis · March 6, 2006, 7:44pm

Randy,

I am not so sure - but this may be a lack of knowledge on my side. Why would it be so untradable? If the price is so low, why would there not be huge volumes of it? The low RF as I understand it means that there weren’t many sold at that time, but it doesn’t necessarily mean that they were unavailable.

However, this would then become a discussion about the validity of the RF, which I try to avoid. I suppose your problem would be solved if I replaced the return/trade by the RF?

Hans,

The difference in meaning could be solved by doing the analysis separately for stocks, options and forex systems. This seems wise anyway. There is then a problem with some systems that do both stocks and options (this problem is present in the above analysis too).

I don’t know of the inaccuracies you mention. Obviously, if they exist, I cannot solve them.

Jules

RandyMay · March 6, 2006, 9:25pm

These penny stocks are so thinly traded that moving any kind of volume (in terms of dollars) would run the price all over the place. For example, DHPI only trades an average of about 116,000 shares a day. At a price of 7.5 cents/share that is only $8700 a day total trading volume. If you tried to place an order for 1,150,000 shares (one of the trades in the system) you’d never get it filled at anywhere close to a reasonable price, if at all. Even 1/10 that number of shares would equal the total average volume in a day and run the price up considerably beyond what the C2 fills show.

DHPI is an extreme example from the system table, but these penny stocks are not liquid enough to trade in any kind of volume without creating a huge spread that would negate everything in the C2 tables completely (hence the low RF). You may be able to accumulate a position over many trading days, but that would most likely make the C2 comparisons even worse.
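The DHPI arithmetic above can be reproduced in a few lines of Python. All three inputs are the figures quoted in the post; the variable names are my own.

```python
avg_daily_shares = 116_000  # average daily volume quoted in the post
price_per_share = 0.075     # 7.5 cents per share
order_shares = 1_150_000    # one of the system's trades, per the post

daily_dollar_volume = avg_daily_shares * price_per_share
days_of_volume = order_shares / avg_daily_shares

print(f"Average daily dollar volume: ${daily_dollar_volume:,.0f}")
print(f"Order size in days of average volume: {days_of_volume:.1f}")
print(f"One tenth of the order: {order_shares // 10:,} shares")
```

The order is close to ten full days of average volume, and one tenth of it (115,000 shares) is roughly a whole average day, which matches the numbers given above.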

HansHansen · March 7, 2006, 12:50am

Jules: (quote)

"Hans,

The difference in meaning could be solved by doing the analysis separately for stocks, options and forex systems. This seems wise anyway. There is a problem with some systems that do both stocks and options then (this problem is present in the above analysis too)."

I’m just not sure about these figures. My system has, in VERY round numbers, $40K profit to date, and around 100 futures contracts traded in total (again a very rough number). That should be $400 per contract. Instead it shows something like $88. So I don’t know where this number comes from. Be careful here, as you could misrepresent something, either for better or worse!

Stock systems probably should be graded more upon % return per trade rather than $/unit, as a trade involving a $100 stock will be represented differently than a $1 stock.

It’s a real complicated metric, as there are so many variables, and trying to convert apples to oranges for comparison is difficult, if not impossible.

Hans.

JulesEllis · March 7, 2006, 2:55pm

??? I’m quite sure that I replied but this post has disappeared. So again:

Randy,

Thanks for explaining!

Hans,

Perhaps you’re right, but there is no way that I can change the C2 statistics. So I think the only workable solution is that I use the C2 statistics as they are. If someone disagrees with these statistics, he should talk to MK. You suggest %return/trade, but see the point that Randy raised. Perhaps the Realism Factor would work for both of you?

Jules

HansHansen · March 9, 2006, 3:38pm

A further example of why you/we need to be careful what statistics we use: On a system refresh that just happened a few minutes ago, my $prof/trade shows a $(-3). Yup, negative, even tho both my closed and open trades are, in aggregate, positive.

JulesEllis · March 9, 2006, 6:12pm

Hans,

I see your point. I did the same calculations for the closed trades on the first page of your system (as far as they are available for nonsubscribers). There the total profit is $4554 but the average return per unit is -74.70. So if C2 reports similar outcomes for all your trades together, this doesn’t mean that the C2 statistics are wrong.

It rather means that our intuition about this statistic is wrong. In the case of your system I think that I can explain what happens. Most of your trades on the first page are with 1 or 2 contracts. Most trades with 1 contract have a loss, while most trades with 2 contracts have a profit. The total profit of the 2-contract trades is larger than the total loss of the 1-contract trades, thus yielding an aggregate profit. When the average profit per unit is computed, the profit of each 2-contract trade is divided by 2. The sum of these divided numbers is smaller than the total loss of the 1-contract trades, so the unweighted mean of the profits/unit will be negative.

I agree with you that this is an unexpected outcome, and that these phenomena make it hard to see how profit/unit can be used as a measure of slippage-insensitivity.

HansHansen · March 9, 2006, 6:47pm

Hmmm… I see what you are doing, but don’t understand why. I would think that the metric would be computed by taking the total profit overall divided by the total number of contracts traded. Without digging up the real numbers, this would be somewhere in the area of $40K / 100 = $400. I can’t see why it would be done differently, but oh well!

JulesEllis · March 9, 2006, 10:14pm

Good question. I merely tried to describe the way C2 is (presumably) computing this statistic. But I admit, I would probably do it the same as C2. My (perhaps too simple) thought behind it is: To get an idea of how slippage-sensitive a single trade is, the profits of that trade must be divided by the number of units in that trade. Then, to obtain an overall measure, I would average that across trades.

‘Your’ method would be to take a weighted average across trades, where the weights are the number of units. I agree that this makes sense too, if not more.

In the present method, a trade with one share and 1 cent profit would have the same weight as a trade of 10,000 shares with 1 dollar profit per share. This would yield an average profit/unit of $0.505. Splitting the last trade into two trades of 5,000 would suddenly increase the profit/unit to $0.67, although these three trades are essentially equivalent to the first two trades.
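A quick sketch of the two calculations, using the example just given. The function names are hypothetical, and the unweighted version is only my reading of how the statistic is computed. Each trade is written as (total dollar profit, number of units).

```python
def mean_profit_per_unit(trades):
    """Unweighted: average of each trade's own profit per unit."""
    return sum(profit / units for profit, units in trades) / len(trades)


def weighted_profit_per_unit(trades):
    """Weighted: total profit divided by total units traded."""
    total_profit = sum(profit for profit, _ in trades)
    total_units = sum(units for _, units in trades)
    return total_profit / total_units


# 1 share with 1 cent profit, and 10,000 shares with $1 profit per share.
two_trades = [(0.01, 1), (10_000.0, 10_000)]
# The same exposure, with the large trade split into two trades of 5,000.
three_trades = [(0.01, 1), (5_000.0, 5_000), (5_000.0, 5_000)]

for label, trades in (("two trades", two_trades), ("three trades", three_trades)):
    print(f"{label}: unweighted {mean_profit_per_unit(trades):.3f}, "
          f"weighted {weighted_profit_per_unit(trades):.4f}")
```

The unweighted figure moves from about $0.505 to about $0.67 when the big trade is split, while the weighted figure stays just under $1.00 in both cases.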

Yes, I think you’re right and that your method is better. But I still cannot compute it from the information on the system pages. So you’ll have to ask MK…

Jules

MatthewKlein · March 9, 2006, 10:14pm

Actually, the statistic looks at each trade individually: it divides the dollar-profit of each trade by the number of units in the trade. Then, we take these profit-per-unit numbers and average them.

This number isn’t terribly meaningful for systems that trade different kinds of instruments, or even for systems that trade lots of different kinds of futures contracts (with various point values per contract), but for systems that trade just one thing (an @ES Emini trading system, or a stocks-only trading system), this statistic has value.

As an example: If you see that an E-Mini S&P trading system makes, on average, $5 per trade, per contract, you know that – ultimately, with any kind of commissions and slippage – you’ll break even at best. Similarly, a stock system that makes $0.05 a trade won’t be terribly profitable in real life – not with any kind of spread and a discount broker’s commission factored in.

MK

JulesEllis · March 9, 2006, 10:21pm

Matthew,

It seems we were writing at the same time…

Your interpretation and usage of profit/unit would still be valid with the formula that Hans suggests, don’t you think? So after reading your post I still think that the weighted average would be better.

Jules

FanusS · March 9, 2006, 10:45pm

I think the current way is better than calculating a weighted average. Say there is a system with three trades as follows:

1 contract for a loss of 100.

4 contracts for a profit of 400

1 contract for a loss of 100.

The weighted average would calculate the PL per Unit as:

(-100 + 400 - 100)/6 = 33.33

The current method would calculate it as:

(-100/1 + 400/4 -100/1)/3

= (-100 + 100 - 100)/3 = -33.33

Someone who can only afford to trade 1 contract at a time might look at the P/L per contract to see what kind of profit he can expect. Based on the weighted average, he might assume he can expect +33.33 per contract. But when he actually starts to trade the system with a single contract per trade, and the sequence above repeats, all of a sudden his P/L per unit is -33.33 while C2 shows +33.33, and he will be rightfully unhappy.

I think the current calculation gives an accurate account of what you can really expect if you trade one contract or share at a time, or if you want to project your expected profits when you trade more than 1 contract at a time.
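The single-contract scenario can be sketched by rescaling every trade to a fixed size. This assumes that profit scales linearly with the number of contracts, which ignores commissions and slippage, and the function name is made up for the example.

```python
def pnl_at_fixed_size(trades, size=1):
    """Per-trade P/L if each trade had been taken with `size` units instead."""
    return [profit / units * size for profit, units in trades]


# The three trades from the example: (total profit, contracts).
trades = [(-100.0, 1), (400.0, 4), (-100.0, 1)]

one_lot = pnl_at_fixed_size(trades)
print("P/L per trade with 1 contract:", one_lot)
print("Total with 1 contract per trade:", sum(one_lot))
print("Total following the system's own sizes:", sum(p for p, _ in trades))
```

The one-contract follower ends up with -100 over the three trades, while someone who copies the system's own position sizes ends up with +200. So the unweighted figure describes fixed-size trading and the weighted figure describes following the sizing as given.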

Regards

- Fanus
