Evaluating a Trading System

What before how: The goal of evaluating a trading system is to forecast its future performance.

With that in mind, here are three simple criteria for quickly evaluating a trading system. A system which passes all of these criteria is more likely to be successful over the long term than one that does not:

1. Review the backtest history and make sure it spans several market regimes and was consistently profitable across them all. With the amount of market data available today, there is no excuse for not having this information.

2. Calculate the t score of the system to see how statistically significant the results are. Generally, try to get 100 observations and a t score greater than 1.6. Higher is better. Over 3 is very good. Be wary of results less than 2 and walk away from anything less than 1.6. Make sure a reasonable slippage and commission have been removed, especially from high frequency systems.

t = sqrt(number of trades) * avg trade / std deviation of trades

3. Calculate the optimal f$ value. Optimal f$ is important because it tells you the maximum leverage (in dollars of account equity per contract) at which the system can be traded without greatly increasing the risk of ruin.

f = (((1 + win loss ratio) * prob winning) - 1) / win loss ratio

f$ = largest losing trade / f

[Note: Max drawdown can be substituted for largest losing trade, as every trader's ability to withstand drawdowns is suspect.]
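Criteria 2 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample trade list are hypothetical, the trades are taken to be net per-trade P&L after slippage and commission, and the win loss ratio is taken as average win divided by average loss. The f formula is used exactly as written above (it is the Kelly closed form, which treats all wins and all losses as equal in size), and f$ comes out in dollars of equity per contract.

```python
import math
import statistics


def t_score(trades):
    """t = sqrt(number of trades) * avg trade / std deviation of trades."""
    n = len(trades)
    if n < 2:
        raise ValueError("need at least two trades")
    spread = statistics.stdev(trades)  # sample standard deviation
    if spread == 0:
        raise ValueError("trades have zero variance")
    return math.sqrt(n) * statistics.mean(trades) / spread


def optimal_f(win_loss_ratio, prob_winning):
    """f = (((1 + win loss ratio) * prob winning) - 1) / win loss ratio."""
    return ((1 + win_loss_ratio) * prob_winning - 1) / win_loss_ratio


def f_dollars(largest_losing_trade, f):
    """f$ = largest losing trade / f, in dollars of equity per contract."""
    if f <= 0:
        raise ValueError("f must be positive; the system has no edge")
    return abs(largest_losing_trade) / f


# Hypothetical net P&L per trade (illustration only, not real results).
trades = [120.0, -80.0, 45.0, 200.0, -60.0, 90.0, -30.0, 150.0]

wins = [t for t in trades if t > 0]
losses = [-t for t in trades if t < 0]
ratio = statistics.mean(wins) / statistics.mean(losses)  # assumed definition
p_win = len(wins) / len(trades)

f = optimal_f(ratio, p_win)
print(f"t = {t_score(trades):.2f}")
print(f"f = {f:.3f}, f$ = {f_dollars(min(trades), f):.0f} dollars per contract")
```

Eight trades is far short of the 100 observations suggested above, so the numbers printed here mean nothing by themselves. To apply the note, pass the max drawdown to `f_dollars` instead of the largest losing trade.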

Once a system has passed the above tests, control charts and Monte Carlo simulation can be used to investigate it further.
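One common way to do the Monte Carlo step is to resample the trade list with replacement and look at the spread of max drawdowns across the resampled sequences. The sketch below assumes that approach; the original post does not say which method is meant. The function names and trade list are hypothetical, and resampling treats trades as independent of each other, which real trades may not be.

```python
import random


def max_drawdown(pnl_sequence):
    """Largest peak-to-trough drop of the cumulative P&L curve."""
    equity = peak = worst = 0.0
    for pnl in pnl_sequence:
        equity += pnl
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def simulate_drawdowns(trades, runs=1000, seed=0):
    """Resample trades with replacement; return sorted max drawdowns."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    results = []
    for _ in range(runs):
        resampled = rng.choices(trades, k=len(trades))
        results.append(max_drawdown(resampled))
    return sorted(results)


# Hypothetical net P&L per trade (illustration only, not real results).
trades = [120.0, -80.0, 45.0, 200.0, -60.0, 90.0, -30.0, 150.0]

drawdowns = simulate_drawdowns(trades)
print("historical max drawdown:", max_drawdown(trades))
print("median simulated:", drawdowns[len(drawdowns) // 2])
print("95th percentile simulated:", drawdowns[int(0.95 * len(drawdowns))])
```

If the upper percentiles sit well above the historical max drawdown, the history was probably a kind ordering of the trades.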


A System Forecasting Tool is available here:


Good points, with a couple of additions:

1. The only backtest that is worth anything is a walk forward backtest. Most backtests I’ve seen presented here or anywhere are optimized backtests, generated with the benefit of hindsight. Those are pretty much useless. If it looks too good to be true, it probably is.

2. If you try trading a system with optimal f, you are almost guaranteed to lose eventually, since your largest drawdown is almost always yet to come. Money management should be about trading so as not to lose everything, without worrying about potential gains. This implies a much lower leverage than optimal f.
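For readers unfamiliar with the term in point 1, a walk forward backtest optimizes on one window of data and then measures performance only on the window that follows, which the optimizer never saw. The sketch below shows just the window bookkeeping. The function name and the parameters `n_bars`, `train_size` and `test_size` are hypothetical, and the optimization and trading steps are left out.

```python
def walk_forward_windows(n_bars, train_size, test_size):
    """Yield (train, test) index ranges that roll forward through the data.

    Each test range starts right after its train range, so parameters
    fitted on `train` are only ever scored on later, unseen bars.
    """
    start = 0
    while start + train_size + test_size <= n_bars:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # slide forward by one test window


# Example: 10 bars, optimize on 4, then trade the next 2 out of sample.
for train, test in walk_forward_windows(10, 4, 2):
    print("optimize on bars", list(train), "then test on bars", list(test))
```

Only the stitched-together test windows count as the walk forward result. The in-sample windows are the optimized backtest that point 1 warns about.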

I agree with your comments. Backtests are fairly irrelevant in my opinion; walk forward is what counts. I’m staggered by how many people ask for backtest results as if that is going to prove the system works. You can pretty much make a backtest show whatever someone wants to see.

Re t score: I would have no problem with that being added to the stats on the front system page.

Henry, this post is a great help to everybody.


Rick Haines

The fact you are only using this to redirect people to your site pretty much dashes your credibility/objectivity. As your site says

"This strategy is available for licensing by funds and money managers. Contact Henry Carstens for more information. "

I don’t have a problem with a vendor pointing to his related web site. However, for people who had not visited his Rembrandt system on C2 beforehand, it was not clear that this is his own site. So, better disclosure would have been in order.

I’d just laugh at a 1 week old vendor coming out and saying this stuff… just to get them to his site, which probably houses the backtests that he does not believe in.

Buyer beware. A backtest is useful as long as it is more than 2 years old with more than 30 trades, a normal population, and sufficient data over different market periods. Ideally 5 years or more.

PTQQS has not changed since August of 2007.

Well, the purpose of C2 is running it LIVE under the lights, where one cannot say "gee, I meant to take that trade, but…" or use the many other excuses people offer on their private websites.

Backtests mean zip. Live tracked results with metrics mean everything.

I thought it odd when a stranger walks in, telling everyone "use this smell-o-meter to see whether a system is any good…" and not bothering to tell people that "by the way, the authoritative site I am pointing to is mine…"

Not a particularly good start…

In my opinion, the only way conventional backtests (i.e., non-walk forward backtests) can help you is if they make you say "wow" after looking at them. In that case, they are probably curve fitted or otherwise optimized.

Good looking backtests are one of the easiest things in the world to create.

Brilliant, professional work.

Thank you, Henry.

I wish your knowledge would become a part of Collective2.


To: Index, Beau, Kevin:

Are we not on Collective2, guys?

Grab C2 forward-walking historical data, change the word “backtest” in Henry’s text to “C2 data” and start to play with C2 systems.

Good luck to all,


Henry’s original post extolled the virtues of backtesting. I am merely pointing out shortcomings of backtesting, which is why C2 is so important for weeding out the good from the bad.

Your advice for using C2 data is definitely right on.

Thank you Bob Svan for bringing some clarity to this discussion.

And this comes from a guy who has an excellent track record of his systems as opposed to most of the other guys participating in this discussion.

True. Don’t listen to anything I say. I have a terrible track record:

C2 Systems:

5 of 8 are profitable (3 with >40% annual return)

2 of 8 are losers (both <5% annual loss)

1 of 8 is big loser

Real Time, Real Money Futures Contest (google “world cup futures challenge”):

2005 World Cup of Futures 148% return (second place)

2006 World Cup of Futures 107% return (first place)

2007 World Cup of Futures 112% return (second place)

Please check your facts before throwing stones at me. If what I have to say is so meaningless to you, just put me on “ignore” - you’d be the first…


The point is, please don’t dismiss anyone’s view based on a good, bad or non-existent track record. I judge comments on the quality of the post.

Some of the best advice I’ve gotten was from author/trader/hedge fund manager Victor Niederhoffer, and we all know he crashed and burned twice.

Of course, PTQQS can’t be outperforming by 8400 basis points since March 20th, 07…

Kevin, why are you so offended? I did not single you out; remember, I said “most”. In fact, I have two of your systems on my Analyst page. They show promise; they just don’t have a long enough track record.

Now another question, and I hope you answer it here on this forum: Do you consider Beau Wolinsky’s present system a good system? If one had started trading it at the height of the equity curve in June 2008, that is, about one and a quarter years after inception (and is this not about the time a conservative person would wait before jumping in?), and had started with the recommended capital of $100,000, that person would now be down about 50% and would need a 100% return just to break even, a return the system has never produced so far. If I had to choose between his system and a new system which has a credible backtesting history over many years with a maximum drawdown of 9%, I think the choice would be easy. Am I missing something here?

I would be interested in your answer because you seem to be an experienced trader.
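The break-even arithmetic in the question above (down 50%, so a 100% return is needed) follows from required gain = drawdown / (1 - drawdown). A minimal sketch, with a hypothetical function name:

```python
def required_gain(drawdown):
    """Fractional return needed to recover from a fractional drawdown."""
    if not 0 <= drawdown < 1:
        raise ValueError("drawdown must be in [0, 1)")
    return drawdown / (1 - drawdown)


# The two drawdown figures mentioned in the post: 9% and 50%.
for dd in (0.09, 0.50):
    print(f"{dd:.0%} drawdown needs a {required_gain(dd):.1%} gain to break even")
```

The relationship is not linear: a 9% drawdown needs just under 10% to recover, while a 50% drawdown needs a full doubling.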

I get offended when, in the midst of a meaningful discussion (which are not all that plentiful on C2, by the way), someone decides to effectively end it by saying “ignore most of the posters, because they have what I think are poor systems.” I get even more offended when I happen to be the most frequent poster in the thread where this ambush occurs.

As far as evaluation of any particular system goes, I would have to look at a system as part of the portfolio of systems I trade, and ask myself “would this fit in with my other systems?” A system with 20% annual return and 30% max DD could fit my objectives (especially if it reduced overall portfolio risk and drawdown), and then again it may not. For example, would I ever trade a coin flip system with a negative expectancy? I sure would, if it was traded in the right kind of portfolio. (So, you’re probably saying “thanks for the non-answer,” but sometimes the answer is more complicated than it appears).

So, you may ask, how can I say this and still “review” systems on My Analyst page? Simple, the systems I criticize on that page (which are not all the systems I review there) should not be traded by anyone, anywhere, anytime. I am trying to warn people about the systems that may look good at first glance, but are really just ticking time bombs. Since right now most of the top voted comments are mine, I think I am successful in this.

Finally, if I had to choose between a system with history on C2 and one that had no C2 history but had a backtest, I’d choose the C2 system, since at least I would know what I was getting and could make an informed decision on whether the system met my objectives. Relying on traditional backtests is not for experienced traders.

Right, because people who are successful in life never have any failures first, do they? So you should only ever take advice from people without any failed C2 systems to their name. Good luck.

Well said, much better than my attempt!