A strategy that works on SPY but not quite on QQQ: overfitting?

Say that I have developed an intraday strategy. And I found that it achieves a Sharpe ratio of 1.7 on SPY but only 0.5 on QQQ. What do you call this? Overfit?

I can further tune some parameters, so that the Sharpe of SPY goes to 1.1 and QQQ goes to 1.0. Is it now better or worse than before?

I think the reason the algo performed better on SPY is that sector rotation often causes mean reversion, so you can short at a high point and go long at a low point. QQQ is one sector. QQQ goes one direction intraday and will not look back. The performance is related to the rotation and timing behavior I’m trying to model.

Monte Carlo is not going to help because the traits I’m trying to identify only exist in real data.

What options do I have for validating this algo and calling it good?

Does it work on DIA or IWM?

Haven’t got the minute bars to test yet. But according to my past experience, DIA is generally similar to SPY. IWM on the other hand may have a worse Sharpe than QQQ.

What about trying walk-forward testing on just SPY? See “What is a Walk-Forward Optimization and How to Run It?” on the AlgoTrading101 Blog.

I don’t have a technical implementation of walk-forward. But I noticed that the Sharpe stays relatively stable in random one-year windows (except for Mar 2020, during which the return is high).
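
For anyone who wants to repeat the random one-year window check, here is a minimal sketch. It assumes you already have a plain list of daily strategy returns, a 252-day trading year, and a zero risk-free rate; the function name and parameters are made up for illustration and are not the poster's actual code.

```python
import math
import random
from statistics import mean, stdev

def sharpe_in_random_windows(daily_returns, window=252, n_samples=20, seed=0):
    """Annualized Sharpe (zero risk-free rate assumed) over randomly placed windows.

    Returns a list of (start_index, sharpe) pairs.
    """
    last_start = len(daily_returns) - window
    if last_start < 0:
        raise ValueError("need at least one full window of returns")
    rng = random.Random(seed)  # fixed seed so the same windows can be re-checked
    results = []
    for _ in range(n_samples):
        start = rng.randint(0, last_start)  # inclusive on both ends
        chunk = daily_returns[start:start + window]
        sharpe = math.sqrt(window) * mean(chunk) / stdev(chunk)
        results.append((start, sharpe))
    return results
```

If the spread of the sampled Sharpe values is wide, or one period such as Mar 2020 dominates, that is worth knowing before trusting the full-period number.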

It’s not machine learning, so I couldn’t really take out any rules. The parameters also haven’t been optimized to the extremes. The histograms of daily lows/highs have been stable over the years. Daily lows are more likely to happen in the first half of the day, which is my clue for guessing the entry. The only problem is that QQQ is more volatile and less trendy than SPY.
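
The claim about daily lows can be checked per symbol and per year with something like the sketch below. It assumes minute bars given as (timestamp, low) pairs and a US regular session of 9:30 to 16:00, so the midpoint is 12:45; both the input layout and the function name are assumptions, not anything from the actual strategy.

```python
from datetime import datetime, time

def share_of_lows_in_first_half(bars, midpoint=time(12, 45)):
    """Fraction of days whose session low printed before `midpoint`.

    bars: iterable of (datetime, low_price) minute bars, any order.
    The 12:45 midpoint assumes a 9:30-16:00 session.
    """
    low_by_day = {}  # date -> (time_of_low, low_price)
    for ts, low in bars:
        day = ts.date()
        # Keep the first bar that reaches the lowest price of the day.
        if day not in low_by_day or low < low_by_day[day][1]:
            low_by_day[day] = (ts.time(), low)
    if not low_by_day:
        raise ValueError("no bars supplied")
    early = sum(1 for t, _ in low_by_day.values() if t < midpoint)
    return early / len(low_by_day)
```

Running it separately on SPY and QQQ, year by year, would show whether the timing behavior is actually as stable on both symbols as the histograms suggest.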

If your system is profitable then the Sharpe ratio is a rather meaningless metric, because it penalizes both the downside and, unfortunately, the upside volatility.

The Adjusted Sortino Ratio is much better and will give you the real value of your system. Try it.
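
To make the difference concrete, here is a rough sketch of both ratios side by side. It computes the plain Sortino ratio, not the adjusted variant mentioned above, and it assumes daily returns, a zero risk-free rate, a zero target return, and 252 periods per year.

```python
import math
from statistics import mean, stdev

def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio with a zero risk-free rate (assumption)."""
    return math.sqrt(periods_per_year) * mean(returns) / stdev(returns)

def sortino_ratio(returns, periods_per_year=252, target=0.0):
    """Annualized plain Sortino ratio: only returns below `target` count as risk."""
    downside_sq = [min(r - target, 0.0) ** 2 for r in returns]
    downside_dev = math.sqrt(sum(downside_sq) / len(returns))
    if downside_dev == 0:
        return math.inf  # no returns below target in this sample
    return math.sqrt(periods_per_year) * (mean(returns) - target) / downside_dev
```

Upside volatility inflates the denominator of the first function but not the second, which is the whole point of the complaint about Sharpe.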

Thanks for the idea. Sortino is good, although in my particular case it’s not that different from Sharpe, since the strategy really does perform worse on QQQ than on SPY.

Never trust backtested results. I’ve seen hundreds (if not thousands) of systems that claimed fabulous results blow up once they were tested live.

If it’s never forward-tested, you will never know if it really works.

I figured “blow ups” are mostly due to leverage. If some strategy seemingly provides stable returns at a very high leverage, there is usually some tail risk event causing it to blow up.

But I don’t agree that “walk forward” is the remedy here. In my experience, if I try hard enough, I can definitely overfit out-of-sample data. And life is short. You don’t have endless time for testing. You certainly won’t run an algo thousands of years like a Monte Carlo test. Any strategy could work better in a certain market condition than others. I think it’s a judgment call and everyone is entitled to their own opinion.

One thing you can do is just use the past year as a walk-forward test and see if the results are still good. Maybe use 6 years ago to 1 year ago as your data, and fiddle around with the model until it produces good results. Then run the model on data from 1 year ago to now. Maybe also fit the model to QQQ and see how it performs on SPY.
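
The split described above might look like this in code. It is only a sketch: it assumes rows of (date, value) pairs and approximates a year as 365 days, and the function name and parameters are made up for illustration.

```python
from datetime import date, timedelta

def split_fit_and_holdout(rows, today, holdout_days=365, history_days=6 * 365):
    """Split dated rows into a fitting set and a final untouched holdout.

    rows: iterable of (date, value) pairs.
    Fit set: from ~6 years ago up to (not including) ~1 year ago.
    Holdout: from ~1 year ago through `today`.
    """
    holdout_start = today - timedelta(days=holdout_days)
    history_start = today - timedelta(days=history_days)
    fit = [r for r in rows if history_start <= r[0] < holdout_start]
    holdout = [r for r in rows if holdout_start <= r[0] <= today]
    return fit, holdout
```

The important part is discipline rather than code: tune only on the fit set, then run the holdout once. Every extra look at the holdout turns it a little more into in-sample data.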

But how much confidence can I have in 1-year walk-forward results? 2023 feels drastically different from 2022. I have a hunch that the best strats in 2022 would not work in 2023, and vice versa. I feel like a mediocre return on both symbols is a sign of less overfitting.

I like to look at 1-year rolling returns, 1-year rolling profit per trade, and the maximum number of weeks before the system makes new highs, to test for consistency and how likely I am to stick with it.
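
Two of those checks are easy to sketch from a daily equity curve. This assumes a plain list of end-of-day equity values and a 252-day trading year; the helper names are made up, and the stretch is counted in trading days, so divide by 5 for a rough number of weeks.

```python
def rolling_returns(equity, window=252):
    """Return over each trailing `window`-day span of the equity curve."""
    return [equity[i] / equity[i - window] - 1.0 for i in range(window, len(equity))]

def longest_stretch_without_new_high(equity):
    """Longest run of consecutive days on which equity failed to set a new high."""
    peak = equity[0]
    current = longest = 0
    for value in equity[1:]:
        if value > peak:
            peak = value
            current = 0  # new high resets the clock
        else:
            current += 1
            longest = max(longest, current)
    return longest
```

A strategy with a decent Sharpe but a very long stretch under water is one that many people abandon right before it recovers.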

Using many years of data covers many different market conditions. A backtest from 1999 to 2020 has to go through many different market phases. Follow it with a forward test from 2021 to today, so that the backtest can be validated well.

Thank you for all your input. It seems, though, that the only options available, backtesting and walk-forward, are still too limited to give a full picture.

A 1-year walk-forward test will reflect how someone who made the model on 6/30/2022 would have done the following year. “2023 feels drastically different from 2022”: this is why people with good backtests fail when implementing a system.

Basically you have to ‘know’ when your system works well and when it doesn’t. Why does it work on SPY and not on QQQ, for example?

Go thru the results and find out when the system is making good trades and when the system is making bad trades.

For example a moving average system works great in trending markets but works terribly in ranging markets or whipsaw markets.
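
One way to do that sorting is to label each trade's day as trending or ranging and compare average P&L per label. The sketch below uses the efficiency ratio (net move divided by total path length) as the label; that choice, the 0.3 threshold, and the function names are all assumptions for illustration, and any other regime definition would slot in the same way.

```python
from collections import defaultdict
from statistics import mean

def efficiency_ratio(prices):
    """Net move divided by total path length: near 1 is trending, near 0 is choppy."""
    path = sum(abs(b - a) for a, b in zip(prices, prices[1:]))
    if path == 0:
        return 0.0  # flat series: treat as non-trending
    return abs(prices[-1] - prices[0]) / path

def label_regime(prices, threshold=0.3):
    """Hypothetical two-way label; the threshold is arbitrary and worth varying."""
    return "trending" if efficiency_ratio(prices) >= threshold else "ranging"

def average_pnl_by_regime(trades):
    """trades: iterable of (regime_label, pnl) pairs -> {label: mean pnl}."""
    buckets = defaultdict(list)
    for label, pnl in trades:
        buckets[label].append(pnl)
    return {label: mean(pnls) for label, pnls in buckets.items()}
```

If the SPY versus QQQ gap mostly disappears once trades are compared within the same regime, the difference is probably about market behavior rather than overfitting.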

Hope this helps.
