Create your own custom Leaderboard

Many C2 users have offered feedback about what they like (and don’t like) about C2’s official Leaderboard. Now, C2 members can create their own leaderboards, and can share their creations with other users. Users can comment, ask questions, and vote on Leaderboards they like.

This new Custom Leaderboards feature is currently in beta. Let me know what you think.

Matthew

@MatthewKlein I love the idea and I appreciate how you are always working hard to keep improving the platform.

I gave it a quick try and got stuck. From the page you gave I clicked “Make your own Leaderboard formula”. I then created a formula, previewed the results and clicked the “Save” button. I checked the “Make Public” box and clicked the “Save” button. I’m missing the next step as to what I need to do to share my new leaderboard.

Thanks!

Gary

Hi, Gary -

Good feedback - I definitely need to make the entire process clearer for “Leaderboard Creators.”

Short answer: You don’t need to do anything else, other than what you did (click “Make Public” and “Save.”)

Longer answer:

  1. It can take a few minutes for any newly-saved Leaderboard to get calculated and refreshed.

  2. Even after that brief delay, only those Leaderboards that compile correctly (no errors) and that give meaningful data will be included. “Meaningful data” means: there has to be actual “dispersion” among the strategy rankings. To take a simple example, if your formula creates results such that every strategy has the exact same mathematical score, there’s no dispersion, and the results are meaningless; and so the leaderboard will not be made public to others.

As your comments suggest, what’s missing from the current version of the Custom Leaderboards is that you, as a leaderboard creator, have no visibility into this dispersion measurement, and are not explicitly informed of status. I will be improving the software shortly to handle this.
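To make the “dispersion” requirement concrete, here is a minimal sketch in Python. It assumes dispersion is measured as the population standard deviation of the scores, and that a leaderboard is withheld when that value is effectively zero. The function name `has_dispersion` and the threshold are hypothetical; this is not a description of how C2 actually measures it.

```python
from statistics import pstdev


def has_dispersion(scores, min_stdev=1e-9):
    """Return True when the scores are not all (nearly) identical.

    Assumption: dispersion = population standard deviation of the scores.
    The threshold is illustrative only.
    """
    if len(scores) < 2:
        return False  # a single score cannot be ranked against anything
    return pstdev(scores) > min_stdev


flat_scores = [1.0, 1.0, 1.0, 1.0]    # every strategy gets the same score
varied_scores = [0.2, 1.5, 3.1, 0.9]  # strategies can be told apart

print(has_dispersion(flat_scores))
print(has_dispersion(varied_scores))
```

A formula that returns a constant for every strategy fails this check, which matches the “every strategy has the exact same mathematical score” example above.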

I see my two entries listed – that’s great!

One other thing that’s not clear or may be missing – deleting or editing a leaderboard.

Gary

I’d like to share my Leaderboard with the group in hopes of getting some feedback on ways it could be improved. Here is the link:

https://collective2.com/lb/319

Here’s the formula:

(%[Age Strategy (Days)]% + (%[Trades Own Strat Certified? (0, or record guid)]% > 0 ? 720 : 0)) * ( %[Sharpe Ratio]% / %[Avg(MAE) / Avg(PL) - Winners]% ) * ( %[Delta Equity 180 Days]% > 0 ? 1 : 0)

The basic idea is to give credit to those systems that have:

  • A long track record
  • TOS Certified
  • A high Sharpe Ratio
  • A low ratio of MAE to PL for winning trades (i.e., it doesn’t incur large drawdowns to get winners)
  • Profitable for the past 6 months
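For anyone who prefers to read the formula as ordinary code, here is a rough Python equivalent. The argument names are hypothetical stand-ins for the `%[...]%` statistics, and the sketch makes no claim about how the Workbench itself evaluates the expression (for example, what it does when the MAE/PL ratio is zero).

```python
def leaderboard_score(age_days, tos_record_id, sharpe,
                      mae_to_pl_winners, delta_equity_180):
    """Plain-Python reading of the posted formula (argument names are hypothetical)."""
    # Long track record, with a 720-day bonus for TOS-certified strategies.
    track_record = age_days + (720 if tos_record_id > 0 else 0)
    # High Sharpe ratio, penalized by a high MAE-to-PL ratio on winning trades.
    # Note: a ratio of exactly 0 would raise ZeroDivisionError here.
    quality = sharpe / mae_to_pl_winners
    # Gate: strategies that lost equity over the last 180 days score zero.
    profitable_gate = 1 if delta_equity_180 > 0 else 0
    return track_record * quality * profitable_gate


# Hypothetical inputs: 1000 days old, certified, Sharpe 1.5, MAE/PL 0.5, +12 equity.
print(leaderboard_score(1000, 1, 1.5, 0.5, 12.0))
# Same strategy, but with negative 180-day equity change: the gate zeroes it out.
print(leaderboard_score(1000, 1, 1.5, 0.5, -3.0))
```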

Gary

By the way, there’s a built-in “Discord-like” real-time chat feature next to each Custom Leaderboard formula, where a discussion (like the one you started in the post, just above) can take place. I’m not suggesting these “official” forums are the wrong place for such a discussion. I’m not really sure which is better. So we will let the market decide.

Thanks @MatthewKlein for this, great idea! What I miss, and would like to see, is the ability to compare systems in a ‘time-boxed’ manner. For example, take the leaderboard @GaryLynn2 created, but apply the rule only to trade data from a given time period. E.g., give me the best systems (‘best’ defined by the user) from 01/01/2022 to 31/12/2022 by comparing statistics computed only from the trade data in that period. (I understand this can be challenging technically, because you need to calculate new ‘system’ statistics for just that time period and then apply the ranking. But this would be really cool and valuable to have, in my opinion.) Thanks once again!

Why aren’t all strategies shown in the Custom Leaderboard? Are the placements in the first column (the #) correct? Why are there gaps? If I go to the formula itself, I see all the systems.

PS I would also like to have a profile picture here in the forum, but I can’t find the settings. Maybe someone can help me via PM?

Actually, this feature exists!

However…

Right now, it is only available inside the C2 Scoring Workbench.

So if you want to access it before I can do further development to make it more obvious, here’s what you do:

  1. Go to the C2 Scoring Workbench
  2. Click “Public Formulas”
  3. Look for Gary’s formula, and load it.
  4. Now you will have your own copy of Gary’s formula. You could do things to it to improve it. (Or not!)
  5. Once you load the formula into your Scoring Workbench, under the formula text, you will see “Backtest.” Click it!

  6. When you click “Backtest” you’ll be able to specify two historical dates. When you backtest, you can go back in time and see the results of Gary’s formula as of the first date. You will now have the top 25 strategies as of that date (for example, February 1, 2021).

  7. You probably wonder: So is Gary smart? Or is he a doofus? Let’s find out! The backtest feature lets you specify a second date, for example March 15, 2021. Now you can see what would have happened to those strategies that “Gary picked” as of February 1. You can see if they would have continued to do well as of March 15.

  8. Pretty cool, right?

  9. Yes, this is just a part of what our Quant Team does when developing the C2Score. Now you can use it for your own scoring.
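Conceptually, the two-date backtest described in the steps above boils down to something like the following Python sketch. All names and numbers are hypothetical, and it assumes every selected strategy has an equity value on both dates; it illustrates the idea, not the Workbench’s internals.

```python
def backtest_top_n(scores_as_of, equity_at_score_date, equity_at_results_date, n=25):
    """Rank strategies by score as of the first date, keep the top n,
    and report each one's return between the first and second date."""
    ranked = sorted(scores_as_of, key=scores_as_of.get, reverse=True)[:n]
    return {
        name: equity_at_results_date[name] / equity_at_score_date[name] - 1.0
        for name in ranked
    }


# Hypothetical data: formula scores and equity on the "score" and "results" dates.
scores = {"A": 3.0, "B": 1.0, "C": 2.0}
equity_first = {"A": 100.0, "B": 100.0, "C": 200.0}
equity_second = {"A": 110.0, "B": 90.0, "C": 190.0}

for name, ret in backtest_top_n(scores, equity_first, equity_second, n=2).items():
    print(f"{name}: {ret:+.1%}")
```

The key point is that the ranking uses only information available on the first date, while the returns are measured afterwards.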

Hi, Fabi - Can you PM me with an example of what you think is missing? I’m not sure I understand what you mean by “gaps.”

Thanks for pointing that out. Definitely a useful feature, and I just tried it out. If I understood it correctly, this is useful for backtesting: define the criteria, apply them on the ‘Score as of’ date, and see what happens with those systems on the ‘Results as of’ date. But what I am looking for is the ability to timebox the criteria themselves. I.e., I want to see the systems that did well on the ‘Score as of’ date, but the criteria should look only at trade data between a startDate and an endDate. Not sure if I am making it clear!

Maybe an example would help. I want to see the systems that have the highest Sharpe ratio over the last year, among the systems that have existed for at least one year. The caveat is that I want each system’s Sharpe ratio to be calculated only from the last year’s trade data (ignoring all trades older than one year). I am not sure how the Scoring Workbench can do that.

That is not currently possible.

You’re basically asking to be able to go back to any point in time in history and then calculate any statistic … but to calculate that statistic arbitrarily using only data from two indeterminate points in time (so long as those two points are before the first arbitrary point in time).

Right now, C2 calculates historical statistics for all strategies, and for all times T, but those historical stat calculations use all available data – i.e. the “backtest stats” for any time T are calculated using data from the beginning of the strategy, up to the end of time T.

Yup, understood. Thanks. I also thought this feature did not exist yet. My motivation is to be able to compare a recent system that has only a 6-month track record with a system that has existed for, say, 5 years. To make it an apples-to-apples comparison (at least in my opinion as a user), I want to compare them using only statistics computed from the trades of the last 6 months. That is what I was intending. Hopefully it will come in the future!

Sure. I just want to point out that, in theory, you could use the API to download raw equity or trade data, and run your own calculations.
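As a rough illustration of what such a do-it-yourself calculation could look like, here is a Python sketch of a time-boxed Sharpe ratio. It assumes you already have a strategy’s daily equity values in a dictionary keyed by date (no C2 API call is shown, and the data below is made up), that the risk-free rate is zero, and that 252 trading days per year is an acceptable annualization factor.

```python
from datetime import date
from math import sqrt
from statistics import mean, stdev


def timeboxed_sharpe(equity_by_day, start, end, periods_per_year=252):
    """Annualized Sharpe ratio using only equity values with start <= day <= end.

    Assumptions: risk-free rate of 0, one equity value per trading day.
    Returns None when the window is too short or returns have no variation.
    """
    window = [value for day, value in sorted(equity_by_day.items())
              if start <= day <= end]
    if len(window) < 3:
        return None  # need at least two returns to estimate a deviation
    returns = [later / earlier - 1.0 for earlier, later in zip(window, window[1:])]
    deviation = stdev(returns)
    if deviation == 0:
        return None
    return mean(returns) / deviation * sqrt(periods_per_year)


# Hypothetical equity curve; the 2021 value falls outside the window and is ignored.
equity = {
    date(2021, 12, 31): 50.0,
    date(2022, 1, 3): 100.0,
    date(2022, 1, 4): 102.0,
    date(2022, 1, 5): 101.0,
    date(2022, 1, 6): 104.0,
}
print(timeboxed_sharpe(equity, date(2022, 1, 1), date(2022, 12, 31)))
```

Running the same function over the same window for a 6-month-old system and a 5-year-old system would give the like-for-like comparison discussed above.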

I’m not trying to be snarky - I definitely want to hear what customers think is missing - so thank you… but there may be times when I suspect you could develop stuff yourself much faster than it would take to wait for C2 to build it as part of the platform.

Yup, sure. I am also not sure how many customers think like me in this regard. Maybe no one else wants it enough for it to be a feature request! I am here more as a trade leader than a subscriber, so my perspective can be different.
Appreciate all the constant improvements you are bringing to the platform.

I couldn’t go to bed tonight without answering that question. I followed the steps that @MatthewKlein outlined, starting at the beginning of 2020, testing 3 months later, and recording the returns for the top 5 systems in my Leaderboard. I then rolled the dates forward 3 months, 11 more times, to get a walk-forward test of the strategy over the 3-year period from 2020 through 2022. Here are the results:

The bottom line is that the strategy more than doubled the return of the S&P over that 3-year period. I’m sure the drawdown was much less also.
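For anyone who wants to repeat the exercise, the roll-forward schedule described above (score date, results date 3 months later, repeated 12 times) can be generated with a small Python sketch like this one. The helper names are hypothetical, and the month arithmetic only handles day-of-month values that exist in every month (1 through 28).

```python
from datetime import date


def add_months(d, months):
    """Shift a date forward by whole months (day must exist in every month, i.e. 1-28)."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, d.day)


def walk_forward_windows(start, step_months=3, count=12):
    """Return (score_date, results_date) pairs, each starting where the last one ended."""
    windows = []
    score_date = start
    for _ in range(count):
        results_date = add_months(score_date, step_months)
        windows.append((score_date, results_date))
        score_date = results_date
    return windows


for score_date, results_date in walk_forward_windows(date(2020, 1, 1)):
    print(f"Score as of {score_date}  ->  Results as of {results_date}")
```

Each pair corresponds to one manual backtest run in the Workbench; the returns themselves still have to be recorded by hand.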

I can sleep well now knowing that I’m not a doofus. Whether I’m smart or not is for others to decide. Let’s just say I’m experienced when it comes to subscribing to C2 systems…

Gary

Okay, it’s confirmed. Gary is no doofus.

But, Gary, I bet the statistical exercise you ran and described in your previous post raised some gnawing doubts in your mind.

Namely: is the backtesting data provided by C2 “clean,” or does it suffer from survivorship bias?

For those readers who are not familiar with the term, let me describe survivorship bias. It’s a common pitfall that novice quants can fall into. Imagine you backtest a stock-picking strategy over the past ten years, and it returns amazing results. Chances are, unless you took specific action to avoid it, your results were positively skewed by survivorship bias.

How does survivorship bias infect your work? Imagine this: you sit down to run your backtest in 2023. You feed the database of all the available stock symbols into your computer program. You get amazing profitable results!

But wait. That “database of all the available stock symbols” contains only those symbols that still exist in 2023. Not included in your database are all those symbols that disappeared between 2013 and 2023. You know which stocks disappeared during that time? The stinky ones. The bankrupt companies. The losers. The stocks that went to zero.

If you had run your strategy back in 2018 (i.e. in real life), some of those companies would have been available in your 2018-vintage stock-symbol database. And your strategy would have invested in them. And you would have lost money.

But because you weren’t careful and thoughtful, when you ran your backtest in 2023, your software didn’t have any possibility of picking one of those loser stocks. They weren’t in your 2023-vintage database. That’s why your results were so great. (When I write “your” I’m not talking about Gary, obviously, but rather the mythical quant person who learns the hard way about survivorship bias.)

Anyway… that long explanation about survivorship bias is the necessary preamble for me to talk about Collective2’s Scoring Workbench.

When I built the C2 Scoring Workbench, I worked to avoid survivorship bias. Specifically, I made sure that any strategy that existed, and was “subscribable” in the past, would be included in the database that is used by the backtester. That includes “killed” strategies. And it includes strategies subsequently made private. In other words, if a strategy was available for subscription in 2021, for example, then the C2 backtester would include it in the tests/analysis – even if the strategy no longer exists as of today, when the test is being run.
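In code terms, the difference between a “point-in-time” universe and a “survivors only” universe looks something like this Python sketch. The record layout (`listed`, `delisted`) and the sample strategies are hypothetical, and this is not the backtester’s actual implementation.

```python
from datetime import date


def universe_as_of(strategies, as_of):
    """Strategies that were subscribable on `as_of`, including ones that were
    killed or made private afterwards (delisted is None if still available)."""
    return [
        s["name"]
        for s in strategies
        if s["listed"] <= as_of and (s["delisted"] is None or s["delisted"] > as_of)
    ]


def survivors_only(strategies, as_of):
    """Biased universe: only strategies that still exist today."""
    return [
        s["name"]
        for s in strategies
        if s["listed"] <= as_of and s["delisted"] is None
    ]


strategies = [
    {"name": "Still Alive", "listed": date(2019, 5, 1), "delisted": None},
    {"name": "Killed Later", "listed": date(2020, 2, 1), "delisted": date(2022, 6, 1)},
    {"name": "Not Yet Born", "listed": date(2022, 3, 1), "delisted": None},
]

print(universe_as_of(strategies, date(2021, 6, 1)))
print(survivors_only(strategies, date(2021, 6, 1)))
```

The second function silently drops “Killed Later”, which is exactly the kind of loser a real subscriber could have picked in 2021.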

So the C2 Scoring Workbench does not suffer from survivorship bias, right?

Well, there’s still a subtle problem that quants have a hard time overcoming. It affects all quants and backtesters, not just people who use C2 or who poke around on the Scoring Workbench.

What is the problem?

The problem is that you know.

You know what happened. You know there was a financial meltdown in 2008, and that real estate stocks and banks tanked. So necessarily, you never even bother to create a “long real-estate strategy” and run a backtest for the past 20 years.

You know that Covid happened. You never even bother to create the “buy travel and airline stocks” strategy and run that test over data including 2019-2020.

These are dramatized examples, of course, but the same issue happens in less obvious ways. Like this: you sit down in front of the C2 Scoring Workbench for the very first time, and you play around with a couple of formulas and see how they perform on today’s data. The first two formulas are stinkers, which you attribute to the fact that the software is complicated, and you have no clue how to use it. But the third one returns a really nice leaderboard! And so you backtest it. And it works really well over a 12-month window.

But there is a subtle survivorship bias in this. You started your work only after you came up with a formula in the Workbench that happened to pick today’s winners. It was more than likely that the same formula would return the same strategies, and these strategies were also winners a mere 6 months ago. Etc.

Anyway, I’m not trying to cast shade on Gary’s work here. Frankly, I’m always surprised and delighted when someone bothers to sit down and actually use my software. And it does seem like Gary’s formula returned good results!

But I just wanted to offer two pieces of information here:

  1. That I am aware of the possibility of survivorship bias, and I designed the C2 Scoring Workbench to avoid the most obvious kind. (i.e. When you backtest, all the “failed” and “disappeared” strategies are in the historical database.) That’s the good news.

  2. The bad news is that, even so, it’s still very hard to be a human being and try to design a trading strategy that doesn’t suffer from a subtle form of bias, because you kinda sorta know how things turned out in history, and your brain will naturally gravitate to create and test strategies that happen to do okay in the world that transpired.

Matthew

@MatthewKlein I always enjoy reading your responses – always very thorough and well-articulated, usually with a little humor thrown in. For those who didn’t know, MK wrote novels in his spare time a few years ago (see his Amazon author page). I read a couple when they were first published – I remember really enjoying them. I never could figure out how he had time to write novels while building C2.

Back to the topic, I agree with your assessment that there is some survivorship bias in these results. I would have liked to run the test further back but I didn’t have the time or the patience to compile the results.

This brings me to an obvious feature request – providing an automated walk-forward testing of a strategy like this. I seem to recall you had something like this, but maybe it is a premium feature?

Gary

Thanks for the plug, Gary.

Your feature request is a good one, and something I’ve wanted to get around to building. It does not yet exist. But I’ll see what I can do.

Matthew

At the suggestion of a fellow C2 user (hat-tip to @GMan2020) I created a version of the scoring formula that breaks down the score into a weighted sum of normalized metrics, which makes it easier to understand and tweak.

Here’s the leaderboard based on this formula:

https://collective2.com/lb/321

Here’s the formula:

0.20 * min(1.0,(%[Age Strategy (Days)]% / 1800)) +
0.20 * (%[Trades Own Strat Certified? (0, or record guid)]% > 0 ? 1 : 0) +
0.20 * min(1.0,( %[Sharpe Ratio]% / 2.0 )) +
0.20 * (1.0 - min(1.0, %[Avg(MAE) / Avg(PL) - Winners]%)) +
0.20 * min(1.0,( %[Delta Equity 180 Days]% / 20.0))

Note that I clipped some of the metrics to max values to avoid having outliers skew the results. For example, I figure a system that is 10 years old is probably no better than one that is 5 years old so any system 5 years or older gets the full 1/5 of a point towards their score.
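Here is the same weighted formula written as a small Python function, in case it helps others experiment with the weights and caps offline. The argument names are hypothetical stand-ins for the `%[...]%` statistics; the constants are copied from the formula above.

```python
def weighted_score(age_days, tos_record_id, sharpe,
                   mae_to_pl_winners, delta_equity_180):
    """Weighted sum of five metrics, each capped at 1.0 and weighted 0.20."""
    return (
        0.20 * min(1.0, age_days / 1800)                # full credit at roughly 5 years
        + 0.20 * (1 if tos_record_id > 0 else 0)        # TOS certified or not
        + 0.20 * min(1.0, sharpe / 2.0)                 # full credit at Sharpe >= 2
        + 0.20 * (1.0 - min(1.0, mae_to_pl_winners))    # lower MAE/PL ratio is better
        + 0.20 * min(1.0, delta_equity_180 / 20.0)      # full credit at +20 over 180 days
    )


# Hypothetical strategy that maxes out every component.
print(weighted_score(3600, 1, 2.5, 0.0, 40.0))
# Hypothetical younger, uncertified strategy with middling statistics.
print(weighted_score(900, 0, 1.0, 0.5, 10.0))
```

One thing the code makes visible: only the upper end of each metric is clipped, so a negative Sharpe ratio or a negative 180-day equity change still subtracts from the total.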

My hope is that this template will be useful to others and that they will share their findings.

Gary
