The opinions expressed in these forums do not represent those of C2, and any discussion of profit/loss is not indicative of future performance or success. There is a substantial risk of loss in trading. You should therefore carefully consider whether such trading is suitable for you in light of your financial condition. You should read, understand, and consider the Risk Disclosure Statement that is provided by your broker before you consider trading. Most people who trade lose money.

C2 data mining for fun and profit


Hello, to all miners out there, especially those familiar with Explorer.

Given all the data available about failed systems at C2, as well as systems that are still performing well, it would be great to mine Explorer data to look for any statistically significant differences between two sets of systems:

  • stats of systems that ended up dying (i.e., their stats while they were still doing well)
  • stats of systems that are still doing well

Are there early signals in the stats that may predict a future complete failure?
e.g., percentage win > 90%, avg win much lower than avg loss, single-trade DD > 20%, system age < 6 months, Sharpe > 5, etc…

There are so many examples.

Has anyone ever tried to do that? If we could find some indicator able to predict systems with a higher probability of failure (and conversely those with higher probability of success) we would all be in a better place.
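If anyone wants to experiment, the warning signs listed above could be wired into a simple screen. A minimal sketch in Python; the field names (`win_pct`, `avg_win`, and so on) are invented for illustration and do not correspond to actual Explorer column names:

```python
# Hypothetical red-flag screen based on the stats mentioned above.
# All field names are illustrative; a real C2 Explorer export will differ.
def red_flags(stats):
    """Return a list of warning signs that may precede a blowup."""
    flags = []
    if stats["win_pct"] > 90:
        flags.append("win rate above 90% (possible martingale)")
    if stats["avg_win"] < stats["avg_loss"]:
        flags.append("average win smaller than average loss")
    if stats["max_single_trade_dd_pct"] > 20:
        flags.append("single-trade drawdown over 20%")
    if stats["age_months"] < 6:
        flags.append("track record under 6 months")
    if stats["sharpe"] > 5:
        flags.append("implausibly high Sharpe ratio")
    return flags
```

Running something like this over an export of dead vs. still-alive systems would be a first step toward checking whether any of these flags actually separate the two groups.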



IMO, someone with the ability to create a successful machine learning trading algo might have the ability to create an algo to forecast a system’s failure.

I suspect it’ll require a lot of work to do correctly, and I doubt it’d be fun. :smile: I also suspect someone with that ability isn’t too concerned with the failure likelihood of other systems, as they probably have their own system they’re happily trading.

Interesting thought though.



I say “machine learning” because a standard “statistical” analysis would probably not be as effective, imo.


A significant problem there is that system developers tend to change their systems, especially when they start to underperform. There might be significant differences between what the developer was doing at different parts of the equity curve.


Exactly. Just as the markets may change (bull, bear, chop, flat).

In both cases, the systems must be able to adapt. This is why I say a successful trading algo developer might have the skill set necessary to do it–even if not the motivation.


There are too many unknowns… the best I think anyone could do is to give a probability that a system is no longer functioning as expected based on past performance vs current. And people are pretty good at noticing that themselves.


Thanks for the contributions, guys. I guess the idea was to find stats that make evident to potential subscribers the hidden risks that are bound, sooner or later, to crash a system.

I was inspired by David’s comments on % trades profitable > 90%: very likely a martingale, hence bound to collapse.

Perhaps with some data mining we could find other stats (or combination of stats) that are able to identify clear hidden risks in a system.


Like some sort of Risk of Ruin:

"…The calculations required to evaluate the risk become much more complex with realistic conditions…"

In addition to what David said…there are also unknown unknowns.

It would probably be easier to do what you mentioned as far as, for example, posting stats such as:

“Of all the systems listed in Explorer, 99% of them that stopped trading within 6 mos. of starting had a winning percentage of >95% within 30 days of stopping.”

The above states a safe, provable (in this hypothetical example) fact, not a conclusion or opinion. A fact that can stand up to strong objection and scrutiny.

Notice, I didn’t suggest the stat caused the demise; I only pointed out that a strong correlation exists. (Note: Correlation does not imply causation.) Let the reader draw their own conclusions, imo.
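The hypothetical stat above could be computed along these lines. The `systems` records and their field names are invented for illustration; a real Explorer export will look different:

```python
# Sketch of computing the kind of provable, descriptive stat suggested above.
# Field names are hypothetical placeholders, not real Explorer columns.
def conditional_freq(systems):
    """Among systems that stopped within 6 months of starting, return the
    fraction whose win percentage exceeded 95% in their final 30 days."""
    stopped_early = [s for s in systems
                     if s["stopped"] and s["months_active"] <= 6]
    if not stopped_early:
        return None  # no basis for the stat
    hot = [s for s in stopped_early if s["win_pct_last_30d"] > 95]
    return len(hot) / len(stopped_early)
```

The output is a plain conditional frequency, so it stays a fact rather than a conclusion, which is the point made above.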


I am sorry, but what you are saying about your system doesn’t hold any water and is really no different or more special than anyone else’s system that you want to bash. You can say “machine learning algo,” as opposed to “standard statistical analysis,” whatever that means (to you), but as they say, the proof is in the pudding. Plenty of folks here have come and gone on C2 bragging about their system’s prowess (in the short term) until it finally blows up in their face, so please let’s stop with the bragging about “someone who can create a machine learning algo” stuff. The community will reward you for your efforts if a system is the “holy grail” it says it is.


Martingale systems are detectable and are likely to blow up. An overly high win percentage is a good indication to look closer. Some other types of systems that take big risks to make their gains can be detected by their leverage and/or volatility; just having a big drawdown is typical of these, though sometimes they haven’t had a big drawdown yet.

Look in the C2 stats under “Risk of Ruin (Monte-Carlo)”: any drawdown with a percent chance listed shows up purely from volatility measures and can happen even if the system continues functioning profitably as-is. Systems using occasional high amounts of leverage (like martingale systems do) are harder to detect, as C2 doesn’t have a stat to identify them that I’m aware of (if a high win percentage isn’t a tip-off). You just have to look in the trades for an occasional build-up of leverage: what risks are they taking to make their gains, and are you OK with that?

But solid systems that suddenly stop working… I’m not sure how to detect those except after they have stopped working.
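To illustrate the “Risk of Ruin (Monte-Carlo)” point above: drawdowns of a given size are expected from trade-outcome volatility alone, even while a system keeps working as designed. A rough sketch of such a simulation; all parameters are illustrative, and this is not C2’s actual implementation:

```python
import random

def mc_drawdown_prob(win_prob, avg_win, avg_loss, n_trades,
                     dd_limit, sims=2000, seed=42):
    """Estimate the probability of hitting a drawdown of dd_limit (as a
    fraction of peak equity) purely from randomizing trade outcomes.
    All inputs are illustrative, not C2's actual methodology."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        equity = peak = 1.0
        for _ in range(n_trades):
            if rng.random() < win_prob:
                equity *= (1 + avg_win)
            else:
                equity *= (1 - avg_loss)
            peak = max(peak, equity)
            if equity <= peak * (1 - dd_limit):
                hits += 1
                break  # drawdown threshold reached in this simulation
    return hits / sims
```

Even a break-even coin-flip system with modest trade sizes will show a meaningful chance of a 10% drawdown over 100 trades, which is exactly why a listed drawdown probability isn’t evidence the system has stopped working.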


I wasn’t bragging. I gave my opinion.

My general opinion is that machine learning performs better than statistics for forecasting. I thought that would be understood, as machine learning is basically an advanced form of statistics.

I don’t know whose systems are based on machine learning and whose are based on statistics. I really don’t care and didn’t have that in mind when I posted. But I won’t walk around hanging my head because some here are so sensitive with regard to others discussing, generally!, the performance of systems.


Any type of system might one day fail; they all have the same flaw of finding some edge from the past and hoping it keeps working into the future. Statistics, machine learning, whatever: it doesn’t matter how the edge was found, they all have to hope the future looks like the past in some way. The trouble is things change, and past performance really is no guarantee of future performance. The longer you’ve been around, the more likely you are to realize the best way to keep from looking foolish later is to not make claims about your system that the future might not realize for you. Be glad when your edge is working, but don’t expect that to always be the case.



I haven’t posted in these forums for weeks. If I were all about bragging, I wouldn’t stop posting here; I’d post every day and include badges in my posts. Not once today did I mention my returns. I would have if I were all about bragging.

I believe I have knowledge to contribute here (Again, I’m not bragging :unamused: ), but it’s posts like yours that really make me want to just exercise my right to remain silent here and only communicate with my subscribers. Maybe that’s what you want.


I think longevity may be the best predictor of success. Using current listings (Futures only):

Total systems listed: 325
Profitable after 3 months: 122 (62% attrition)
Profitable after 6 months: 70 (43% attrition)
Profitable after 9 months: 53 (25% attrition)
Profitable after 12 months: 41 (23% attrition)
Profitable after 18 months: 26 (36% attrition)
Profitable after 24 months: 19 (27% attrition)
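For what it’s worth, the attrition figures can be recomputed directly from the survivor counts above; a quick sketch (small differences from the listed percentages come down to rounding):

```python
# Survivor counts from the table above: (months listed, systems still profitable).
counts = [(0, 325), (3, 122), (6, 70), (9, 53), (12, 41), (18, 26), (24, 19)]

# Attrition at each checkpoint: the fraction of the previous cohort that
# dropped out, rounded to whole percent. Matches the table up to rounding.
attrition = [
    (months, round(100 * (prev - cur) / prev))
    for (_, prev), (months, cur) in zip(counts, counts[1:])
]
```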

I think it is fair to say that some of the one year survivors went off to “greener pastures” or left Collective2 over the frustration of seeing promotion and subscriber money directed to the latest unproven systems.

Since developers pay for six months in advance, it seems prudent to wait at least that long for a system to prove itself.


Agreed, this would be an interesting, albeit time-consuming, exercise. Personally, I would be hesitant to invest the necessary time, though. I suspect that the results of this work, if it were undertaken, would simply verify your original premise. Rather than go to all that trouble, I would suggest that simply looking at the stats you pointed to in your original post should give you all the information you need with respect to the future likelihood of a system blowup. Good question, though, and it should be well noted by anyone new to C2 considering risking their hard-earned money for the sake of paying someone without a conscience a monthly subscription fee.


In the entire database, only 15 systems survive the criteria below… It would be interesting to know if they will still be here one year from now…


Definitely the number of trades is a key factor, not just timing/age. Look at ETF Timer. It was such a hot system in 2009-2010. After 780 days, an amazing performance (but only 17 trades, with scaling).

And this happened afterwards…


Perhaps the goal should be to know when to follow a system, and when to stop following a system, rather than trying to find a system that will last a lifetime.

Sorta like trading stocks vs. buy-and-hold.


Agree. Unlikely any system will last forever… so when to follow and when to stop are key decisions. Ideally these are best implemented in a systematic way rather than emotionally, as often happens: new equity high ==> subscribe ==> drawdown ==> unsubscribe (buy high, sell low)…



Have you considered using T/A as if the balance chart were a stock chart? I mean, like using a stop-loss exit strategy, or a reasonable “lower low” exit strategy?

There can also be a point where you stay subscribed, but don’t trade it. And there can be a point where you unsubscribe.

System data can be downloaded. An analysis can be run to find the optimal drawdown from recent highs at which one should stop following, and the optimal percentage rise from a recent low at which one should follow again. This would be following the trend.
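The rule described above (stop following after a set drawdown from the recent high, resume after a set rise from the recent low) might be sketched like this. The threshold values are placeholders, standing in for whatever the per-system optimization finds:

```python
def follow_signal(equity, stop_dd=0.10, resume_rise=0.05):
    """Follow/stop-follow flags for each point of a system's equity curve.

    stop_dd and resume_rise are illustrative defaults; the idea above is to
    optimize them per system from the downloaded data.
    """
    signals = []
    following = True
    peak = trough = equity[0]
    for x in equity:
        if following:
            peak = max(peak, x)
            if x <= peak * (1 - stop_dd):  # drawdown threshold hit
                following = False
                trough = x
        else:
            trough = min(trough, x)
            if x >= trough * (1 + resume_rise):  # recovery threshold hit
                following = True
                peak = x
        signals.append(following)
    return signals
```

For example, an equity curve that runs 100 → 110 → 95 → 90 → 96 → 105 would flag “stop following” at 95 (more than 10% below the 110 peak) and “follow again” at 96 (more than 5% above the 90 low).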

The same can be done to find the optimal percentage in which to fade run-ups and tankages.

These two approaches can be combined in a machine learning fashion (DISCLAIMER: I’m not bragging) to produce follow/don’t-follow signals based upon historical data for that system.

You can average the optimal values obtained by analyzing many systems to produce generic values that can be used on a “new” system that has no history to analyze.

Just thinking out loud.