Realism Factor on limit systems

This is for Matthew (or anyone else who may want to chime in). I note that the RF for Smartrade.it is 99.6, and also note that they trade frequently each day using limit orders. I am not singling them out, as there are other systems using limit orders that have a 100% RF. I, along with many others who have posted here, have reported that the real fills on limit order systems are usually worse, sometimes much worse, than reported. I have used the setting to convert to market order within one second, and still don’t get filled at what is reported. I understand the RF is an estimated number, but how can it be anywhere close to 99.6 for Smartrade.it, or 100% on other systems that employ limit orders? I think it’s deceptive to potential subscribers, who only find out too late that they have lost money in the REAL account, whereas the posted results look fantastic.

Michael:



You need to keep in mind that what matters is not that an order is a limit order, but that it is a “just touched” limit order. Here’s an example:



IBM is trading at 100. System says BUY at LIMIT 98. The price then goes down to 97. Notice that it has traded through the limit price, and has completely cleared the market at that price. In this case, the limit/stop component of the Realism Factor (RF) is 100.



Here’s a second example. IBM trades at 100. System says BUY LIMIT 98. Price goes down to 98, for just a moment or two, then goes back up. In this case, not all orders to buy at limit 98 will be filled, so Collective2 assigns an RF that is less than 100%. (The number is based on the volume that does trade at that just-touched price).
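The two IBM examples above can be sketched in a few lines. This is only an illustration of the "traded through" versus "just touched" distinction for a BUY limit order; the function name and its inputs are hypothetical, and it does not reproduce C2's actual RF calculation (which, per the post, also weighs the volume traded at the just-touched price).

```python
def classify_buy_limit(limit_price, lowest_trade_price):
    """Classify a BUY limit order against the lowest price traded while it was live.

    Hypothetical helper for illustration only; not C2's RF formula.
    """
    if lowest_trade_price < limit_price:
        # The market traded below the limit, so the limit price was cleared.
        return "traded through"
    if lowest_trade_price == limit_price:
        # The market only reached the limit; some resting orders may not fill.
        return "just touched"
    return "not reached"


# First example: BUY LIMIT 98, price falls to 97.
print(classify_buy_limit(98, 97))
# Second example: BUY LIMIT 98, price dips to 98 and bounces.
print(classify_buy_limit(98, 98))
```

For a SELL limit the comparison would be mirrored, using the highest traded price instead of the lowest.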



Matthew

MK … do you also always account for the bid/ask spread (ie. the just-touched price must be the ask price on a buy order and the bid price on a sell order)?

In the case of calculating Realism Factors, the system does not look at the spread. It only looks at the actual trade prices.



In the case of live market simulations, C2’s Hypothetical Fill Engine does look at the bid/ask spread when determining if orders should be filled.

In theory, real-life limit, stop, and MIT orders are only executed if the market trades at the price. Using bid/ask data would make the RF less realistic.

Matthew,



I do understand what you’re saying, but there has to be a better way to calculate the RF. If I and the others who post here regarding their fills on limit systems continually report fills that represent only 70 to 80% of the RF, something just isn’t right somewhere. The impression an RF gives is that an RF of 99.6 means that a subscriber’s fills will be about 99.6% of what is shown on the daily trades on an overall average. While I personally may not have gotten that fill, 99.6% of others would have. What I’m saying is that this is not the case with limit systems. Either that, or my fills are always worse than everyone else’s, or I have the worst luck in the world. Based on some other postings, though, it seems that many others rarely achieve fills close to the posted RF percentage, even when using the proper settings of converting to market within 1 second.

Michael:



You’re focusing on one system’s RF number. Keep in mind that if anyone autotrades a particular system, then that real-life autotrading data does indeed get put into the RF calculation. I don’t know off-hand if anyone is autotrading the system you question. If no one is trading it in a real-life account through C2, then we look at volume and price to determine RF. Let me repeat my earlier statement that if a limit order “trades through” the limit price, then that order would very likely be filled (volume constraints aside) in real life. MK

I have to agree with Michael on this one. It is amazing how often a system vendor will exactly pick the high or low of the day (or a run) with a limit order and cause the most common just-touched fill problem.



A perfect example is a bond system I have traded for four trades through today. One losing trade followed C2’s fills exactly, but the other three trades all had these just-touched events: I realized about 70% of the C2 result on two of them, and 0% on the third one today.



Here is what happened. At the 2:15pm EST Fed decision the bond market went wild for several seconds and the STC limit of the system was just hit for a millisecond or so (at least it seemed no longer than that as I watched the quote feed). C2 filled the order but I wasn’t filled in my real account, and it had nothing to do with TB, as the order had been live in the account for a day or two. C2 shows a good profit on the trade, but I had to get out manually at a much worse price once I realized I was not filled, and the trade resulted in $0 profit (better than a loss, but…) in my real account compared to $5.9K in the C2 P/L table. I will pay for the “profitable trade” in this case, but it is clear that after only 4 trades this particular system cannot be reliably autotraded with C2/TB/IB due to the limit order problem.



The 70% real account RF for limit order systems seems about right to me, regardless of what the system page shows or how the number is calculated.

Matthew,



I think the system I mentioned, Smartrade.it, is a perfect example. Today, they had several trades, some of which made a half point. I venture to say that a subscriber would have lost at least a tick on either side, and come out with no profit at all. This happened to me on Coin Collector, and also on other limit systems that trade a lot. I think Randy’s summation of 70% of what is posted is probably closer to real-life trading, and probably even worse, much worse, on scalping-type systems. I’d like to see an RF based on the price trading through by one tick before considering the subscriber as having been filled. Perhaps this information is proprietary…but I would be interested in knowing how many people (the number, not the names) subscribed to this system and what their real fills were. Based on your response, you include this in the RF, so I have to imagine that almost anyone who subscribed to this system is making almost exactly what is posted on C2, which is 99.6%. If that’s the case, sign me up immediately for Smartrade.it.

Michael … what’s strange is that if you compare the Smartrade.it equity curve plot with RF included vs. the plot for Best Case, you do get a result that is in the 70% range (i.e. 33,800/45,500 = 74.3%).

Whoops … that should have been Best Case vs. RF with commission … using RF only is much better than 74.3%.

Let me state again that if a price trades through a limit order, you will get filled in real life. For example, if you say buy TBonds at limit 100, and the price hits 99, everyone will get filled.



The tricky part is what happens when a limit order is just touched.



C2 does penalize systems for just-touched limit orders. If you are saying we don’t penalize them enough, then that’s a fair argument to make. But you can’t say that we should penalize orders simply because they are limit orders. Again, if the price trades through the limit, then it’s sure to be filled.



In the case of SmartTrades, I haven’t examined the system, and so don’t know if the system’s limit orders were just-touched or not.



Perhaps I can investigate a new feature that highlights trades that are just-touched limit orders, perhaps by placing a red star next to each order that was just touched. That’s a bit tricky to implement, but it might be a worthwhile feature.

Matthew



This might be a more worthwhile feature to list what percentage of orders just touched the limit in relation to total orders. This will give subscribers a better idea of what to expect from a system.



- Fanus
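A rough sketch of the statistic suggested above: the share of filled limit orders that were only just touched. The data layout and function name are hypothetical. Each entry pairs the limit price with the most extreme traded price seen while the order was live (the low for a buy, the high for a sell), and the sample numbers are made up for illustration.

```python
def just_touched_pct(filled_orders):
    """Return the percentage of filled limit orders that were only just touched.

    filled_orders: list of (limit_price, extreme_price) pairs, where
    extreme_price is the low (buy) or high (sell) while the order was live.
    Hypothetical helper; real code should compare prices in whole ticks
    rather than raw floats.
    """
    if not filled_orders:
        return 0.0
    touched = sum(1 for limit, extreme in filled_orders if extreme == limit)
    return 100.0 * touched / len(filled_orders)


# Made-up sample: two orders traded through, two were just touched.
sample = [(98.0, 97.0), (98.0, 98.0), (101.5, 101.5), (50.0, 50.25)]
print(just_touched_pct(sample))
```

A system whose percentage is high would be one where subscribers should expect real fills to lag the posted results.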

Matthew, Randy and Fanus…all good posts and/or suggestions. What my initial post demonstrates is that I, as a potential subscriber, will not subscribe to a limit system that trades frequently because I have no way of really knowing how much I could potentially make (or lose) based on the information provided and past experience. That costs C2 money, as well as the system providers, because leery potential subscribers such as myself are hesitant to sign up. The system may in fact be a dynamite one, and perhaps people really are getting 99.6% of what’s posted. If that’s so, then I’ve lost because I sit on the sidelines wondering how accurate the RF is, and C2 and the vendor don’t get my commissions.

It seems you are experiencing in practice what I tried to point out in theory in a recent thread: if you calculate the overall RF by averaging over the per-trade RFs, systems that trade more frequently will have an upward-biased RF compared to systems that trade less frequently, unless slippage is 0. To clarify the problem, I post my previous example again:



Take an arbitrary instrument that gains 20% in 10 days. Suppose that the loss due to slippage per trade is high, ~2% on average on both sides, so that each trade (ordered at market) would result in a low but similar realism factor. If I bought and held the instrument for the 10 days (2 trades), net profit would be ~16%. If I bought the stock at the beginning, then sold it and immediately bought it back 4 times during those 10 days, and then sold it at the end of the 10 days (10 trades), net profit would be < 0%.



In both scenarios the overall realism factor and unadjusted profit would be the same, even though in the first system I would come much closer to the theoretical profit of 20%.
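The arithmetic in this example can be checked with a short sketch. It assumes that slippage compounds multiplicatively per trade, which is my assumption and not something stated in the post; the function name is hypothetical. Under that assumption the buy-and-hold figure comes out slightly below the rounded ~16% quoted above, and the frequently traded version ends slightly negative.

```python
def net_return(gross_return, slippage_per_trade, n_trades):
    """Net return after paying a fixed fractional slippage on every trade.

    Assumes slippage compounds multiplicatively (an illustrative assumption).
    """
    return (1 + gross_return) * (1 - slippage_per_trade) ** n_trades - 1


# Buy and hold: one buy and one sell, so 2 trades.
buy_and_hold = net_return(0.20, 0.02, 2)

# Buy, then sell and re-buy 4 times, then sell: 1 + 8 + 1 = 10 trades.
churned = net_return(0.20, 0.02, 10)

print(f"buy and hold: {buy_and_hold:.2%}")
print(f"10 trades:    {churned:.2%}")
```

Both scenarios pay the same slippage per trade, so a per-trade average would rate them equally, yet only the first keeps most of the 20% gain.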

> Perhaps I can investigate a new feature …



You don’t need it. Just have the RF reflect "real-life slippage". Your own data already shows an “actual” RF of 68-73% depending on the data you choose:



Realism Factor 99.6



Cumu $ $45,500

after typical commission $37,125

and real-life slippage $33,800



P/L per unit $31.49

after typical commission $28.99

after real-life slippage $21.49
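The ratios behind that range can be recomputed from the figures listed above. This is only a sketch: reading "after slippage divided by best case" as an "actual" RF is the poster's interpretation rather than C2's formula, the variable names are mine, and, as noted earlier in the thread, these after-slippage figures also include typical commission.

```python
# Figures as listed on the system page quoted above.
cumulative_best_case = 45_500
cumulative_after_slippage = 33_800
pl_per_unit_best_case = 31.49
pl_per_unit_after_slippage = 21.49

# Fraction of the posted result that survives commission and slippage.
cumulative_ratio = cumulative_after_slippage / cumulative_best_case
per_unit_ratio = pl_per_unit_after_slippage / pl_per_unit_best_case

print(f"cumulative $: {cumulative_ratio:.1%}")
print(f"P/L per unit: {per_unit_ratio:.1%}")
```

Either way the result is far below the posted Realism Factor of 99.6.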

"Let me state again that if a price trades through a limit order, you will get filled in real life."



Could it be that the situation is somewhat more complicated than this? Last week, in that other thread, I gave an example where I got the signal BTO @ 77.2, then TB sent it to MBT as BTO @ 77.1, then C2 reported a fill and my order was converted to a market order which was filled at 77.15. However, C2 reports a fill of 77.08. This would qualify as “through” my limit of 77.1, wouldn’t it? So according to what you say my limit order at 77.1 should have been filled, but it wasn’t. I can imagine that the broker plays a role in this, or that it is caused by the fact that my order was an odd lot. But then this can play a role in many other people’s orders too. So I doubt that it would be enough to distinguish “touched” from “through”.



In my example there was probably no problem for the RF because this was extreme-os, and I suppose that extreme-os falls under case “A” (sufficient slippage data from autotraders) of what you described earlier this week. The suggestion to distinguish “touched” and “through” is only relevant in a case B system, but something like my example can happen there as well.



My impression is that the other problems that have been described above are also typical for case B systems. That is, in case A you have enough slippage data and I’m willing to assume that you process them in a logical way. But in case B you have insufficient data and then, generally speaking, there is not a single “logical” solution. So I suggest that you explain more about your method in case B and convince us that this is indeed a sound method to estimate slippage, or change it such that it corresponds better to experience, or give it another name for case B systems such that we won’t confuse it with slippage, or delete it altogether for case B systems as a way to tell that there are insufficient data.



In particular, I don’t understand how slippage can be estimated from volume. You don’t know the delay of the subscriber and you don’t know their joint volume (unless you would estimate both from autotraders, but that would bring you in case A).



Would it be a good idea to put more systems in case A, even if there is only one autotrader, and assign some warning sign to the RF if you think that there are insufficient data? Even the results of one autotrader tell infinitely more than zero autotraders. Then a system vendor who really wants to communicate the slippage has the possibility to set up an autotrade account for his own system, perhaps with only 1 unit per trade, which would bring him into case A. Of course, this makes sense only if the subscribers can somehow see whether they are looking at a case A or a case B RF.



Jules

> Let me state again that if a price trades through a limit order, you will get filled in real life. For example, if you say buy TBonds at limit 100, and the price hits 99, everyone will get filled.



While this is true in theory, the system in question is often in and out of 30 contracts a pop, with LMTs on both sides, in ONE minute or less. Indeed, most of the trades are three minutes or less. It’s not just a frequency issue, but a pure speed issue: delays of several seconds can mean the LMT was never there to “trade through” in the customer account while it was there in C2.



Also, let’s say he’s got 10 subscribers. He is trading 30 lots. Sure, the ES has great liquidity, but turning over 300 contracts (10 x 30) in a matter of seconds is just not always going to happen. Check IB for size @ price, or time and sales: it often dips under 300.



At minimum the RF should reflect C2’s real data. In this case the RF is 99.6, while C2’s actual “P/L after real-life slippage” is 67%. No offense, but these numbers should line up better. How can they both be right and be trusted?

Quite true. Again, this brings to mind the system that traded 400,000 lots on a trade (not including his subscribers!!) on a 500,000 volume day.



If somehow C2 could account for the number of subscribers (and assume x% of them were likely to follow the trade), you might have something there.

You are correct that the slippage and RF should align. I will make them do so. For now, the real-life slippage number is definitely the most important and reliable.