Anxious about future model improvements

Hi Will/Matt/fellow DG subscribers,

As everyone knows, DataGolf is now having a massive impact on betting lines, especially in top 10/20 markets, where these days it’s rare to find more than one or two bets per tournament (i.e. more than 10% EV, which isn’t too tricky when the odds are so long, it’s basically a 9/1 shot in the model being available at 10/1). To me, it seems that the lack of value is a result of bookmakers and odds compilers subscribing to DG and adjusting their numbers, rather than DG subscribers cannibalising any stray value, as it’s not like the prices are being smashed in, it’s more that they are opening up at odds that don’t offer value in the first place.
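For anyone who wants to check the arithmetic behind "a 9/1 shot in the model being available at 10/1", here is a minimal sketch. It assumes fractional odds with no margin in the model price and a one-unit stake; the function names are just illustrative, not anything from DG.

```python
def fractional_to_prob(num, den=1):
    """Win probability implied by fractional odds num/den, assuming no margin."""
    return den / (num + den)


def expected_value(model_prob, offered_num, offered_den=1):
    """Expected profit per unit staked at the offered fractional odds."""
    profit_if_win = offered_num / offered_den
    # Win: collect the profit. Lose: forfeit the one-unit stake.
    return model_prob * profit_if_win - (1 - model_prob)


model_prob = fractional_to_prob(9)    # model price of 9/1
ev = expected_value(model_prob, 10)   # bookmaker offers 10/1
print(f"model probability: {model_prob:.1%}, EV per unit staked: {ev:.1%}")
```

At long odds a one-point difference in price is enough to clear the 10% threshold, which is why the bar is less demanding than it sounds.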

Let me get to my point now. At the moment, DG is a really useful tool for me, because there are a small number of things that the DG model doesn’t do, and so I can adjust your numbers slightly, use them as a kind of baseline, and have some good bets against unsuspecting bookmakers. With my adjustments, I’m not talking about subjective, feely things like “oooh, maybe Leishman deserves a bump because he was in a slump and now he’s not”, I’m talking about concrete, more statistically sound things, so many of which you’ve built in already (e.g. strokes gained, course fit, amateur data, hole outs etc). I’m not going to reel off the remaining ones because I want to protect my edge but as you well know there is some sexy stuff you can do with shot data, and I wince every time I hear you mention that on a podcast! Anyway, if you guys were to build more things like that into the model/the predictions, these edges I’ve got would disappear, and to be honest the rest of the betting subscribership wouldn’t really benefit either, as the bookmakers would leech it up in no time and adjust their odds. So at the end of the day, nobody really wins, and if we got to a stage where your numbers were essentially perfect (as in there was no tiny gram of golf data unaccounted for in your numbers), and the bookmakers were all ripping it off for $15/mo, then I’m not sure what DG would be offering to a bettor, as although the predictions themselves would be incredibly accurate, betting is all about one’s relative advantage, and this would be next to zero.

So yeah, I’m not sure what road you want to go down, but as a bettor I’d be keen to see the model kept reasonably sparse. In the past, adding things into the model did benefit us subscribers (both in terms of improving our bets but also teaching us about good process), but times have changed there, and with the books copying you, I fear any improvements are now more likely to destroy value for subscribers than create it.

Be interested to hear what other people think here … I’m just one subscriber at the end of the day. Cheers.

P.S. please follow me on Twitter, @Mad_Satirist

Thanks for this post. We don’t really know how bookmakers are doing it, because as you said even the outright markets are pretty in line at opening now, even though we don’t post the week’s probabilities or related pages until after most books have posted. I think some books are just using our baseline skill numbers from the rankings pages. It also varies week-to-week with how similar the prices are, with the big course fit weeks being more different.

Despite this, we are still making ~100 bets per week, albeit mostly in markets apart from the finish / outrights. I still think the site provides a lot of value to most subscribers (even ones who don’t think they can improve the model). Would be interested to hear whether people are still able to find value.


So this is something I have been thinking about for a bit as well…the fact that DG heavily influences the betting markets. I’m curious what others’ thoughts are as it relates to the future. I see the following as potential outcomes for DG model odds vs the market:

-We continue as we have been where there are +EV differences between DG and the books, and if you are quick, you can get +EV bets in, but if you are slow the books are likely to close any gaps between DG and them

-Books identify bettors as “dg users” and limit them based on some threshold (maybe this is already happening?)

-Books incorporate dg odds into their opening lines and the edge between DG / sportsbooks is severely cut back

Personally I think the books will stick with option 1, kind of the status quo for a while and just be reactive with their odds. I think the books want to keep users engaged and want to keep reporting strong user count #’s and strong handle #’s with the street so focused on these numbers

I personally don’t see the books copying datagolf; there are a number of quite big differences even at closing lines.
I’m certainly not an expert, but from my limited experience of golf markets a number of models seem to drive the books’ prices from opening to closing lines. There are correlations between some of the players datagolf finds as value and the ones other models find as well, and these golfers can move in price quite a bit.

And from personal experience the books identify anyone as a smart user and restrict them. This is a little harder to identify in golf as variance is so big, but if you take the really early prices and beat the closing lines that will flag an account (depending what other bets you place)

I think restrictions and stake limiting will depend on where you are located. For those in mature markets like the UK and Ireland, the book is focused on profitability and will aggressively manage those who are winning consistently. In growing markets, by contrast, profitability takes a back seat to market share and turnover. Just look at the difference between signup bonuses in the UK vs. newly regulated US states. It’s where the UK was 15 years ago, trying to grow online market share.

Hi Matt. Thanks a lot for your reply, and yeah DG definitely provides value, if it didn’t I wouldn’t still be a subscriber!!

Going forward then, do you anticipate investing lots of time and effort in improving the main model/predictions? Or going more down the custom model route where people can tweak the base predictions to their liking?

And on the shot data, the Spieth viz was super cool … is it more of that kind of stuff you’re planning for that? Or actually using the shot data to refine your estimates of golfing ability?

@betman I think it depends whether you are talking about matchup markets or weeklong outright / finish position markets. In the latter, bookmakers have definitely taken to copying our numbers recently.

In the former, there are still a lot of differences between our model and the market. Because we release our numbers early we affect the early price movements a lot at the responsive bookmakers like Pinnacle and BetOnline. However, by the time closing rolls around, I would not be that confident betting our “edges” at Pinnacle or any other responsive book, as it’s very likely that any useful information from our odds has been baked into those lines. For example, say Pinnacle opens at 50% for some matchup and we have the odds at 60%. They initially move to somewhere near 60% (perhaps in response to people betting our number, or from other bettors using models that also indicate value on this bet), but then later they move back to 55% in response to more bets flowing in from other models (which would then again make it a perceived +EV bet according to our number). I would not necessarily bet on that. In that case you are basically betting that our model is perfect and that nobody else can add value to it (which I don’t think is the case). That’s not to say that the models moving lines later are better than ours; it’s simply the case that if you reveal your model’s information early you will be at a disadvantage later on.
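The 50% / 60% / 55% example can be put into numbers with a rough sketch. Two assumptions here are not from the post: the market price is treated as vig-free (decimal odds of 1 / market probability), and the "true" probability is modelled as a simple weighted blend of the model and the market. The weight `model_weight` is purely hypothetical and only there to show how the edge shrinks once the market is allowed to know something the model does not.

```python
def perceived_ev(model_prob, market_prob):
    """EV per unit staked if the model were exactly right, at no-vig odds of 1/market_prob."""
    return model_prob / market_prob - 1


def blended_ev(model_prob, market_prob, model_weight):
    """EV if the true probability is a weighted mix of model and market.

    model_weight is a hypothetical parameter: 1.0 means the model is perfect,
    0.0 means the market already knows everything the model does.
    """
    true_prob = model_weight * model_prob + (1 - model_weight) * market_prob
    return true_prob / market_prob - 1


model = 0.60
for label, market in [("at open", 0.50), ("after moving back", 0.55)]:
    print(
        f"{label:<18} perceived EV {perceived_ev(model, market):+.1%}, "
        f"EV with model_weight=0.5 {blended_ev(model, market, 0.5):+.1%}"
    )
```

Betting the late 55% line at the full perceived edge is the `model_weight = 1.0` case, i.e. assuming nobody else can add value to the model.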

All that said, I think the jury is still out on whether betting at closing is profitable with our model. We still need more odds data. At the unresponsive books (e.g. MGM, Fanduel, bet365) it probably still is given that betting at opening is profitable, but at Pinnacle it might require a large discrepancy (e.g. +6-8% or more, which is rare to find).

We have a longish blog on this if anyone is interested, going into the details on why it’s harder to be an oddsmaker (in the true sense) than an odds-taker.

Right, but I was also saying that it is valuable even for people who are just mainly using the model as is. (As an aside, I think it’s very difficult to make confident statements about the source of one’s edge in golf – unless you are using closing line movement, you need an inordinate amount of data to identify a 2% edge in matchups, for example, and even more in outright markets with longer odds. For example, something like course fit will show its (very slight) advantage over a no-course fit model over a sample of 1000s of bets, not 10s of bets. There are no massive edges just sitting out there in golf, in my opinion. But, if you have backtested your theories with a few years of data then feel free to ignore my words.)
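A back-of-the-envelope sketch of why a 2% edge needs so many bets to show up. This is only an approximation under assumptions that are not in the post: independent bets, flat one-unit stakes, a normal approximation to the average return, and an arbitrary threshold of two standard errors (`z=2.0`).

```python
import math


def bets_needed(decimal_odds, edge, z=2.0):
    """Approximate number of unit-stake bets for the mean return to sit z standard errors above zero."""
    # Win probability that produces the given edge at these odds.
    p = (1 + edge) / decimal_odds
    # Standard deviation of one bet's return (win: decimal_odds - 1, lose: -1).
    sd = decimal_odds * math.sqrt(p * (1 - p))
    return math.ceil((z * sd / edge) ** 2)


print("even-money matchup, 2% edge:", bets_needed(2.0, 0.02))
print("20/1 outright, 2% edge:     ", bets_needed(21.0, 0.02))
```

For an even-money matchup this comes out at roughly ten thousand bets, and the number grows quickly as the odds lengthen, because each bet's return is far more variable.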

As for future direction, right now we have a lot of projects that are not focused directly on improving the model, but could provide data for users to improve their own processes or models (e.g. the live tournament stats page). And then we also have the custom model page that will hopefully continue to see improvements and added functionality.

I started typing this last week just before tee time, but didn’t finish. I see it has saved as a draft so posting without complete data as I see quite a bit of discussion continuing about who leads the market pricing-

Might be a bit of US v UK perspectives here, but I do see differences on the outright. Some of this will also be down to how you classed books, e.g. unresponsive (aka soft books are the UK supplier), whereas we don’t get Pinnacle but it is sharp and is generally in line with our betting exchanges I think.

If of any interest I’ll mention some of the differences using European odds for datagolf (DG), UK average bookmaker price (BK) and betfair exchange price (BFEX). As I type being close to tee time the exchange price should be reasonably accurate with wisdom of crowds theory.

  1. top of betting market -
    Rahm (17.5DG, 10BK, 16BFEX)
    Spieth (30.8DG, 16BK, 20BFEX)
    McIlroy (38DG, 16BK, 25BFEX)
    Morikawa (22DG, 16BK, 20BFEX)
    So some players are not really rated by datagolf but are elsewhere. There could be an element of big name bias there, but I still wouldn’t expect the exchange to be that different.

  2. Players still showing as value datagolf
    Conners (28.9 DG, 30BK, 40BFEX)
    Streelman (47.6 DG
    Cink (93 DG
    Can still take ‘value’ at sportsbooks/exchange right up to the off, so the books didn’t reduce price below datagolf. Some of these players did see smart money and shortening of prices though, so I’d say datagolf was right that the sportsbooks had them overpriced, but the books didn’t immediately adjust based on datagolf

  3. Players that are logically value that datagolf doesn’t like
    Villegas (889 DG
    Woodland (125 DG
    Dufner (488 DG
    These players are considerably shorter at the exchange and sportsbooks. There did seem to be smart money for them as well, with price shortening at the exchange, so all that was independent of datagolf’s view.
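To compare the three sets of prices on a common scale, here is a small sketch that converts the complete rows quoted above (decimal/European odds) into DG's implied probability and the EV of backing each player at the bookmaker and exchange prices if DG's number were taken as correct. Only the rows with all three prices are included; the incomplete ones are left out rather than guessed. Bookmaker margin and exchange commission are ignored, which is a simplifying assumption.

```python
# (player, DG odds, UK average bookmaker odds, Betfair exchange odds), as quoted in the post
quotes = [
    ("Rahm", 17.5, 10, 16),
    ("Spieth", 30.8, 16, 20),
    ("McIlroy", 38, 16, 25),
    ("Morikawa", 22, 16, 20),
    ("Conners", 28.9, 30, 40),
]


def implied_prob(decimal_odds):
    """Win probability implied by decimal odds, ignoring any margin."""
    return 1 / decimal_odds


def ev_vs_model(model_odds, offered_odds):
    """EV per unit staked at offered_odds if the model's odds were the fair price."""
    return offered_odds / model_odds - 1


rows = [
    (player, implied_prob(dg), ev_vs_model(dg, bk), ev_vs_model(dg, bfex))
    for player, dg, bk, bfex in quotes
]

for player, p_dg, ev_bk, ev_bfex in rows:
    print(f"{player:<9} DG prob {p_dg:5.1%}  EV at BK {ev_bk:+7.1%}  EV at BFEX {ev_bfex:+7.1%}")
```

On these numbers only Conners comes out positive against DG at both the bookmaker and exchange prices, while the four market leaders are negative at both.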