-
Premium Member
MLB sabermetric picks
Evening all,
This is my attempt at generating a system to beat MLB lines and a brief explanation as to why it is possible.
Inefficient statistical analysis
The FIRST reason is that baseball is a sport which is designed brilliantly for statistical analysis. It is a series of set plays which can be recorded, understood and presented statistically with great ease. Stats have always been a huge part of baseball. Herein lies the 'beatability' of MLB lines.
Historically, the stats which people have considered in assessing how good a baseball player is have been:
For batters - Batting Average (hits divided by at-bats), Home Runs, RBIs (the number of runs that score as a direct result of what the batter does at the plate) and Stolen Bases.
For pitchers - Wins (the number of his team's victories for which the pitcher is credited as the winning pitcher), ERA (the number of earned runs given up by a pitcher, per 9 innings) and Strikeouts.
In recent years, though, these stats have come under great critical analysis from a new breed of statisticians who have shown how little they tell you about the quality of a player and how poorly they predict future success. This field is called "sabermetrics". For those interested further, check out fangraphs.com.
New statistics like wOBA, FIP and xFIP have come along, recognising the luck factor associated with many of the traditional stats and seeking to 'strip back' those factors.
Vegas lines are heavily influenced by the starting pitcher, as they should be...it is the single most determinative factor in who wins any individual baseball game. It is, however, an area in which there are often weaknesses in the statistical analysis. ERA and Wins are used, where xFIP is a much better predictive factor.
Randomness
The SECOND reason is that baseball is a sport which is subject to huge variation. To give you an idea, baseball teams play 162 games a year. Each year, the best two or three teams (out of 30) will win about 100 games. That means the BEST teams will still lose 60 or so games a year, around 38% of their games.
The trouble is that people like to find patterns; it's human nature. We decide one team is better than the other and say they're going to win. But baseball is a hugely random game, so underdogs become an undervalued proposition. Check out "MLB Betting 101" on AccuScore for a look at the 2006 season: simply backing the underdog in every game that season would have seen you +42 units.
Conclusion
I intend to use a few baseball predictor services (Accuscore from ESPN, as well as Fangraphs' own), combine them with some research into potentially under/overvalued Starting Pitchers and find good spots to get some money on.
First picks start tonight, follow me if you fancy it!
-
Premium Member
ERA and Wins are used, where xFIP is a much better predictive factor.
Interesting post but why are the "new" stats better than the old ones?
-
Premium Member
First picks
San Diego v Boston ML +184
New York Y v Cincinnati ML +100
Texas -1.5 v Houston +107
-
Poker...I'll look into this. Right up my street but my first thoughts are "Sabermetrics" which you say can't use the historic facts to predict the future. Isn't that what you are doing by applying the randomness of historic seasons?
I am by no means shouting this down as I too along with another member on here have developed football analysis tools to try and beat the game. I am just hoping that this starts a good debate.
-
Premium Member
PG - excellent idea. I'd be really interested to learn more about the stats side of this too.
For anybody who is interested in the use of stats within baseball, I would highly recommend reading Moneyball. In fact, I would recommend it to anybody who loves sport, it's a fascinating story.
Moneyball: The Art of Winning an Unfair Game, by Michael Lewis: http://www.amazon.co.uk/Moneyball-Art-Winning-Unfair-Game/dp/0393324818/ref=sr_1_1?ie=UTF8&qid=1308603950&sr=8-1
-
Premium Member
 Originally Posted by King_Suckerman
Interesting post but why are the "new" stats better than the old ones?
Basically the old ones are very dependent upon randomness. I don't want to spend all night explaining it (again, fangraphs.com for huge detail), but I'll explain it in relation to the two weakest oft-cited stats.
RBIs (Runs Batted In)
An RBI is where a batter "drives in" a run. That is, if there are runners on base and the batter gets a hit, he'll be credited with an RBI for each runner who scores as a result of that hit.
The problem with RBIs is that they're very situationally dependent. A great hitter on a poor team will get fewer "RBI opportunities", that is, situations in which he has a high chance of an RBI. Say he comes to the plate with runners on 2nd and 3rd base: a hit in that situation will likely earn 2 RBIs, whereas the same hit with nobody on base would have earned nothing. A better team has more players on base more frequently, thus rewarding its hitters with more RBI opportunities - irrespective of that hitter's actual ability.
ERA - "Earned Run Average"
This is a statistic used to measure the ability of pitchers, and is often quoted. It is calculated by taking the number of "earned" runs a pitcher concedes (that is, runs which did not score as a result of an error) and dividing that by the number of innings pitched. That number is then multiplied by 9 (the number of innings in a game). It therefore gives you an idea of, on average, how many runs a pitcher would give away per game.
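For anyone who wants to check the arithmetic, here is a minimal sketch of the ERA calculation described above. The function name and the sample numbers are made up for illustration, and innings are assumed to be given as a true decimal (so a third of an inning is 0.333, not the ".1" you see in box scores).

```python
def era(earned_runs: int, innings_pitched: float) -> float:
    """Earned runs conceded per nine innings pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    # Earned runs per inning, scaled up to a nine-inning game.
    return 9 * earned_runs / innings_pitched


# Hypothetical pitcher: 53 earned runs over 100 innings.
print(round(era(53, 100.0), 2))  # 4.77
```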
Its big failing is that it doesn't take into account the huge luck factor involved with what happens to a ball after it is hit by the batter. There is a theory which was developed in recent years called "DIPS" theory (Defense Independent Pitching Statistics). It basically says that all that a pitcher has control over is Strikeouts, Walks (where a 'free pass' to first base is issued as a result of a pitcher throwing 4 balls before 3 strikes) and Home Runs.
The theory goes, then, that some days a pitcher might pitch perfectly well, strike out plenty of hitters, walk very few, give up no home runs but those times that a batter did make contact with the pitch it evaded fielders and caused runs to be scored. This, according to the theory, is total randomness. On another day those balls hit might have gone straight to fielders and resulted in outs.
Basically, for sabermetricians, a bad pitcher is one who strikes out few, walks lots and gives up lots of Home Runs to hitters. A good one does the opposite. In terms of predicting the future, these are the only things you should consider...not the number of runs he has previously conceded.
They have developed various stats to strip away this "luck factor", the most popular of which now is xFIP. Where I see a big divergence between a pitcher's ERA and his xFIP I consider him to be either under or overrated on the basis of his season's performance thus far.
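To make the idea concrete, here is a rough sketch of FIP and xFIP as they are commonly published (fangraphs.com has the authoritative definitions). Treat the specifics as assumptions: the constant of 3.10 and the 10% league home-run-per-fly-ball rate are placeholders that change from season to season, and the sample stat line is invented.

```python
FIP_CONSTANT = 3.10        # assumption: set each season so league FIP lines up with league ERA
LEAGUE_HR_PER_FB = 0.10    # assumption: rough league home runs per fly ball


def fip(hr, bb, k, ip, hbp=0, constant=FIP_CONSTANT):
    """Fielding Independent Pitching: uses only home runs, walks (plus HBP) and strikeouts."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fly_balls, bb, k, ip, hbp=0,
         league_hr_per_fb=LEAGUE_HR_PER_FB, constant=FIP_CONSTANT):
    """Like FIP, but swaps actual home runs for the number expected from fly balls."""
    expected_hr = fly_balls * league_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def era_gap(era_value, xfip_value):
    """Positive gap: ERA looks worse than xFIP, so the pitcher may be underrated."""
    return era_value - xfip_value


# Invented line: 10 HR, 30 BB, 80 K in 100 innings.
print(round(fip(hr=10, bb=30, k=80, ip=100), 2))
# Gap for a pitcher with a 4.77 ERA and a 3.77 xFIP.
print(round(era_gap(4.77, 3.77), 2))
```

A gap of around a full run, as in the second line, is the sort of divergence I mean by "big"; where exactly to draw that line is a judgment call rather than part of the stat.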
Tonight, for instance, Derek Holland's ERA is 4.77. This makes him below league average, and it is my view that this is how he will be regarded by Vegas et al. Dig a little deeper, however, and his xFIP is 3.77 - comfortably better than league average. This, on top of the fact that the ESPN simulator has the Rangers winning the game 69% of the time as opposed to an implied percentage of about 60% at Pinnys, has led me to take on this play.
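If you want to work out an "implied percentage" yourself, this is a small sketch of the usual conversion from American odds. The function names are mine, and the -150 price is a hypothetical one chosen only because it converts to 60%. Bear in mind the implied figure includes the bookmaker's margin, so it slightly overstates the book's real estimate.

```python
def implied_probability(american_odds: int) -> float:
    """Win probability at which a bet at these American odds breaks even."""
    if american_odds == 0:
        raise ValueError("American odds cannot be zero")
    if american_odds > 0:
        # Underdog: risk 100 to win `american_odds`.
        return 100 / (american_odds + 100)
    # Favourite: risk `-american_odds` to win 100.
    return -american_odds / (-american_odds + 100)


def edge(model_probability: float, american_odds: int) -> float:
    """How far the model's win probability sits above the price's break-even point."""
    return model_probability - implied_probability(american_odds)


print(round(implied_probability(184), 3))  # 0.352
print(round(implied_probability(107), 3))  # 0.483
print(round(edge(0.69, -150), 2))          # 0.09
```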
Hope that is of some help, fire away if you have any more questions.
-
Premium Member
 Originally Posted by mjones
Poker...I'll look into this. Right up my street but my first thoughts are "Sabermetrics" which you say can't use the historic facts to predict the future. Isn't that what you are doing by applying the randomness of historic seasons?
I am by no means shouting this down as I too along with another member on here have developed football analysis tools to try and beat the game. I am just hoping that this starts a good debate.
Sorry, if I came across like that it's not what I intended.
Sabermetrics says the exact opposite: that there are ways of using past performance to make future predictions far more accurately than traditional stats can.
-
Premium Member
This is good stuff Pokergod.
Do you intend to use a staking plan ie are some picks stronger than others?
I see you have gone two on ML and one Hcap maybe you can explain the reasoning behind that when you get a chance?
-
Premium Member
 Originally Posted by Diamondgeezer
This is good stuff Pokergod.
Do you intend to use a staking plan ie are some picks stronger than others?
I see you have gone two on ML and one Hcap maybe you can explain the reasoning behind that when you get a chance?
Some picks are stronger than others, I might introduce a staking plan in due course.
The Hcap pick is because Texas are a big favorite in that matchup anyway, so I'd rather back them -1.5 to get a bigger price. Texas are also a high scoring team, so the theory runs that fewer of their wins than average come by a single run.
-
Premium Member
 Originally Posted by pokergod112
First picks
San Diego v Boston ML +184 Lost
New York Y v Cincinnati ML +100 Lost
Texas -1.5 v Houston +107 Won
Well, a disappointing start.
1-2
-0.93 units (thanks Fraser)
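For anyone following along, here is a quick sketch of how that unit total falls out of the three results above, assuming a flat 1-unit stake on each pick. The helper name is just for illustration.

```python
def profit_units(american_odds: int, won: bool, stake: float = 1.0) -> float:
    """Profit or loss in units for one bet at American odds."""
    if not won:
        return -stake
    if american_odds > 0:
        return stake * american_odds / 100
    return stake * 100 / -american_odds


# (odds, won?) for the three picks, flat 1-unit stakes assumed.
picks = [(184, False), (100, False), (107, True)]
total = sum(profit_units(odds, won) for odds, won in picks)
print(round(total, 2))  # -0.93
```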