|
|||||||
| Capping All handicapping, betting systems, spreadsheets, mathematics & quantitative technicapping. |
![]() |
|
|
LinkBack | Thread Tools | Display Modes |
|
#1
|
|||
|
|||
|
Extracting Data/Stats from Statfox for College Football
Some people have been asking on how to build a database or get information (scrape data). Well this post explains one method, often seen as cheating, but nonetheless gets the job done and the programming is easier to absorb.
College Football season is shortly finding its way into our interests once again, the season is only about ten weeks away. For me, this spells chaos since I suck at betting college football. And I’m determined to find a way to get better, or at least be somewhat serviceable. I need to start building a database. How does one do this without going step by step, page by page copy and paste or the excel table import function, taking hours and even days to get all the information you want? Loops. Open excel and record a macro of yourself extracting a web table using the built in function(Data>>Source>>From Web). Once the data is imported, stop the macro and briefly survey the VBA code. You’ll get a good idea how the code is formatted, and the objects/properties involved in processing information by way of macro. Find a nice list of teams from somewhere, preferably Statfox so as to remain consistent with the pages to be included. Fortunately enough Statfox formats their page links with identical parameters surrounding the name of the team. All spaces in the team names are simply replaced with “+”. After having put a list of teams together in excel, replacing any spaces with “+”, name the range whatever, maybe something like “teams”, and stick a FOR…NEXT loop around the VBA code. Then wait. Code:
Sub GetDataFromStatfox()
'
' Grabbing CFB Data from Statfox Macro
'
'
Dim i As Integer
Dim cnstr As String
i = 1
For i = 1 To Range("teams").Rows.Count
cnstr = "URL;http://www.statfox.com/cfb/gamelog~teamid~" & Range("teams").Cells(i, 1).Value & ".htm"
Sheets.Add.Name = Range("teams").Cells(i, 1).Value
With ActiveSheet.QueryTables.Add(Connection:=cnstr, Destination:=Range("$A$1"))
.Name = Range("teams").Cells(i, 1).Value
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = True
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
ActiveWindow.SmallScroll Down:=0
Next i
End Sub
Quote:
Once again you can even take it a step further and write a macro that only finds the rows that contain a particular piece of data. Most games that will be displayed on the statfox page will correspond with a date. The macro elements ‘IsDate’ or ‘InStr(string1, string2, date)’, or ‘IsNumber’ if you just want the line, wrapped around an If…Then statement within the inner FOR…NEXT loop, and whatever information or set of statistics that you want to operate with can be manipulated.
__________________
"Nobody goes there anymore, its too crowded." --Yogi Berra "Always tell the truth, that way you won't have to remember what you said." --Mark Twain *=$50,000 |
|
#2
|
|||
|
|||
|
<tup>
__________________
The only bridge I've ever burned along this legacy I dance is the one that linked the cities of prosperity and chance Check out Technicapping for quantitative sport analysis |
|
#3
|
|||
|
|||
|
Another great contribution from CM's best poster. The most annoying thing about gathering CFB data is the inconsistencies with the naming i.e. Miami, Fl vs. Miami vs. Miami (Fl.) vs. Miami FL
|
|
#4
|
|||
|
|||
|
thanks for the info UVA!
|
|
#5
|
|||
|
|||
|
Include the following code directly after the variable declarations (Dim statements) at the beginning of the code (right before "i = 1")
Application.ScreenUpdating = False Application.Calculation = xlCalculationManual And then set the calculation back to automatic by adding Application.Calculation = xlCalculationAutomatic at the end right before the End Sub statement (directly after "Next i"). Now the code should run a lot faster. |
|
#6
|
|||
|
|||
|
thanks thom, I had the autocalculation toggle in my NBA Playoff simulator, forgot to include here
You seem to be privy to programming language, how deep is ur knowledge because I could use some help
__________________
"Nobody goes there anymore, its too crowded." --Yogi Berra "Always tell the truth, that way you won't have to remember what you said." --Mark Twain *=$50,000 |
|
#7
|
|||
|
|||
|
Happy to help when I can. However, I would say that my programming knowledge is extremely Excel specialized since I use Excel on a daily basis. I have a pretty good knowledge of VBA and have created quite a few Excel workbooks that extract data from files or the internet. As long as I have the full URL to the data source, I think that I should be able to get the data I want from it. These days a lot of web sites use java scripts where the underlying url is not obvious, unless (I assume it would help) you know java script (which I don't) and then I run into trouble.
I have used both web queries where I import the data to Excel before saving it as well as extracting the data from the html source code and storing it to arrays and then saving it e.g. to csv files. Depends on the data layout which method is preferable. So if you have questions regarding Excel and VBA I think I would be able to help. As an example, I have been planning to use Excel/VBA to extract historical odds (and scores) from sites like BetExplorer. I know it is very possible but haven't gotten around to actually programming it. |
|
#8
|
|||
|
|||
|
what kind of data do you want
bet explorer is merely a linking affiliate to oddsportal
__________________
"Nobody goes there anymore, its too crowded." --Yogi Berra "Always tell the truth, that way you won't have to remember what you said." --Mark Twain *=$50,000 Last edited by uva3021; 07-02-2010 at 08:29 AM. |
|
#9
|
|||
|
|||
|
I was/am going to get the historical odds and scores for some of the larger soccer leagues to start off with. I frequently use OddsPortal to compare odds manually but BetExplorer will show you historical odds for what appears to be all bookies without having to be logged in (OddsPortal requires you to be logged in to see all bookies), which helps when downloading/scraping the odds data.
|
|
#10
|
|||
|
|||
|
This is great! Thank you UVA!
:che: |
![]() |
| Bookmarks |
«
Previous Thread
|
Next Thread
»
| Thread Tools | |
| Display Modes | |
|
|
All times are GMT -5. The time now is 05:47 PM.








Linear Mode
