I Hate SQL (Or how my wife finished NFLPool)

I mentioned in my last post (and a couple others) how invaluable my wife has been in my journey to learn Python. That was turned up to 11 last week and I (we) started working on the scoring calculations.

I started to write the scoring calculations exactly as you would expect a newbie coder to – step by step. With the changes to the data model Kelly recommended, I still have a hard time wrapping my head around how we’ve abstracted the pick information. For scoring, the first goal is find the top three players in their position. I made a SQL query using SQLAlchemy which returned with the NFL Player’s player_id and the needed stats – in this case interceptions, which I knew had resulted in a tie that I would have to account for. This returned a tuple: (123456, 1). I then used a list comprehension (yay, me!) to turn that into a list and quickly realized I had to reverse the tuple to sort it by interception and not player ID. Thanks to Stack Overflow, I even found a function that added rank to the list that even accounted for standard competition ranking.

My wife looked at that, shook her head, and told me there had to be a better way and it could all be done in SQL. That started a process that lasted a few hours to figure out how to do SQL directly in the app instead of using SQLAlchemy. After she spent hours coding all the SQL queries, I now have a hundreds of line of code that I can read, but don’t really understand. But it works! As she doesn’t really know Python, I have some clean up to do to conform to PEP8, but I’m not complaining at all.

The other thing I did a terrible job with was naming my database columns – they weren’t consistent. I fixed all those over the weekend, but I also had to change them all in production. Changing the database in production scares the hell out of me, especially as a handful of players have already created their accounts, but thankfully haven’t submitted their picks yet. The first round of changes to the production database went fine and Sunday night I made the second round of changes to accommodate the player picks table so we could correctly do scoring.

And of course things broke. I had just arrived in the office when Rollbar notified me in Slack and via email that the production website had errors. I am so grateful that Michael Kennedy included adding Rollbar support in the Python for Entrepreneurs course. Within ten minutes I had fixed both the fact that the submit picks page requires you to be logged in (instead of displaying an error) – not sure how I forgot to add it to that page – and second I fixed the database table that needed an update to correctly let a player submit their picks. I need to improve my QA skills.

I need to finish the QA testing on the scoring calculations by simulating the 2016 season, but it’s looking good so far. We’re 36 hours from picks locking and the season kicking off and I’m excited.

Importing NFL statistics into NFLPool

NFLPool has been live for almost two weeks – and hasn’t crashed (yet!) After the rush to get the site up and allow a user to make their picks before I left on vacation, there is one more large chunk of work to get to 1.0 release: calculate the score for all players every week of the NFL season.

I spent all of last week in the middle of Minnesota at a friend’s cabin. It was great to get away from both my day job and NFLPool, but I was loathe to ruin my daily coding streak on Github:

I did bring my laptop to the cabin, and the first couple of mornings I did some minor work, mostly starting the process to add docstrings to the code to document the different methods. Being a former documentation contributor to GNOME you would think that I would include documentation in both the code and the project, but no. I now understand why some developers don’t include documentation in the rush of getting something built. If you look closely, you’ll see in the second last column on the right of the screenshot I did stop coding for a few days and enjoyed my vacation.

Kelly’s help was invaluable (again) in creating the data model to store the picks. There are three tables used. In my last post, I had mentioned that Kelly had me re-do the way we store a player’s picks. To do this, we created a “pick type” table. This is a reference table for the different kids of picks the players make:

  • Individual Player Stats (passing leader, rushing leader, etc.)
  • Team Stats (Division winners, 2nd place, last place)
  • Wildcard playoff teams
  • Tiebreaker

class PickTypes(SqlAlchemyBase):
tablename = 'PickTypes'
pick_id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True, autoincrement=True)
name = sqlalchemy.Column(sqlalchemy.String)

When a player makes their picks, it’s stored in the PlayerPicks database table:

class PlayerPicks(SqlAlchemyBase):
tablename = 'PlayerPicks'
pick_id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True, autoincrement=True)
user_id = sqlalchemy.Column(sqlalchemy.String, sqlalchemy.ForeignKey('Account.id'))
season = sqlalchemy.Column(sqlalchemy.Integer, index=True)
date_submitted = sqlalchemy.Column(sqlalchemy.DATETIME)
conf_id = sqlalchemy.Column(sqlalchemy.Integer, sqlalchemy.ForeignKey('ConferenceInfo.conf_id'))
division_id = sqlalchemy.Column(sqlalchemy.Integer, sqlalchemy.ForeignKey('DivisionInfo.division_id'))
rank = sqlalchemy.Column(sqlalchemy.Integer)
team_id = sqlalchemy.Column(sqlalchemy.Integer, sqlalchemy.ForeignKey('TeamInfo.team_id'))
multiplier = sqlalchemy.Column(sqlalchemy.Integer, default=1)
player_id = sqlalchemy.Column(sqlalchemy.Integer, sqlalchemy.ForeignKey('ActiveNFLPlayers.player_id'))
pick_type = sqlalchemy.Column(sqlalchemy.Integer, sqlalchemy.ForeignKey('PickTypes.pick_id'))

There are 41 total picks a player makes stored as 41 rows in the database:

The next part I just couldn’t wrap my head around. Kelly recommended I make a table with just four columns:

class WeeklyPlayerResults(SqlAlchemyBase):
tablename = 'WeeklyPlayerResults'
pick_id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True, autoincrement=True, index=True)
season = sqlalchemy.Column(sqlalchemy.Integer)
week = sqlalchemy.Column(sqlalchemy.Integer)
points_earned = sqlalchemy.Column(sqlalchemy.Integer)

It took me a few minutes, but pick_id in this table contains (embeds?) the user_id```, pick_type, season, and the pick data. It made my head hurt.

The other thing she pointed out to me is I did a poor job implementing the data model with some inconsistent column names across tables. For example, the Account table uses id as the column name to store a user’s unique id – this should have been user_id. In other tables I use a generic id to store a primary key. I spent some time yesterday fixing almost all of these, with the exception of Accounts. The id used in Accounts has too many references in the code already, but where I could I changed id to primary_key and some other changes to make the column names more explicit. I then had to update the database in production, which always makes me extremely nervous. But if I’m going to do it, now is the time when I only have a handful of users and can touch all of the tables outside of Account in production.

But anyway, it was time to start working on the ability import statistics every week of the NFL season from MySportsFeeds. Here is where things get ugly. Calculating a player’s score weekly for 17 weeks over the course of the season is going to take a lot of testing. My goal is to use the data from last year’s players in 2016 to compare the app’s scoring mechanism to what I did by hand. If I do it right, they should match (unless I made mistakes last year). But MySportsFeeds changed the API from version 1.0 in 2016 to 1.1 in 2017 and the URL to pull the stats differs. I had hoped to start a new branch called 2016 in git to do testing, but I then lost a day when I screwed up my branches trying to merge between master, 2016 and pyramid (where I do all of the development work before merging to master, which I treat as production). So I’m doing it all in my development branch, pyramid, which results in a big if / else statement:

if season == 2016:
response = requests.get('https://api.mysportsfeeds.com/v1.1/pull/nfl/2016-2017-'
auth=HTTPBasicAuth(secret.msf_username, secret.msf_pw))
response = requests.get('https://api.mysportsfeeds.com/v1.1/pull/nfl/' + str(season) +
auth=HTTPBasicAuth(secret.msf_username, secret.msf_pw))

After the season is over, I’ll probably remove all 2016 references like this, but for now, I need them.

I had hoped to make one API call to MySportsFeeds in weekly_msf_data.py for all five of the individual NFL player stats (rushing, passing, etc) but to save a little database space, I wrote five methods for each category. That was easy enough: it pulled the right data and saved it to the database.

But doing the team stats (division winners, points for leaders by conference, etc.) was not nearly as easy. The division_team_standings.json from MySportsFeeds is a dictionary containing two lists. The first list is for NFL conference (AFC and NFC) and each conference has a list inside it for the four divisions. There has to be a more Pythonic way of iterating over this than what I did. (Maybe using a generator?) But I just don’t have the Python knowledge yet. I have it working, but I do this four times, once for each division:

while x < len(team_data):
rank = (team_data[x]["teamentry"][1]["rank"])
team_id = (team_data[x]["teamentry"][1]["team"]["ID"])

        x += 1

Using x, y, z, and athis returns 8 results four times, which includes both conferences. Ugly, but it works.

But when I added this to the WeeklyTeamStats table, I was creating 96 rows – the 32 NFL teams three times. I only wanted 32 rows – and it took me three hours of reading through the SQLAlchemy documentation and examples on StackOverflow to figure out how to do an update to the table instead of an insert. That’s one way to learn.

Now that I can store NFL data, next steps include:

  • Add picks for the 8 players from the 2016 season to test results
  • Then add a method to iterate through those picks and figure out which picks a player made are unique and assign it a multiplier of two (for the unique bonus)
  • Lastly, and most importantly, then calculate each player’s score based on the real results. And figure out how to display them.

I have two weeks before the first week of the 2017 NFL season when NFLPool players will want to see their results. No problem. I think.

NFLPool Progress: 8/1/2017

I took the week off from work to see how much progress I could make on NFLPool this week (and to get some stuff done around the house, but really, for NFLPool). I told myself I’d blog my progress every day and here it is Wednesday already.

In my last blog post, I noted how I was going to have to move back to SQLite for the database. Over the weekend I ripped out most of the MongoDB code and started laying the groundwork for SQLite. The first thing I did Monday morning was going back through the SQLAlchemy chapter of Python for Entrepreneurs to learn how to properly set up classes in Pyramid using SQLAlchemy to create all the database tables. Back when I was building the first prototype for NFLPool, I had written a Python program that set up used the sqlite module and created all of the tables and then populated the database with the NFL teams information (Team ID, name of the team, city, etc.) and also populated the Active Players table with a list of all players in the NFL. This was all done using the SQL language – which I do not enjoy at all.

I won’t pretend that I still understand object oriented programming, especially the use of classes in Python. I’ve come a long way and I understand the concept of how a class creates an instance of the object, but putting it into practice is still a challenge. Using the Python for Entrepreneurs course examples, I was able to take my data model, write the SQLAlchemy classes, and get the database created.

One of the nice things about the course is that it does touch on how to create an index in a table and making foreign keys to join two tables together, which I’ll need. If I don’t ever migrate to MongoDB, I may want to just use MySQL, which technically, I believe, doesn’t support foreign keys. So I’m not sure how portable this code will be, but like everything else I’m doing, Future Paul can deal with that.

Monday afternoon was all about re-learning the Pyramid concepts of routes, abstracting the database in the viewmodel, and then writing the code in the controller. That took longer than I thought it would. The concepts are originally taught very early in the course, but then you go a long time without touching that part of the code. After all of the hands-on work, I think I’ve got a handle on it. I do like how Mr. Kennedy has you separate the viewmodel from the controller – he makes a point in the training you don’t have to do it – but if you don’t, I don’t see how troubleshooting your code could be easy.

That night, I took the kids to the community center where they swim while I sit and watch them in the viewing area with my laptop and use the free WiFi. I pulled my code down from Github and then swore in my head for an hour when Pyramid would error out and wouldn’t create the database. Which it just should – when Pyramid starts, it looks and if the database doesn’t exist, it looks at the classes I’ve written in SQLAlchemy in my /data directory and creates the database and those tables from the classes. But it wouldn’t. It wasn’t until I was back home later that night that I had an epiphany, created the /db directory manually where the database should live, re-started Pyramid, and voila, my database was there. Now I don’t have to worry about not being portable and can code both at home and on the go.

Tuesday was all about what happens when you first install and setup NFLPool. Since I’m using SQLite, I want to make it easy for me and other to set it up. The easiest thing to do with SQLite is when you need to make changes to the database during development to just delete it and have the app re-create it. Since I’m constantly developing it, I just wanted it easy to re-create and in the case my code is ever used by anyone (it’s on Github, licensed under the GPLv3), it’s better to do this now than try to add this functionality later. (Why GPLv3 and not MIT? My code is so bad being so new to programming, that I want any changes to the source available so I can look at it and learn. Though I’ll probably make it MIT at some point because no one is ever going to use my code, to be honest. Though I could digress on a business model I have brewing…)

Well, the first step in NFLPool is to set up the tables with the team information, which is static – it’s just the team name, the abbreviation, the city, and the division and conference they play in. So I thought I’d make an admin webpage where the admin goes, presses a button, and that database table is populated. It would then re-direct to the next page to create a new season. You enter “2017” for example, and it would reach out to MySportsFeeds and pull all active players in the NFL. Now you’re ready for the players in NFLPool to make their picks.

Sounds easy, and then the next thing I knew it was 7pm and I was asking my wife for help. Using the original prototype, which had working Python code to populate the team info table and the active players, I couldn’t get the team info to populate. I thought this tweet summed it up pretty well:

Turns out I had a return statement at the end of my loop that shouldn’t be there. Took her five minutes – sometimes you just need another set of eyes. The loop itself is terrible code, but it’s working.

The other problem I had was with the template and styling of the Admin page to create the first table. The Chameleon template engine kept crashing on me in my browser – again, she looked at it, pointed out I was missing two divs to close it, and that was working. Coding is fun, I keep telling myself.

After the table populates, which works now, I still don’t have the re-direct working correctly either. I can’t get it to re-direct to a new page – I’m pretty sure it’s how I have the POST and the routes set up in the admin controller. I can get it to redirect back to the same page though after it populates the first table. I need to keep pushing forward as vacation is almost half over, so there’s another thing on the TODO list for Future Paul.

Today is about doing some code cleanup. This includes adding an admin or settings table in the database to store the current season. From there, I can set up a baseurl function that connects to MySportsFeeds and using the season variable above to create the URL needed to get the JSON data needed for that entire season. This way, at the beginning of each season, you don’t have to go in and manually change all this code. I’m hopeful to get this done this morning, but will probably get stuck, but who knows. From there, I have two major pieces to get done:

  1. Start the code to allow players to make their season picks. From the Active Players table, query the database and get a list of all players in a certain position. Then create a form or Javascript to store those in a drop down box. I have no idea how to make a drop down box.

  2. Account creation / login / authentication. The course has a great chapter on this, I’m not too worried about it.

If I can get these done, then players can create their accounts and start making their picks (though I have to store those picks in the database, but I’m feeling confident). In theory, I could launch next week, which is tight. And then the hard part starts: writing all the code to calculate the scores for each player.

Until then, I’ll just keep telling myself this:

Back to SQLite for NFLPool

I’ve had to throw in the towel on MongoDB and move back to SQLite for the datastore. For now.

I was successful in being able to call an API, take the JSON object from the API, and store that JSON in MongoDB as a collection. But I what really wanted was to store that JSON object as an EmbeddedDocument within a collection.

My original goal was to stick JSON objects into MongoDB and then query against that. I envisioned two collections, one for Users (registration) and the second for the NFL data, called Stats. Stats would have looked like:



—2016 (and one document this fall for 2017, one per year moving forward)

—Player Picks (Each User’s picks for the 2016 season)

–Week (1 – 17, 17 different Documents, each embedded with:)

—Individual Player Stats

—Division Standings, etc

I read dozens of articles on StackOverflow, Reddit, and others. I tried to code it as an EmbeddedDocument, DynamicFields, and more. Everything ended up as an error. I needed to put the JSON object as an embedded document in each Week document above. But I couldn’t figure out.

I’m out of time. I’ve procrastinated enough and I really want to get a prototype and / or minimum viable product up in the next two weeks. There are two things that are a must:

  1. User registration and login system
  2. User pick submission

The first one shouldn’t be a problem as how to build a secure user registration system with passlib is included in the Python for Entrepreneurs course. The second requirements shouldn’t be that hard, but every time I think that, I’m proven wrong.

When I was using SQLite previously, I had all the code written to create the tables and do the initial import of the data needed. I’ll need to re-use some of that code and port the rest from the pure SQL statements I was using to SQLAlchemy.

I was also stuck on what the weekly results table should look like. My wife had some thoughts on it that thoroughly confused me, but when we chatted about it yesterday, neither of us could remember why she was recommending the design she did. I gave it some more thought and I think I have a database model that will work. I’ve written all the class files I need for SQLAlchemy and have taken the week off work to see how much progress I can make.

The big think I need to do is figure out the Javascript within Pyramid and the Chameleon template language to allow a user to make their picks. I have a table that stores all the active players in the NFL with columns for their first name, last name, player ID, team ID and position. I will need to create drop down boxes allowing them to choose, for example, which quarterback they think will lead the league in passing. That part of I have no idea how to do…. yet.

But one thing at a time.

Importing Team Data into NFLPool

Last weekend I discovered how to pretty print the five JSON files I get from MySportsFeeds. This was helpful to understand just how much data is nested within each file. I also spent a good chunk of the weekend writing in a notebook. I mostly did some data modeling on what each table in the database should store and what their primary keys would be. I also captured things I need to research and started breaking the project into chunks. As I tweeted out over the weekend:

Monday was a holiday so I did the first four courses of Python Jumpstart. I took a break and went back to the JSON files I had worked with. My goal was to build with what should be the easiest table and pull the team data out. This is a dictionary that includes the team name (Texans), city (Houston), abbreviation (HOU) and id (64). The ID number is supplied in the JSON feed and is unique, so I will use that as the primary key. There will be two more columns in the table for conference and division, but I wanted to deal with that later.

I wrote a for loop to try and pull out each team’s information. I quickly got stuck and nothing was working. At one point, the loop I had written worked, but only pulled out the data for the first ranked team. I showed my wife my code and she pointed out that it wasn’t iterating in a loop.

I was stuck for two nights working on this after dinner. I finally stepped back and modified my pretty print Python program and started breaking down all of the information in the JSON file again. I figured out what was a list and what was a dictionary and what was nested where. (It looks like I didn’t commit this to the git repo, oops! Will have to fix that.)

After doing this last night, I found the list I needed to work with. I then re-wrote my for loop and I was able to iterate through all 16 teams in the AFC:

for afc_team_list in teamlist:

afc_team_name = data["conferenceteamstandings"]["conference"][0]["teamentry"][x]["team"]["Name"]

afc_team_city = data["conferenceteamstandings"]["conference"][0]["teamentry"][x]["team"]["City"]

afc_team_id = data["conferenceteamstandings"]["conference"][0]["teamentry"][x]["team"]["ID"]

afc_team_abbr = data["conferenceteamstandings"]["conference"][0]["teamentry"][x]["team"]["Abbreviation"]


x = x + 1

I then copied and pasted and did it again for the NFC. I did try, unsuccessfully, to modify the conference list – “conference”0 – so I could just write one for loop instead of one for each of the two conferences. But it was working, so I’ll leave it for now. (I’m sure my code is ugly, but hey, I’m just starting).

After that it was all about writing the SQL insert statements to put this into a SQLite3 database. (For now, later it will go into MySQL). That took me a an hour, but at the end, I got it working and was even able to add the conference name to each row.

Next up, I need to take the data in the Division standings JSON file. In it is stored the division name for each division in a conference: AFC/AFC-East. I’ll need to write a for loop to grab it, slice it to remove the “AFC/“ and then stick that in the Division field for each team in the Teams table. I’ll also need to stop dropping and re-creating the table each time I insert data, but it’s working.