Menu Note


Use the menu below if you'd like to search for posts that relate to your interests. Note - this was just created on 12-30-20 so I will need to link the posts in the coming weeks. Until then, you can scroll down to the labels on the right to find the same information.


Sunday, April 5, 2015

Linking Kellogg's Players to Lahman Database

I added a Kellogg's table to the Sean Lahman database (http://seanlahman.com/) that I've used in the past.  It's a great database that I recommend for anyone who is interested in baseball and learning structured query language (SQL).  I use Microsoft Access and SequelPro to analyze the baseball data.

How did I add a link to the Kellogg's cards?  I created a spreadsheet that contained every checklist from Kellogg's.  Then I added the 9-digit playerid that matches Lahman's playerid.  I did this automatically in Excel by using the left and concatenate functions.

The playerid that I generated wasn't going to be 100% accurate since the names on the existing database and the names Kellogg's used won't always match.  I ran a query to see which playerids that I created were not found on the database.  For those playerids I edited my spreadsheet to match the Lahman database.

For some players I had to edit the playerid to distinguish them from others with similar names or to match the first name used on the Lahman database.  I was surprised to find out that there were two players named Ross Grimsley.  Some players have first names on the database that surprised me.  Jim Hunter is listed as Catfish and Joe Jackson is listed as Shoeless Joe.

Now I can link the players to their statistics, demographics, teams, etc.  This will help me with analyzing player selected and not selected in the Kellogg's sets of the past.

This would be great to do for Topps cards but that would be a big undertaking.  I may contact Sean to see if this already exists.




No comments:

Post a Comment