Svetozar Korytiak

Quandl is The New, Better, Yahoo Finance

December 16, 2017

Getting financial data these days is not as hard as it perhaps used to be. If you like to play around with it and see if you can do anything useful with it, you may use what I`m gonna give you here. I`m gonna show you simple way of getting S&P 500 data by scraping wikipedia for the list of symbols, and using Quandl API to fetch the data.

A bit of history

This used to be very easy, we had Yahoo finance and scripts to download the test data to your db were on every corner. Those days are gone, since Yahoo link got broken and it doesn`t seem that it is coming back any time soon. Quandl, however, is great replacement, that is getting better every month it seems. They support Python, which is really easy to use. But not too easy, of course, there are some subtleties of how to get that data. To overcome those I wrote this little script. Hope you find it helpful:

#!/usr/bin/env python3


import time
from sqlalchemy import create_engine
import quandl
import bs4
import requests


"""Imports the list of symbols from wikipedia, and appends 'WIKI/', to
make them readable by Quandl API."""

response = requests.get(
    "http://en.wikipedia.org/wiki/List_of_S%26P_500_companies"
)
soup = bs4.BeautifulSoup(response.text)
symbolslist = soup.select('table')[0].select('tr')[1:]
symbols = []
for i, symbol in enumerate(symbolslist):
    tds = symbol.select('td')
    symbols.append('WIKI/' + tds[0].select('a')[0].text)


engine = create_engine(
    'postgresql://postgres:username@localhost:5432/databasename'
)
quandl.ApiConfig.api_key = "Your Quandl key here!"
for s in symbols:
    s = s.replace('.', '_')
    if s[-1] == '_':
        s = s[:-1]
    while True:
        try:
            daily_data = quandl.get(
                s, collapse="daily", start_date="2000-1-1"
                )
            break
        except:
            print(str(s) + ' attempting again...')
            time.sleep(1)

    daily_data.to_sql(name=s, con=engine, if_exists='replace', index=True)
    print(s + ' done...')

This gets data for each symbol for S&P index from wikipedia using quandl and downloads them into postgresql database. Important to note here, is that you need quandl api key (which is free). Before we send the request, we need to replace . by _ and remove last underscore, to make it readable for quandl API. Sometimes Quandl API crashes or sends the error response, so for it not to stop our whole script, we sometimes need to repeat the requests until we get the data. Be aware, this is not the best solution, it works for me, but you may want to look into it and change up the code in the while loop.