How To Retrieve ETF Data Using Python

Introduction
The main problem of an investor is identifying profitable trading strategies. To solve this, he uses fundamental and quantitative techniques to pick winning trading systems. The next problem concerning the investor is how to optimally allocate his capital to different winning strategies. A solution is to use the Kelly formula to calculate the amount of capital he will deploy to securities or trading systems before him.
In this article, I will source Vanguard ETFs list from the their website using Python package Selenium. I will use the list to pull historical Open-High-Low-Close-Volume (OHLCV) data using Python package pandas-datareader. The ETF data collected in this article will be used in the next article to present a solution to the investor’s capital allocation problem using the Kelly formula.
Exchange Traded Fund (ETF)
ETFs are traded on major stock exchanges like the New York Stock Exchange and Nasdaq. You can buy and sell them using online brokers such as TD Ameritrade or Interactive Brokers. An ETF is a composed of tens, hundreds or sometimes even thousands of stocks (or bonds) in a single fund.
They have grown in popularity due to the following:
- Low costs: ETFs have an average expense ratio of 0.27% per year.
- Diversification: They provide a diversified basket of stocks instead of buying a single stock.
- Alternative themes: ETFs enable investors to gain exposure into niche industries like water and solar tech firms.
Vanguard
In this article, I will be using ETFs managed by Vanguard and present a brief overview of the firm. It was established in 1975, and offers both actively and passively managed funds. Vanguard’s founder is Jack Bogle, the father of index investing. As of January 2019, it has USD 5.2 trillion of AUM.
Vanguard ETFs
Vanguard ETFs are grouped by:
- US Bond ETFs
- US Stock ETFs
- International Bond ETFs
- International Stock ETFs
- Sector ETFs
Step By Step
- Source Vanguard ETFs list of tickers.
- Source Vanguard ETFs historical OHLCV data.
You can find the code on https://github.com/DinodC/investing-etf-kelly.
Source Vanguard ETFs List Of Tickers
In this section, we retrieve tickers of every ETF on Vanguard’s website. We use Selenium, a web browser automation tool, to help us retrieve the ETF tickers. We propose two approaches, direct and indirect, of sourcing ETF ticker data from website.
Selenium
Selenium automates web browsers. Selenium was conceived for automation of web application testing. It allows you to open a browser and automate actions like clicking buttons and completing forms.
I propose two ways of using Selenium scrape of ETF ticker data:
- An automated approach: We automate navigation of Vanguard’s web site by going from the home page to the ETF list page containing the tickers.
- A non-automated approach: We go directly to the ETF list page containing the tickers.
Import packages
import pandas as pd
from pandas import Series, DataFrame
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
import pandas_datareader as web
import pickle
import matplotlib.pyplot as plt
%matplotlib inline
Automated Approrach
Code for scraping ETF tickers from Vanguard’s home page using the first approach
# Option
option = webdriver.ChromeOptions()
option.add_argument('--incognito')
# Create a Chrome webdriver
driver = webdriver.Chrome('/Applications/chromedriver', options=option)
# Get to the Vanguard home page
driver.get('https://investor.vanguard.com/home/')
# Set delay in seconds
delay = 10
# Head to the next page by clicking on 'Investing'
try:
WebDriverWait(driver, delay).until(EC.presence_of_element_located((By.LINK_TEXT, 'Investing')))
driver.find_element_by_link_text('Investing').click()
except TimeoutException:
print("Timed out waiting for page to load.")
driver.quit()
# Head to the next page by clicking on 'Vanguard ETFs'
try:
WebDriverWait(driver, delay).until(EC.presence_of_element_located((By.LINK_TEXT, 'Vanguard ETFs')))
driver.find_element_by_link_text('Vanguard ETFs').click()
except TimeoutException:
print("Timed out waiting for page to load.")
driver.quit()
# Head to the next page by clicking on 'Browse a list of Vanguard ETFs'
try:
WebDriverWait(driver, delay).until(EC.presence_of_element_located((By.LINK_TEXT, 'Browse a list of Vanguard ETFs')))
driver.find_element_by_partial_link_text('Browse a list of Vanguard ETFs').click()
except TimeoutException:
print("Timed out waiting for page to load.")
driver.quit()
# Source the Vanguard ETF names and tickers
try:
WebDriverWait(driver, delay).until(EC.presence_of_element_located((By.CLASS_NAME, 'productEntry')))
content = driver.find_elements_by_class_name('productEntry')
except TimeoutException:
print("Timed out waiting for page to load.")
driver.quit()
Non-automated Approach
Code for scraping ETF tickers from Vanguard’s ETF list page using the second approach
# Option
option = webdriver.ChromeOptions()
option.add_argument('--incognito')
# Create a Chrome webdriver
driver = webdriver.Chrome('/Applications/chromedriver', options=option)
# Get to the Vanguard ETF list page
driver.get('https://investor.vanguard.com/etf/list#/etf/asset-class/month-end-returns')
# Set delay in seconds
delay = 10
try:
WebDriverWait(driver, delay).until(EC.presence_of_element_located((By.CLASS_NAME, 'productEntry')))
# Source the Vanguard ETF names and tickers
content = driver.find_elements_by_class_name('productEntry')
except TimeoutException:
print("Timed out waiting for page to load.")
driver.quit()
Vanguard ETFs
Inspect the content collected
len(content)
118
for i in content:
print(i.text)
Extended Duration Treasury ETF
Vanguard Select Funds
Intermediate-Term Bond ETF
Intermediate-Term Treasury ETF
Vanguard Select Funds
Long-Term Bond ETF
Long-Term Treasury ETF
Mortgage-Backed Securities ETF
Vanguard Select Funds
Short-Term Bond ETF
Vanguard Select Funds
Short-Term Inflation-Protected Securities ETF
Short-Term Treasury ETF
Vanguard Select Funds
Total Bond Market ETF
Intermediate-Term Corporate Bond ETF
Long-Term Corporate Bond ETF
Short-Term Corporate Bond ETF
Total Corporate Bond ETF
Vanguard Select Funds
Tax-Exempt Bond ETF
Dividend Appreciation ETF
ESG U.S. Stock ETF
Growth ETF
High Dividend Yield ETF
Large-Cap ETF
Mega Cap ETF
Mega Cap Growth ETF
Mega Cap Value ETF
Vanguard Select Funds
S&P 500 ETF
Vanguard Select Funds
Total Stock Market ETF
Value ETF
Vanguard Select Funds
Extended Market ETF
Mid-Cap ETF
Mid-Cap Growth ETF
Mid-Cap Value ETF
Small-Cap ETF
Small-Cap Growth ETF
Small-Cap Value ETF
Total World Bond ETF
Vanguard Select Funds
Total International Bond ETF
Emerging Markets Government Bond ETF
Total World Stock ETF
ESG International Stock ETF
FTSE All-World ex-US ETF
FTSE All-World ex-US Small-Cap ETF
Vanguard Select Funds
FTSE Developed Markets ETF
FTSE Europe ETF
FTSE Pacific ETF
Global ex-U.S. Real Estate ETF
International Dividend Appreciation ETF
International High Dividend Yield ETF
Vanguard Select Funds
Total International Stock ETF
Vanguard Select Funds
FTSE Emerging Markets ETF
Communication Services ETF
Consumer Discretionary ETF
Consumer Staples ETF
Energy ETF
Financials ETF
Health Care ETF
Industrials ETF
Information Technology ETF
Materials ETF
Real Estate ETF
Utilities ETF
EDV Bond - Long-term Government 0.07% $126.45 $0.43
increased
0.34%
increased
2.64%A
30 day
6/13/2019 12.29% 14.61% 7.58% 8.82% 8.04% (12/06/2007)
BIV Bond - Inter-term Investment 0.07% $85.76 $0.01
increased
0.01%
increased
2.67%A
30 day
6/13/2019 6.80% 8.34% 3.13% 4.97% 5.07% (04/03/2007)
VGIT Bond - Inter-term Government 0.07% $65.62 -$0.04
decreased
-0.06%
decreased
2.06%A
30 day
6/13/2019 4.40% 6.75% 2.21% — 2.95% (11/19/2009)
BLV Bond - Long-term Treasury 0.07% $95.15 $0.12
increased
0.12%
increased
3.53%A
30 day
6/13/2019 10.52% 10.28% 5.13% 7.55% 6.86% (04/03/2007)
VGLT Bond - Long-term Government 0.07% $80.92 $0.17
increased
0.20%
increased
2.60%A
30 day
6/13/2019 9.57% 11.83% 5.40% — 6.32% (11/19/2009)
VMBS Bond - Inter-term Government 0.07% $52.55 -$0.01
decreased
-0.02%
decreased
2.83%J
30 day
6/13/2019 3.34% 5.16% 2.29% — 2.80% (11/19/2009)
BSV Bond - Short-term Investment 0.07% $80.25 -$0.04
decreased
-0.05%
decreased
2.26%A
30 day
6/13/2019 3.11% 4.59% 1.58% 2.02% 2.77% (04/03/2007)
VTIP Bond - Short-term Government 0.06% $48.99 -$0.24
decreased
-0.49%
decreased
0.44%E,G
30 day
6/13/2019 2.59% 2.76% 0.79% — 0.66% (10/12/2012)
VGSH Bond - Short-term Government 0.07% $60.78 -$0.02
decreased
-0.03%
decreased
2.08%A
30 day
6/13/2019 2.11% 3.45% 1.05% — 0.99% (11/19/2009)
BND Bond - Inter-term Investment 0.035% $82.39 $0.00
0.00%
2.69%A
30 day
6/13/2019 5.26% 6.56% 2.65% 3.68% 4.08% (04/03/2007)
VCIT Bond - Inter-term Investment 0.07% $88.55 -$0.02
decreased
-0.02%
decreased
3.41%A
30 day
6/13/2019 8.50% 8.76% 3.73% — 5.46% (11/19/2009)
VCLT Bond - Long-term Investment 0.07% $94.06 $0.11
increased
0.12%
increased
4.26%A
30 day
6/13/2019 12.35% 9.65% 5.06% — 7.16% (11/19/2009)
VCSH Bond - Short-term Investment 0.07% $80.25 $0.01
increased
0.01%
increased
2.80%A
30 day
6/13/2019 4.20% 5.28% 2.21% — 2.98% (11/19/2009)
VTC Bond - Inter-term Investment 0.07% $85.27 $0.04
increased
0.05%
increased
3.25%B
30 day
6/13/2019 8.17% 7.59% — — 3.22% (11/07/2017)
VTEB Bond - Long-term National Muni 0.08% $52.93 $0.03
increased
0.05%
increased
2.01%A
30 day
6/13/2019 4.70% 6.31% — — 3.51% (08/21/2015)
VIG Stock - Large-Cap Blend 0.06% $113.47 -$0.92
decreased
-0.80%
decreased
1.97%B
30 day
5/31/2019 16.96% 8.66% 9.32% 12.97% 8.29% (04/21/2006)
ESGV Stock - Large-Cap Blend 0.12% $50.43 $0.04
increased
0.07%
increased
1.74%B
30 day
5/31/2019 17.65% — — — -3.52% (09/18/2018)
VUG Stock - Large-Cap Growth 0.04% $161.29 $0.53
increased
0.33%
increased
1.23%B
30 day
5/31/2019 20.53% 4.63% 11.06% 15.07% 8.81% (01/26/2004)
VYM Stock - Large-Cap Value 0.06% $85.97 -$0.85
decreased
-0.97%
decreased
3.50%B
30 day
5/31/2019 11.98% 1.63% 8.03% 13.36% 7.28% (11/10/2006)
VV Stock - Large-Cap Blend 0.04% $132.90 $0.08
increased
0.06%
increased
2.01%B
30 day
5/31/2019 16.48% 3.75% 9.55% 13.92% 8.32% (01/27/2004)
MGC Stock - Large-Cap Blend 0.07% $100.06 $0.08
increased
0.08%
increased
2.06%B
30 day
5/31/2019 15.88% 4.23% 9.86% 13.81% 8.10% (12/17/2007)
MGK Stock - Large-Cap Growth 0.07% $127.98 $0.55
increased
0.43%
increased
1.34%B
30 day
5/31/2019 19.82% 3.98% 11.45% 15.15% 9.69% (12/17/2007)
MGV Stock - Large-Cap Value 0.07% $79.52 -$0.22
decreased
-0.27%
decreased
2.82%B
30 day
5/31/2019 12.28% 4.03% 8.42% 12.60% 6.59% (12/17/2007)
VOO Stock - Large-Cap Blend 0.03% $265.73 $0.10
increased
0.04%
increased
2.07%B
30 day
5/31/2019 16.36% 3.75% 9.63% — 13.49% (09/07/2010)
VTI Stock - Large-Cap Blend 0.03% $147.19 -$0.43
decreased
-0.29%
decreased
1.97%B
30 day
5/31/2019 16.44% 2.57% 9.25% 13.98% 6.88% (05/24/2001)
VTV Stock - Large-Cap Value 0.04% $109.43 -$0.28
decreased
-0.26%
decreased
2.83%B
30 day
5/31/2019 12.62% 2.76% 8.18% 12.85% 7.66% (01/26/2004)
VXF Stock - Mid-Cap Blend 0.07% $116.17 $0.18
increased
0.16%
increased
1.46%B
30 day
5/31/2019 17.07% -3.64% 7.33% 14.00% 9.09% (12/27/2001)
VO Stock - Mid-Cap Blend 0.04% $164.20 -$0.20
decreased
-0.12%
decreased
1.62%B
30 day
5/31/2019 19.36% 1.63% 7.98% 14.37% 9.18% (01/26/2004)
VOT Stock - Mid-Cap Growth 0.07% $146.99 -$0.04
decreased
-0.02%
decreased
0.85%B
30 day
5/31/2019 23.15% 6.34% 9.48% 14.64% 8.88% (08/17/2006)
VOE Stock - Mid-Cap Value 0.07% $109.31 -$0.25
decreased
-0.23%
decreased
2.40%B
30 day
5/31/2019 15.37% -3.12% 6.43% 13.95% 7.86% (08/17/2006)
VB Stock - Small-Cap Blend 0.05% $153.73 $0.62
increased
0.40%
increased
1.65%B
30 day
5/31/2019 16.94% -3.78% 7.30% 14.15% 8.71% (01/26/2004)
VBK Stock - Small-Cap Growth 0.07% $182.81 $1.66
increased
0.91%
increased
0.80%B
30 day
5/31/2019 21.68% 0.63% 8.67% 14.89% 9.09% (01/26/2004)
VBR Stock - Small-Cap Value 0.07% $128.12 -$0.16
decreased
-0.12%
decreased
2.36%B
30 day
5/31/2019 12.87% -7.52% 6.11% 13.33% 8.18% (01/26/2004)
BNDW International 0.09% $78.18 $0.01
increased
0.01%
increased
1.89%B
30 day
6/13/2019 5.38% — — — 6.12% (09/04/2018)
BNDX International 0.09% $56.91 $0.00
0.00%
0.74%A
30 day
6/13/2019 5.41% 6.89% 4.09% — 4.05% (05/31/2013)
VWOB International 0.30% $79.00 $0.05
increased
0.06%
increased
4.95%A
30 day
6/13/2019 8.08% 7.35% 3.97% — 4.26% (05/31/2013)
VT International 0.09% $73.38 -$0.51
decreased
-0.69%
decreased
— 13.45% -1.93% 5.45% 9.54% 5.55% (06/24/2008)
VSGX International 0.15% $48.92 $0.07
increased
0.14%
increased
— 10.17% — — — -4.05% (09/18/2018)
VEU International 0.09% $49.35 -$0.56
decreased
-1.12%
decreased
— 9.94% -6.26% 1.59% 5.90% 2.47% (03/02/2007)
VSS International 0.12% $102.05 -$0.94
decreased
-0.91%
decreased
— 8.72% -13.13% 1.15% 7.15% 9.61% (04/02/2009)
VEA International 0.05% $40.45 -$0.44
decreased
-1.08%
decreased
— 10.72% -7.04% 1.57% 6.33% 1.08% (07/20/2007)
VGK International 0.09% $53.21 -$1.06
decreased
-1.95%
decreased
— 12.08% -6.02% 0.26% 6.32% 4.01% (03/04/2005)
VPL International 0.09% $64.26 -$0.41
decreased
-0.63%
decreased
— 7.08% -9.98% 3.47% 6.35% 4.27% (03/04/2005)
VNQI International 0.12% $57.99 -$0.37
decreased
-0.63%
decreased
— 11.51% -0.60% 4.09% — 5.61% (11/01/2010)
VIGI International 0.25% $65.79 -$0.43
decreased
-0.65%
decreased
— 15.25% -0.63% — — 9.75% (02/25/2016)
VYMI International 0.32% $59.46 -$1.03
decreased
-1.69%
decreased
— 8.32% -4.25% — — 8.75% (02/25/2016)
VXUS International 0.09% $51.65 $0.05
increased
0.10%
increased
— 9.83% -6.98% 1.54% — 3.03% (01/26/2011)
VWO International 0.12% $40.65 -$0.13
decreased
-0.32%
decreased
— 7.52% -6.47% 1.91% 4.72% 5.92% (03/04/2005)
VOX Stock - Sector 0.10% $86.19 $0.86
increased
1.00%
increased
1.07%B
30 day
5/31/2019 16.59% 2.94% 2.17% 8.89% 6.65% (09/23/2004)
VCR Stock - Sector 0.10% $178.29 $0.19
increased
0.11%
increased
1.33%B
30 day
5/31/2019 18.79% 2.31% 10.96% 18.11% 9.44% (01/26/2004)
VDC Stock - Sector 0.10% $150.75 -$0.45
decreased
-0.29%
decreased
2.77%B
30 day
5/31/2019 15.55% 13.12% 6.90% 12.32% 9.32% (01/26/2004)
VDE Stock - Sector 0.10% $81.86 $0.74
increased
0.91%
increased
3.67%B
30 day
5/31/2019 6.96% -22.79% -7.86% 2.75% 5.18% (09/23/2004)
VFH Stock - Sector 0.10% $67.57 -$0.66
decreased
-0.96%
decreased
2.35%B
30 day
5/31/2019 14.50% -3.53% 9.64% 12.42% 3.81% (01/26/2004)
VHT Stock - Sector 0.10% $172.04 $1.29
increased
0.75%
increased
1.53%B
30 day
5/31/2019 7.89% 5.71% 10.06% 15.57% 9.43% (01/26/2004)
VIS Stock - Sector 0.10% $141.22 -$0.56
decreased
-0.39%
decreased
1.73%B
30 day
5/31/2019 17.90% -1.58% 7.54% 14.44% 8.93% (09/23/2004)
VGT Stock - Sector 0.10% $205.62 $0.25
increased
0.12%
increased
1.34%B
30 day
5/31/2019 23.68% 7.61% 17.18% 18.10% 10.10% (01/26/2004)
VAW Stock - Sector 0.10% $124.76 -$1.01
decreased
-0.80%
decreased
2.13%B
30 day
5/31/2019 13.13% -11.39% 3.11% 9.92% 7.70% (01/26/2004)
VNQ Stock - Sector 0.12% $90.68 $0.94
increased
1.05%
increased
note 22.43% 14.91% 7.58% 14.92% 8.72% (09/23/2004)
VPU Stock - Sector 0.10% $134.60 -$0.55
decreased
-0.40%
decreased
3.21%B
30 day
5/31/2019 15.09% 17.11% 10.36% 12.69% 10.07% (01/26/2004)
We notice the following structure:
- Elements 0-58 are the names of the ETFs.
- Elements 59-118 are the tickers, asset class, etc. of the ETFs.
Create lists to hold ETF names and tickers
# Initialize the lists
etf_names = []
etf_tickers = []
for i in range(0, 59):
# Get the names
try:
etf_names.append(content[i].text.split('\n')[1])
except IndexError:
etf_names.append(content[i].text)
# Get the tickers
etf_tickers.append(content[i + 59].text.split()[0])
Create a DataFrame mapping an ETF’s name to it’s ticker
etf_map = pd.DataFrame({'Name': etf_names,
'Ticker': etf_tickers})
Vanguard ETFs By Group
Create a DataFrame mapping an ETF’s name to it’s ticker by group.
etf_us_bonds_map = etf_map.iloc[0:15]
etf_us_stocks_map = etf_map.iloc[15:33]
etf_intl_bonds_map = etf_map.iloc[33:36]
etf_intl_stocks_map = etf_map.iloc[36:48]
etf_sectors_map = etf_map.iloc[48:]
Source Vanguard ETFs Historical OHLCV Data
Set the parameters
- Source is set to the Investors Exchange.
- Start and end date are set to span 5 years from today.
source = 'iex'
start = '2014-01-01'
end = '2020-01-01'
Vanguard ETFs
Pull historical OHLCV data from the Investors Exchange
# List of tickers
etf_list = etf_map['Ticker'].tolist()
# Initialize data
etf_data = web.DataReader(etf_list[0], source, start, end)
# Conver index to datetime
etf_data.index = pd.to_datetime(etf_data.index)
for i in range(1, len(etf_list)):
# Pull new data
temp_data = web.DataReader(etf_list[i], source, start, end)
# Update data
etf_data = pd.concat([etf_data, temp_data], axis=1, sort=True)
# Set multi-level columns
etf_data.columns = pd.MultiIndex.from_product([etf_list, temp_data.columns])
Inspect
etf_data.head()
open | high | low | close | volume | |
---|---|---|---|---|---|
date | |||||
2014-06-12 | 85.2790 | 86.4896 | 84.7732 | 86.2077 | 34454 |
2014-06-13 | 85.6853 | 86.9375 | 85.6853 | 86.2824 | 36060 |
2014-06-16 | 86.4233 | 86.7762 | 86.0253 | 86.5809 | 72137 |
2014-06-17 | 86.3404 | 86.3404 | 85.5029 | 85.7336 | 55781 |
2014-06-18 | 85.9258 | 86.6058 | 85.6771 | 86.3902 | 15027 |
5 rows × 295 columns
etf_data.tail()
open | high | low | close | volume | |
---|---|---|---|---|---|
date | |||||
2019-06-05 | 125.13 | 125.5000 | 124.4900 | 124.60 | 301695 |
2019-06-06 | 125.85 | 126.5848 | 124.8075 | 125.26 | 137576 |
2019-06-07 | 126.65 | 126.9993 | 126.1300 | 126.56 | 88292 |
2019-06-10 | 125.42 | 125.4200 | 124.8400 | 125.20 | 153210 |
2019-06-11 | 125.12 | 125.8000 | 125.0900 | 125.48 | 187176 |
5 rows × 295 columns
Vanguard ETFs By Group
Create DataFrame objects containing close prices of ETFs by group.
# Create list of Vanguard ETF groups
groups = ['us bonds',
'us stocks',
'intl bonds',
'intl stocks',
'sectors']
# Create a dictionary of Vanguard ETF maps
maps = {'us bonds': etf_us_bonds_map,
'us stocks': etf_us_stocks_map,
'intl bonds': etf_intl_bonds_map,
'intl stocks': etf_intl_stocks_map,
'sectors': etf_sectors_map}
# Initialize Vanguard ETF close dictionary
close = {'us bonds': pd.DataFrame(),
'us stocks': pd.DataFrame(),
'intl bonds': pd.DataFrame(),
'intl stocks': pd.DataFrame(),
'sectors': pd.DataFrame()}
# Update Vanguard ETF close dictionary
for i in groups:
# Set tickers
tickers = maps[i]['Ticker'].tolist()
res = {}
for j in range(len(tickers)):
res[tickers[j]] = etf_data[tickers[j]].close
# Update close prices
close[i] = pd.DataFrame(res)
Vanguard US Bond ETFs
etf_us_bonds_close = close['us bonds']
Inspect
etf_us_bonds_close.head()
EDV | BIV | VGIT | BLV | VGLT | |
---|---|---|---|---|---|
date | |||||
2014-06-12 | 86.2077 | 72.3372 | 58.3027 | 72.9509 | 61.2526 |
2014-06-13 | 86.2824 | 72.1999 | 58.2387 | 72.9960 | 61.3021 |
2014-06-16 | 86.5809 | 72.2257 | 58.1847 | 73.0862 | 61.3876 |
2014-06-17 | 85.7336 | 71.9426 | 58.0467 | 72.6678 | 61.0080 |
2014-06-18 | 86.3902 | 72.2857 | 58.2103 | 73.0698 | 61.2963 |
etf_us_bonds_close.tail()
EDV | BIV | VGIT | BLV | VGLT | |
---|---|---|---|---|---|
date | |||||
2019-06-05 | 124.60 | 85.47 | 65.50 | 94.15 | 80.02 |
2019-06-06 | 125.26 | 85.40 | 65.44 | 94.50 | 80.27 |
2019-06-07 | 126.56 | 85.68 | 65.61 | 95.18 | 80.92 |
2019-06-10 | 125.20 | 85.40 | 65.40 | 94.63 | 80.20 |
2019-06-11 | 125.48 | 85.39 | 65.36 | 94.57 | 80.29 |
VTC and VTEB close prices are NaNs, we drop these funds.
etf_us_bonds_close.drop(['VTC', 'VTEB'], axis='columns', inplace=True)
Vanguard US Stock ETFs
etf_us_stocks_close = close['us stocks']
Inspect
etf_us_stocks_close.head()
VIG | ESGV | VUG | VYM | VV | |
---|---|---|---|---|---|
date | |||||
2014-06-12 | 69.5502 | NaN | 91.2609 | 56.2715 | 80.5827 |
2014-06-13 | 69.6492 | NaN | 91.4579 | 56.5027 | 80.8184 |
2014-06-16 | 69.6852 | NaN | 91.5235 | 56.6654 | 80.9272 |
2014-06-17 | 69.8382 | NaN | 91.7017 | 56.7596 | 81.1175 |
2014-06-18 | 70.5041 | NaN | 92.4053 | 57.1450 | 81.7520 |
etf_us_stocks_close.tail()
VIG | ESGV | VUG | VYM | VV | |
---|---|---|---|---|---|
date | |||||
2019-06-05 | 112.07 | NaN | 156.30 | 85.44 | 129.98 |
2019-06-06 | 112.72 | NaN | 157.51 | 86.00 | 130.77 |
2019-06-07 | 113.85 | NaN | 159.85 | 86.37 | 132.09 |
2019-06-10 | 114.09 | NaN | 160.82 | 86.55 | 132.76 |
2019-06-11 | 113.77 | NaN | 160.72 | 86.72 | 132.70 |
ESGV close prices are NaNs, we drop this fund.
etf_us_stocks_close.drop('ESGV', axis='columns', inplace=True)
Vanguard International Bond ETFs
etf_intl_bonds_close = close['intl bonds']
Inspect
etf_intl_bonds_close.head()
BNDW | BNDX | VWOB | |
---|---|---|---|
date | |||
2014-06-12 | NaN | 46.3357 | 63.9766 |
2014-06-13 | NaN | 46.3538 | 63.9528 |
2014-06-16 | NaN | 46.3357 | 63.8257 |
2014-06-17 | NaN | 46.2995 | 63.7462 |
2014-06-18 | NaN | 46.3448 | 63.7383 |
etf_intl_bonds_close.tail()
BNDW | BNDX | VWOB | |
---|---|---|---|
date | |||
2019-06-05 | NaN | 56.61 | 78.370 |
2019-06-06 | NaN | 56.67 | 78.640 |
2019-06-07 | NaN | 56.82 | 78.870 |
2019-06-10 | NaN | 56.71 | 78.960 |
2019-06-11 | NaN | 56.73 | 79.075 |
BNDW close prices are NaNs, we drop this fund.
etf_intl_bonds_close.drop('BNDW', axis='columns', inplace=True)
Vanguard International Stock ETFs
etf_intl_stocks_close = close['intl stocks']
Inspect
etf_intl_stocks_close.head()
VT | VSGX | VEU | VSS | VEA | |
---|---|---|---|---|---|
date | |||||
2014-06-12 | 55.1091 | NaN | 45.4997 | 96.7475 | 37.0462 |
2014-06-13 | 55.2604 | NaN | 45.5516 | 96.6253 | 37.0376 |
2014-06-16 | 55.2248 | NaN | 45.5256 | 96.3723 | 37.0635 |
2014-06-17 | 55.3317 | NaN | 45.4217 | 96.1280 | 37.0203 |
2014-06-18 | 55.6521 | NaN | 45.8633 | 96.9656 | 37.3402 |
etf_intl_stocks_close.tail()
VT | VSGX | VEU | VSS | VEA | |
---|---|---|---|---|---|
date | |||||
2019-06-05 | 72.77 | NaN | 49.43 | 102.31 | 40.50 |
2019-06-06 | 73.11 | NaN | 49.61 | 102.40 | 40.67 |
2019-06-07 | 73.85 | NaN | 50.13 | 103.50 | 41.15 |
2019-06-10 | 74.16 | NaN | 50.34 | 103.83 | 41.26 |
2019-06-11 | 74.37 | NaN | 50.68 | 104.21 | 41.46 |
VSGX, VIGI, and VYMI close prices are NaNs, we drop these funds.
etf_intl_stocks_close.drop(['VSGX', 'VIGI', 'VYMI'], axis='columns', inplace=True)
Vanguard Sector ETFs
etf_sectors_close = close['sectors']
Inspect
etf_sectors_close.head()
VOX | VCR | VDC | VDE | VFH | |
---|---|---|---|---|---|
date | |||||
2014-06-12 | 73.2093 | 99.4362 | 101.4614 | 121.4008 | 41.7458 |
2014-06-13 | 73.8089 | 99.5073 | 101.4878 | 122.6118 | 41.7458 |
2014-06-16 | 73.8688 | 99.8709 | 101.8401 | 123.2038 | 41.5917 |
2014-06-17 | 74.1943 | 100.2812 | 101.9634 | 123.0545 | 42.0086 |
2014-06-18 | 74.8623 | 100.9804 | 103.0908 | 123.9834 | 42.2352 |
etf_sectors_close.tail()
VOX | VCR | VDC | VDE | VFH | |
---|---|---|---|---|---|
date | |||||
2019-06-05 | 83.15 | 171.09 | 147.27 | 80.19 | 67.76 |
2019-06-06 | 83.43 | 171.70 | 148.60 | 81.49 | 67.98 |
2019-06-07 | 84.47 | 174.18 | 149.93 | 81.82 | 67.81 |
2019-06-10 | 84.47 | 175.76 | 149.98 | 81.99 | 68.37 |
2019-06-11 | 84.61 | 176.42 | 150.73 | 82.12 | 68.40 |
Plot Of Vanguard ETFs By Group
plt.figure(figsize=[20, 40])
# US Bonds
plt.subplot(5, 1, 1)
for i in etf_us_bonds_close.columns.tolist():
plt.plot(etf_us_bonds_close[i])
plt.title('Vanguard US Bonds ETFs - Close')
plt.legend(etf_us_bonds_close.columns.tolist(), loc=2)
plt.xlim(etf_data.index[0], etf_data.index[-1])
# US Stocks
plt.subplot(5, 1, 2)
for i in etf_us_stocks_close.columns.tolist():
plt.plot(etf_us_stocks_close[i])
plt.title('Vanguard US Stocks ETFs - Close')
plt.legend(etf_us_stocks_close.columns.tolist(), loc=2)
plt.xlim(etf_data.index[0], etf_data.index[-1])
# Intl Bonds
plt.subplot(5, 1, 3)
for i in etf_intl_bonds_close.columns.tolist():
plt.plot(etf_intl_bonds_close[i])
plt.title('Vanguard International Bond ETFs - Close')
plt.legend(etf_intl_bonds_close.columns.tolist(), loc=2)
plt.xlim(etf_data.index[0], etf_data.index[-1])
# Intl Stocks
plt.subplot(5, 1, 4)
for i in etf_intl_stocks_close.columns.tolist():
plt.plot(etf_intl_stocks_close[i])
plt.title('Vanguard International Stock ETFs - Close')
plt.legend(etf_intl_stocks_close.columns.tolist(), loc=2)
plt.xlim(etf_data.index[0], etf_data.index[-1])
# Sectors
plt.subplot(5, 1, 5)
for i in etf_sectors_close.columns.tolist():
plt.plot(etf_sectors_close[i])
plt.title('Vanguard Sector ETFs - Close')
plt.legend(etf_sectors_close.columns.tolist(), loc=2)
plt.xlim(etf_data.index[0], etf_data.index[-1])
(735396.0, 737221.0)

Save The Data
Create a list of output files
output = ['etf_us_bonds.pickle',
'etf_us_stocks.pickle',
'etf_intl_bonds.pickle',
'etf_intl_stocks.pickle',
'etf_sectors.pickle']
Create a list of ETF data
data = [etf_us_bonds_close,
etf_us_stocks_close,
etf_intl_bonds_close,
etf_intl_stocks_close,
etf_sectors_close]
Pickle
for i in range(len(output)):
with open(output[i], 'wb') as f:
pickle.dump(data[i], f)
f.close()
Conclusion
In this article, I showed how to scrape ETFs list from Vanguard web page, and pull historical OHLCV data from IEX. I used Python packages Selenium and pandas-datareader for web scraping and sourcing datasets, respectively. In the next article, I use the data to demonstrate the application of Kelly formula in investing.