Paul Shapiro's Search Wilderness

Calculate Keyword Growth Using Google Trends, Python, and the Slope Formula

During a presentation I gave at Distilled’s SearchLove Boston conference in early May, I advocated that people use the slope formula and Google Trends data to determine if interest keywords have grown over time or if they are slipping away into searcher oblivion.

The only problem with this, is that unfortunately Google doesn’t provide an official API for Google Trends, so we need some Python wizardry to do this in bulk.

Make sure you have the pandas Python library installed. If you’re running Windows, using a Python Distribution like Anaconda will make this whole lot easier.

Next, install the pytrends library using pip:

pip install pytrends

Awesome! You’re almost there.

Next, create a CSV file with a single column named, Keywords (it’s case sensitive). In this column, enter all the keywords for which you would like to know their slope.

Download the following script to the same folder as your csv file with the keywords.

The Python Script:

# Requires pandas. For Windows users, I recommend installing the Anaconda Python distirbution.
# Requires the pytrends library. To install, run "pip install pytrends".
from pytrends.pyGTrends import pyGTrends
import time
import os
from random import randint
import pandas as pd

# Add your Gmail username to the google_username variable and your Gmail password to the google_password variable.
google_username = ""
google_password = ""
connector = pyGTrends(google_username, google_password)

# This script downloads a series of CSV files from Google Trends. Please specify a filepath for where you'd like these files to be stored in the below variable.
path = ""

# Specify the filename of a CSV with a list of keywords in the variable, keyordcsv. The CSV should be one column, with header equal to Keywords (case sensitive).
keywordcsv = "keywords.csv"
keywords = pd.read_csv(keywordcsv)

# Downloads and Calculate Slope:
keywordlist = pd.DataFrame(columns=["keyword","slope"])
for index, row in keywords.iterrows():
    print("Downloading Keyword #" + str(index))
    payload = {'geo': 'US', 'q': [row[0]]}
    connector.request_report(payload)
    time.sleep(randint(5, 10))
    connector.save_csv(path, str(index))
    csvname = str(index)+".csv"
    trenddata = pd.read_csv(csvname, skiprows=4, names=['date', 'values'])
    keyword = trenddata['values'].loc[[0]][0]
    trenddata = trenddata.ix[1:]
    trenddata['keyword'] = keyword
    trenddata.rename(columns={'values': 'trends'}, inplace=True)
    trenddata['trends'] = pd.to_numeric(trenddata['trends'], errors='coerce')
    trenddata['date'] = trenddata['date'].str.extract('(^[0-9]{4}\-[0-9]{2}\-[0-9]{2}) \-.*')
    trenddata = trenddata.dropna()
    trenddata['date'] = pd.to_datetime(trenddata['date'])
    trenddata['year'] = pd.DatetimeIndex(trenddata['date']).year
    trenddata['month'] = pd.DatetimeIndex(trenddata['date']).month
    trenddata['day'] = pd.DatetimeIndex(trenddata['date']).day

    maxyear = trenddata['year'].max()
    grouped = trenddata.groupby(['year']).mean()

    def slope_formula(xone, yone, xtwo, ytwo):
        return (ytwo-yone)/(xtwo-xone)

    maxyear = trenddata['year'].max()
    grouped = trenddata.groupby(['year']).mean()
    slope = slope_formula(1,float(grouped.loc[grouped.index==maxyear-2]['trends']),
                          2,float(grouped.loc[grouped.index==maxyear-1]['trends']))
    keywordlist = keywordlist.append({'keyword':keyword,'slope':slope}, ignore_index=True)
    os.remove(csvname)

# Specify a csv filename to output the slope values.
keywordlist.to_csv("trends_slope.csv", sep=",", encoding="utf-8", index=False)

print("Slope calculation and CSV export complete.")

Add your Google username to the google_username variable and your Google password to the google_password variable.

Run the Python script:

python GoogleTrendsSlopeCalculator.py

It will take some time, depending on the number of keywords you’re examining. For each keyword, the script downloads an export from Google Trends. The files are stored temporarily and then deleted.

What’s the Deal with Slope?

Slope calculates the change in y or the change in x on a graph, or rise over run. In our case, we are looking at the change in search interest over a change in time, so we use slope.

Looking at historic search interest, allows us to account for typical seasonality. If Google Keyword Planner provided search volume data for more than a single year, we would use that data instead, but since they do not, we rely on Google Trends data.

For example, Keyword X has an average search interest of 49 this year, but the previous year it had an average search interest of 45.

The slope would be calculated as 49-45/2016-2015, which would equal 4, indicating small, but positive growth.

Google Trends Trend Lines Aren’t Straight!

If you’re savvy (or remember elementary school math), you know that in order to calculate slope from one year to another like we want, it would require use to have a straight line. Since our trend line will never be perfectly straight, we solve for this by averaging the search interest values from Google Trends for each year.

TL;DR

We can use the slope formula to determine whether a keyword has grown or not in search interest. In order to accomplish the slope calculation for a collection of keywords, you can use my nifty Python script.

Get the Search Wilderness Newsletter

View Regular Website Experience