
Using diff to Diagnose SEO Issues: Leveraging Wayback Machine & The OpenAI API [Free Tool]

When figuring out if ranking drops result from a change on Google’s side or your own, it’s helpful to know if anything changed on the page that lost rankings.

Sometimes, these changes aren’t well documented, especially if multiple stakeholders can edit pages. Or there may not be a changelog system that enables you to find what was changed. It may even be that the changelog system doesn’t show the complete picture and doesn’t include broader changes that impact internal linking or alterations within the HTML head or anything outside the main content area. Although I think it’s best to work towards creating an internal system that accounts for all of this, it’s not feasible for every organization or SEO professional. What do you do?

Let me introduce you to SEODiff, a Streamlit-based web app to help solve the aforementioned problem.

👉 Check out SEODiff 👈

Wayback Machine

Wayback Machine has long been important for preserving the historic web, for maintaining transparency, and, of course, as an SEO tool.

I highly encourage everyone to donate to them, because the web is made a much better place by the project.

You can leverage Wayback Machine manually. This involves inputting a URL, choosing an archive from an available date, viewing its HTML, doing the same for the current version of the page, and then copying and pasting the HTML of both pages into a diff checker to compare them. What a pain!

Wayback Machine does offer a “Changes” feature to compare two versions, but it only provides a visual comparison, not a comparison of the HTML.

Wayback Machine “Changes” Feature

How to Use SEODiff

Instead of doing these manual comparisons with Wayback Machine, head over to SEODiff.

Input a URL into the field on the left-hand sidebar and hit Enter.

Then choose where the first HTML source comes from. Select the “Archived” radio button and pick the date of the Wayback Machine archive you’d like to use, or select the “Current” radio button to have the tool perform a GET request on the URL (the latter probably doesn’t make much sense for the first source).

Do the same for HTML Source 2. Choosing the “Current” radio button makes a lot more sense for this one.
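For readers who want to script the two source options themselves, here is a minimal Python sketch. It is not SEODiff's actual code: the function names are hypothetical, and it assumes the Internet Archive's public Availability endpoint and the `id_` snapshot modifier, which asks Wayback Machine for the archived HTML without its injected toolbar.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Internet Archive's public Availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url: str, timestamp: str) -> str:
    """Build a query for the snapshot closest to `timestamp` (e.g. YYYYMMDD)."""
    params = urlencode({"url": url, "timestamp": timestamp})
    return f"{AVAILABILITY_ENDPOINT}?{params}"


def raw_snapshot_url(url: str, timestamp: str) -> str:
    """Build an "Archived" source URL; `id_` requests the unmodified HTML."""
    return f"https://web.archive.org/web/{timestamp}id_/{url}"


def fetch_html(url: str, timeout: float = 30.0) -> str:
    """Perform a plain GET request, as the "Current" option does."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Under these assumptions, `fetch_html(raw_snapshot_url(page, "20230101"))` would stand in for the “Archived” source and `fetch_html(page)` for the “Current” one.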

Now choose whether to show only what changed in the HTML by checking the “Show Only Changes” checkbox, or to include the full source, even where it didn’t change (you’ll still be able to tell what changed with this option).

You may also choose whether to perform the diff on the Full HTML, just the HEAD, or just the BODY.
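Limiting the comparison to one part of the document can be approximated with a regular expression. The sketch below is an assumption about how such a filter might work, not the tool's implementation; `extract_section` is a hypothetical helper, and a real HTML parser would be more robust on malformed pages.

```python
import re


def extract_section(html: str, section: str = "full") -> str:
    """Return the full HTML, or only its <head> or <body> element."""
    if section == "full":
        return html
    if section not in ("head", "body"):
        raise ValueError("section must be 'full', 'head', or 'body'")
    # \b keeps <head ...> from also matching <header ...>.
    pattern = rf"<{section}\b[^>]*>.*?</{section}\s*>"
    match = re.search(pattern, html, flags=re.IGNORECASE | re.DOTALL)
    # An empty string signals that the element was not found.
    return match.group(0) if match else ""
```

Running both sources through the same filter before diffing keeps changes in the other part of the page out of the result.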

Lastly, click the “Fetch HTML for Comparison” button to complete the diff.

Depending on which options you chose, the scrollable box will show which lines were removed (in red), which lines were added (in green), and which lines remain unchanged (shown with the same white background as the page).
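The removed, added, and unchanged labels map naturally onto a line-based diff. Here is a rough sketch using Python's standard `difflib`; whether SEODiff uses this module is an assumption, and `diff_html` and its `only_changes` flag are hypothetical names that mirror the “Show Only Changes” checkbox.

```python
import difflib


def diff_html(old_html: str, new_html: str, only_changes: bool = True):
    """Return (label, line) pairs describing a line-by-line diff."""
    rows = []
    for entry in difflib.ndiff(old_html.splitlines(), new_html.splitlines()):
        tag, text = entry[:2], entry[2:]
        if tag == "? ":
            continue  # intraline hint rows from ndiff, not real lines
        if tag == "- ":
            rows.append(("removed", text))  # would be shown in red
        elif tag == "+ ":
            rows.append(("added", text))  # would be shown in green
        elif not only_changes:
            rows.append(("unchanged", text))  # plain background
    return rows
```

With `only_changes=False` the unchanged lines are kept, so the output reads like the full source with the changes marked.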

Expanding Its Analysis with LLMs

Want to take things a step further with some AI? SEODiff has you covered.

Instead of scrolling through the HTML to see what was added or removed, grab your OpenAI API key, add it to the AI Analysis section of SEODiff, and hit Enter.

Choose a model. Most of the pages I tested produced a lot of text to analyze and required a model with a large context window. At the time of this post, gpt-4-1106-preview (preview version of gpt-4-turbo) is the model with the largest context window (128k tokens).
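Before sending a diff, it can help to sanity-check whether it is likely to fit in the context window. The sketch below uses the common rule of thumb of roughly four characters per token, which is only an approximation (HTML may tokenize quite differently); the constants and function names are assumptions for illustration, and a real tokenizer such as tiktoken would give accurate counts.

```python
CONTEXT_WINDOW_TOKENS = 128_000  # context window cited in the post
CHARS_PER_TOKEN = 4  # assumption: rough rule of thumb, not exact


def estimate_tokens(text: str) -> int:
    """Estimate the token count of `text`, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(prompt: str, diff_text: str,
                    reserved_for_reply: int = 4_000) -> bool:
    """Check whether prompt plus diff leave room for the model's reply."""
    needed = estimate_tokens(prompt) + estimate_tokens(diff_text)
    return needed + reserved_for_reply <= CONTEXT_WINDOW_TOKENS
```

If the estimate comes out too high, diffing only the HEAD or only the changed lines is one way to shrink the input.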

Important Note: This will incur a fee. I am not responsible for any fees you incur when using the tool. See OpenAI’s pricing page to understand their pricing structures for their various models. This will consume A LOT of tokens.

You may customize the prompt if you wish. If you’re satisfied with the prompt and are okay with the fee you’re about to incur, go ahead and click “Analyze Diff”.

This may take a bit, so be patient. The end result is a summary analyzing the most important changes according to the model you chose.

Example analysis of the diff, completed by GPT-4

Happy comparing!
