Screaming Frog SEO Spider

Screaming Frog SEO Spider Update – Version 22.0

We’re delighted to announce Screaming Frog SEO Spider version 22.0, codenamed internally as ‘knee-deep’.

This release includes updates based upon user feedback, as well as exciting new features built upon the foundations introduced in our previous release.

So, let’s take a look at what’s new.


1) Semantic Similarity Analysis

You can now analyse the semantic similarity of pages in a crawl to help detect duplicate, similar and potentially off-topic, less relevant content on a site.

This goes beyond matching text on a page found in our duplicate content detection, by utilising LLM embeddings, which capture the semantic meaning and relationship of words.

This makes it possible to identify similar pages with different phrases but overlapping themes, covering the same subject multiple times, which can cause cannibalisation or inefficiencies in crawling and indexing.

If you’re not familiar with embeddings, then check out Mike King’s piece on Vector Embeddings is All You Need. Many SEOs have been inspired to experiment and build various tools with these concepts.

Using our existing AI provider integrations via ‘Config > API Access > AI’ (including OpenAI, Gemini & Ollama) you can capture vector embeddings of pages.

Dynamic SEO Pro Gemini Embeddings

You can now enable their use in the SEO Spider via ‘Config > Content > Embeddings’ for semantic content analysis, semantic search and visualisations.

Dynamic SEO Pro Embeddings Configuration

When the crawl has completed and crawl analysis has been performed, the ‘Semantically Similar’ and ‘Low Relevance Content’ filters will be populated in the Content tab.

Please refer to our user guide on configuring embeddings.

Semantically Similar Pages

The Content tab and ‘Semantically Similar‘ filter will show the closest semantically similar address for each URL, as well as a semantic similarity score and number of URLs that are semantically similar.

Dynamic SEO Pro Semantically Similar Pages

The lower ‘Duplicate Details’ tab and ‘Semantic Similarity’ filter will show all semantically similar URLs, as well as the content analysed.

Semantic similarity scores range from 0 – 1. The higher the score, the higher the similarity to the closest semantically similar address.

Pages scoring above 0.95 are considered semantically similar by default. The semantic similarity threshold can be adjusted via ‘Config > Content > Embeddings’ down to as low as 0.5.

Low Relevance Content

Vector embeddings can also be used to detect pages that are potentially off-topic compared to the overall content theme by averaging the embeddings of all crawled pages to identify the ‘centroid’.

Measuring the deviation of page embeddings from a site embedding is something that was hinted at within the Google leak, and SEOs have been playing with this concept to find outliers.

Outliers are those furthest from the average, and might indicate low relevance, ‘more off-topic’ content than is published elsewhere on the site.

Pages below the threshold can be seen under the ‘Content’ tab and ‘Low Relevance Content’ filter.

Dynamic SEO Pro Low Relevance Content filter

For our site this suggests blog content around the Olympic torch coming to Henley, a recent post on returning to work after maternity and our login page as outliers compared to the rest of the more technical SEO focused content on the site.

While we’re not going to remove these pages, it’s fair to suggest that the content of these pages deviates from the usual focus of the site.

Read our full tutorial on How to Identify Semantically Similar Pages & Outliers.

The semantic similarity analysis can be used for more than just detecting near duplicates and low relevance content as well, such as:

  • Improving Internal Linking – The lower ‘Duplicate Details’ tab, and ‘Semantic Similarity’ filter can be used to improve internal linking between semantically similar content.
  • URL Mapping for Redirects – Crawl old and new websites together and get a list of closest semantically similar URLs based upon the page text for redirects.
  • Semantic Similarity Analysis of any Element – Select ‘page titles’ instead of ‘page text’ for the embeddings, and run a semantically similar analysis to find near duplicate titles instead etc.

We’re excited to see the different use-cases and ways this new functionality is used, which will in-turn inspire the evolution within the tool.


2) Semantic Content Cluster Visualisation

The Content Cluster Diagram is available via ‘Visualisations > Content Cluster Diagram’. It’s a two-dimensional visualisation of URLs from your crawl, plotted and clustered from embeddings data.

It can be used to identify patterns and relationships in your website’s content, where semantically similar content are clustered together.

The example diagram above highlights the semantic relationship of an animal website. It’s fascinating to see how semantics mimic animal taxonomy –

Tiger populations tightly grouped together, with the nearest neighbour the Liger hybrid inbetween the Tiger and the Lion, and then other big cats such as Leopards, Jaguars, Cheetahs as the next neighbours and so on.

The diagrams can be useful to visualise the scale of clusters of content across a site or identify potential topical clusters that are semantically related yet might be distantly integrated for the user.

Dynamic SEO Pro BBC Content Cluster Diagram

In the diagram above, you can easily see the scale of different sections, such as recipes on the BBC.

You can also spot outliers that are isolated from other nodes on the edges of the diagram, such as those mentioned on our site earlier.

Dynamic SEO Pro Content Cluster Diagram Outliers

The cog allows you to adjust the sampling, dimension reduction, clustering and colour schemes used. The content cluster diagram also works alongside segments, so you can visualise content in one specific area or section of a site.

We have plans to compliment these diagrams with crawl data for more insights.


3) Semantic Search

There’s a new right-hand ‘Semantic Search’ tab, which allows you to enter a search query and see the most relevant pages in a crawl.

This functionality vectorises the search query and calculates the cosine similarity between the query and pages in a crawl using vector embeddings rather than keywords.

It can help quantify the relevance of content to a query for all pages in a crawl, and is more akin to how modern search engines and LLMs return content today, rather than more simplistic keyword presence and matching within text.

Dynamic SEO Pro Semantic Search tab

This functionality can be used to find relevant pages for keyword mapping, related pages for internal linking, or competitor analysis against keywords as examples.

The ‘Embedding Display’ filter can be adjusted to ‘Centroid’, to see more details about outliers found on the website and the ‘most representative page’, that is closest to the average embedding across the whole site.

Dynamic SEO Pro Semantic Search centroid

If you’ve pulled embeddings from a variety of LLMs you can adjust the filter at the top to view the different results.

Similar to the other features launched, it’s obvious how this feature could be extended in the tool in future updates.


4) AI Integration Improvements

We’ve introduced a variety of improvements for our AI integration to make it even more advanced, flexible and to help reduce waste of credits and queries. This includes:

Multiple Prompt Targets

You can now click the cog against a prompt and write a more advanced prompt, including multiple prompt target elements.

Dynamic SEO Pro Multiple Prompt Targets

Run Prompts For Specific Segments & Issues

You’re able to choose to run AI prompts against URLs that match a specific segment. This means you can set up segments for different scenarios you wish AI prompts to be run against, and not waste credits.

In the advanced prompt, you can choose to ‘Match on Segment’.

Dynamic SEO Pro Run prompts against issues and segments

Alongside this, you’re now able to segment based upon ‘Issues’.

For example, this means you can create image alt text only for image URLs in the segment with the issue ‘Missing Alt Text’, rather than every image.

Reference URL Details

URL Details data can now be selected to be used in AI prompts for further flexibility.

Dynamic SEO Pro URL Details in AI prompts

Custom Endpoint

You can now customise the OpenAI endpoint, which allows users to enable private LLM APIs and other AI providers that use the same structure.

For example, you can use DeepSeek, Microsoft Copilot, or Grok by customising the endpoint and using the relevant API key.

Dynamic SEO Pro Custom Endpoint

You can also customise the model parameters, headers, and limit page content length to reduce token exceeded errors on long content pages.

Anthropic Integration

Similar to the integrations of OpenAI, Gemini and Ollama, you can now integrate with Anthropic (aka ‘Claude’) via ‘Config > API Access’ to run AI prompts while crawling.

Dynamic SEO Pro Anthropic API integration

Generate Images & Text Speech

We had some fun and integrated image and text speech generation for OpenAI and Gemini. As an example, this can be used to crawl blog posts, and create a hero image for each of them.

Dynamic SEO Pro Image Generation using AI

The SEO Spider will show an image or sound preview in the UI, which you can expand, or listen to.

Read our full tutorial on How To Crawl With AI Prompts.


5) Advanced Column Configurator

In the same way you can customise tabs, you can now configure columns with an advanced configurator that allows them to be selected, hidden and adjusted in order in bulk.

Dynamic SEO Pro Advanced Column Configurator

This should make customising columns less painful.


6) Custom Multi-Export

There’s a new ‘Multi Export’ option under the ‘Bulk Export’ menu. This allows you to select any tab, bulk export or report to export in a single click.

Dynamic SEO Pro Multi Export

If there’s a common set of reports you use for crawls, or have specific exports for some websites, then you can save them as presets and use them when needed both manually in the UI, or in scheduling and the CLI.

This new functionality also enables you to run the Export for Looker Studio from a manual crawl, rather than only from within scheduling.


7) Export to Multiple Tabs In Single Sheet/Workbook

When you bulk export multiple exports manually or from within scheduling, you can now select to ‘consolidate spreadsheets’.

Rather than export each tab, bulk export or report as a separate file, it will export everything to multiple individual tabs within the same single Google Sheet or Workbook.

Dynamic SEO Pro Export multiple tabs in a single sheet or workbook

This is available for both Google Sheets and Excel.


8) Download Multiple XML Sitemaps

In list mode you can now upload multiple XML Sitemaps, instead of relying on a Sitemap Index file.

Dynamic SEO Pro Download Multiple XML Sitemaps


9) Download from Google Sheets

In list mode, you can select the source as a Google Sheet address. Any URLs within the Google Sheet will be uploaded and crawled.

Dynamic SEO Pro Download from Google Sheets URL

You’re able to input your Google Drive details so the SEO Spider can access private Google Sheets.

This feature has exciting automation potential, as you can dictate the URLs to be crawled using Google Sheets (and associated add-ons and app scripts).

This is also available in scheduling and the CLI.


10) Fetch API Data Without Crawling or Re-Crawling

There’s a new ‘APIs’ mode (‘Mode > APIs’), which allows you to upload URLs and pull data from any APIs – without any crawling involved for speed.

Dynamic SEO Pro APIs mode

Additionally, there’s been more API improvements:

  • The ‘Request API Data’ button in the right-hand APIs tab is now enabled anytime you pause a crawl with a connected API, not just at the end of a completed crawl. Pressing it will resume the API requests (but not the crawl) effectively allowing you to sync all the API data for the URLs you have crawled so far.
  • If you modify GA4/GSC config, a dialog will appear before the config window closes asking if you want to remove all existing data and request and apply the new data. Previously if you connected to GA4/GSC, you couldn’t remove the data or re-fetch it. Now you can.
  • You can now right click any URL and request data for any of the connected APIs (apart from GA4/GSC). If the crawl already has existing data, this data will be replaced by the new request. These requests will take priority over any other requests in the queue which means they should show up in the table straight away for the user to see. This works for when you are either paused or crawling.

Other Updates

Version 22.0 also includes a number of smaller updates and bug fixes.

  • There’s a new ‘Save’ icon next to AI prompts and custom JavaScript snippets which allow you to quickly save them to the library.
  • All visualisations now have the option to open in an external browser, which can improve performance at scale.
  • Holding ‘control + shift + C’ together will now bring up a configuration diff window to quickly spot any differences between the current config and the default.
  • The Moz API has now been updated to v.3. Metrics such as link propensity, spam score and brand authority are now available alongside DA, PA and link numbers.
  • You can now select to pull Trust Flow Topics via the Majestic API integration.

That’s everything for version 22.0! After writing this post, we quickly realised there was enough features for two new releases. So if you stayed until the end, thank you!

Thanks to everyone for their continued support, feature requests and feedback. Please let us know if you experience any issues with this latest update via our support.


Small Update – Version 22.1 Released 18th June 2025

We have just released a small update to version 22.1 of the SEO Spider. This release is mainly bug fixes and small improvements –

  • Added custom endpoint functionality to the Ollama integration (to match functionality in OpenAI and Gemini).
  • Improved error icons to show tooltip when clicked on.
  • Fixed issue exporting to Google Sheets in non-English languages.
  • Added Dimension Reduction Presets for Embeddings.
  • Added response codes to ‘Missing Confirmation Links’ Report.
  • Fixed issue with stall on start up for some Windows users.
  • Fixed issue with missing columns in the new advanced column chooser.
  • Fixed various unique crashes.

Small Update – Version 22.2 Released 2nd July 2025

We have just released a small update to version 22.2 of the SEO Spider. This release is mainly bug fixes and small improvements –

  • Added a configurable minimum score threshold for Semantic Search results.
  • Added ‘Bulk Export’ for ‘URL Inspection’ options in APIs Mode.
  • Added ‘Command+F’ shortcut to focus the search box in the crawls dialog.
  • ‘Export > Output Settings > Output Mode’ setting now remembers state, which was confusing users.
  • Fixed ‘Show Issue in Rendered HTML’ which wasn’t working reliably.
  • Fixed issue with ‘Overwrite Files’ in output settings causing tabs to be deleted in Google Sheets.
  • Fixed issue with ‘Bulk Export > JavaScript > Contains JavaScript Links’ returning no URLs.
  • Fixed issue with audio being played after using Visual Custom Extraction for some sites.
  • Added Passage Embeddings Snippet to Custom JS System Library. Thanks to Noah Learner!
  • Added Indexing Insight Custom JS Snippet To Custom JS ‘System’ Library. Thanks to Adam Gent!
  • Fixed various unique crashes.

The post Screaming Frog SEO Spider Update – Version 22.0 appeared first on Screaming Frog.

Related Topics
Screaming Frog