Cleaning out UrlTracker

Tuesday, March 27, 2018

The Url Tracker is a great tool and is installed on most of the websites we manage. Its used to manage URLs within Umbraco.

It automatically tracks URL changes to ensure the old URL will redirect to the new location which is great for SEO and great for people visiting your website via this old URL. Although this is now an OOTB feature of Umbraco.

What this tool does include, which is missing from the umbraco Core is the ability to create you own redirects, which is handy when migrating existing indexed URLs to your new website. and its ability to record 404 errors and create more redirects from them.

The problem with recording every 404 error is that it can fill up the database table with millions or records. Occasionally this can actually slow the URL Tracker UI to the point where it becomes unusable. 

One client had nearly 50 million 404 records
One client had nearly 50 million 404 records

The following SQL script will purge all 404 records, but retain the existing redirects.

Delete from icUrlTracker 
Where RedirectNodeId is null
and RedirectUrl is null

Update 09 May 2018: I got bored waiting for the script to run on one of our sites so i started looking into ways to batch the clean up query. It also turns out that running a script like the one above can have an quite a large impact on the transaction logs and as a consequence a impact on the performance of the server 

I found this post which seems to work nicely. https://sqlperformance.com/2013/03/io-subsystem/chunk-deletes.

so, now for my re-written script:

SET NOCOUNT ON;
 
DECLARE @r INT;
 
SET @r = 1;
 
WHILE @r > 0
BEGIN
  BEGIN TRANSACTION;
 
  DELETE TOP (10000) -- this will change
    From dbo.icUrlTracker
    Where RedirectNodeId is null 
	and RedirectUrl is null
 
  SET @r = @@ROWCOUNT;
  print N'ROWCOUNT: '+ cast(@r as varchar(20))
 
  COMMIT TRANSACTION;
END

Although the scripts still takes and age to run, there is something reassuring when you can see the count go down.