Levenshtein Distance in SQL Server

11 January 2005

Merry Christmas, Happy New Year, and all that. This is a kind of dull post but I know some people who come here will find it useful:

I have found an implementation of the Levenshtein distance function in TSQL. This allows you to calculate the similarity between two strings, which is particularly useful for searching features. You will also need a simple user-defined function called MIN3 that is not listed at the above link, here’s the super quick one I wrote:

1   CREATE function Min3( @a int, @b int, @c int ) returns int AS
2   BEGIN
3   	DECLARE @Retval int
4   	if @a <= @b AND @a <= @c
5   		BEGIN
6   		SET @Retval = @a
7   		END
8   	if @b < @a AND @b <= @c
9   		BEGIN
10  		SET @Retval = @b
11  		END
12  	if @c < @a AND @c < @b
13  		BEGIN
14  		SET @Retval = @c
15  		END
16  RETURN @Retval
17  END

One Response

  1. Joe Celko says:

    Learn how to write this CASE WHEN in a single CASE expression; it will port and run MUCH faster.

    CASE WHEN @b BETWEEN @a AND @c THEN @a
    WHEN @c BETWEEN @a AND @b THEN @a
    WHEN @a BETWEEN @b AND @c THEN @b
    WHEN @c BETWEEN @b AND @a THEN @b
    WHEN @a BETWEEN @c AND @b THEN @c
    WHEN @b BETWEEN @c AND @a THEN @c
    ELSE NULL END

    This also handles NULLs correctly.