Nearest words

Edit distance is a way of measuring the distance ("nearness") between two words. This kind of nearness search is often called "approximate search". One of the most common methods is the Levenshtein distance; see for example http://www.merriampark.com/ld.htm or, more generally, search Google.

The edit distance counts how many operations (deletion, insertion, substitution) are needed to transform one word into another. E.g. the edit distance between the word hakan and the word håkan is 1, since we need one substitution ("å" is substituted for "a"). The distance between hakan and kalle is 4: only the second letter ("a") matches, so the other four letters are substituted.
This program shows the words nearest to the searched word, using the Levenshtein distance. It first shows the words at the smallest distance, then the words at the next distance (i.e. distance + 1). (If you don't type in a word, the program will use kjellerstrand, my last name.)
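As an illustration of the idea, here is a minimal sketch in Python. It is not the code behind this page: the function names levenshtein and nearest_words and the small word list WORDS are made up for the example, and every operation is assumed to cost 1. The first function computes the distance with the usual dynamic-programming table (keeping only one row at a time), and the second groups a word list by distance and returns the nearest group plus the group at distance + 1.

```python
from collections import defaultdict


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insert, delete and substitute."""
    # prev[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def nearest_words(query, words):
    """Return [(d, words at d), (d + 1, words at d + 1)], d = smallest distance."""
    by_distance = defaultdict(list)
    for w in words:
        by_distance[levenshtein(query, w)].append(w)
    if not by_distance:
        return []
    best = min(by_distance)
    return [(d, sorted(by_distance.get(d, []))) for d in (best, best + 1)]


# Hypothetical word list, for illustration only.
WORDS = ["håkan", "haken", "kalle", "banan"]

for distance, group in nearest_words("hakan", WORDS):
    print(distance, group)
```

With this toy list the search for hakan prints haken and håkan at distance 1, then banan at distance 2. A real word list is much larger, and scanning every word like this is the simplest approach rather than the fastest.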

Word:
Language: Swedish English




Some statistics of edit distances for Swedish and English

The longest distance I found in Swedish is 27 with the following words:
For English the longest distance is 24:
The next distance is 20 for the following word pairs:
A very nice approximate search program is agrep ("approximate grep").
Also see Generate spelling errors, which is, in a way, an "inverted edit distance".
Created by Hakan Kjellerstrand hakank@bonetmail.com