Generate Simple Anagrams

This program shows different methods for generating anagrams which should be (somewhat) easy to solve. The methods are:

n-step method: on each turn, transpose two letters that are at most n positions apart, where n is the maximum step. Small values such as 1 or 2 give anagrams that are easy to solve.
generate anagrams of a word in some way (e.g. with the n-step method, or by simply shuffling the letters of the word), and then sort the words by n-gram distance, best first.
generate the anagrams in some way and sort them by edit distance, see below.
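The n-step method can be sketched as follows. This is only an illustration: the function name n_step_anagrams and its parameters are made up here, and how the program actually picks the two positions is not stated on this page. The sketch assumes a random position, a random partner between 1 and max_step positions away, and clamping at the ends of the word (so a turn at the edge may swap a letter with itself).

```python
import random

def n_step_anagrams(word, max_step=1, turns=20, rng=None):
    """Return the words produced by `turns` successive transpositions.

    Assumption (not taken from the original program): each turn picks a
    random position i and a partner j that lies 1..max_step positions to
    the left or right, clamped to the bounds of the word.
    """
    rng = rng or random.Random()
    letters = list(word)
    last = len(letters) - 1
    result = []
    for _ in range(turns):
        i = rng.randrange(len(letters))
        step = rng.randint(1, max_step) * rng.choice((-1, 1))
        j = min(max(i + step, 0), last)  # clamp to the word
        letters[i], letters[j] = letters[j], letters[i]
        result.append("".join(letters))
    return result

# Each word builds on the previous one, so later words drift further
# away from the original.
for w in n_step_anagrams("establishment", max_step=1, turns=5):
    print(w)
```

Because every turn starts from the previous result, the early words in the list are the easiest to solve.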

n-gram distance
The n-gram distance compares the n-grams of the original word with the n-grams of the anagrammed (shuffled) word. The more n-grams they have in common, the better. For both words it uses all possible n-grams, from 2-grams up to (word length - 1)-grams. E.g. for the word anagram the n-grams used are "an", "na", "ag", "gr", "ra", "am", "ana", "nag", ..., "ram", "anag", ..., "gram", "anagr", ..., "nagram". The distance is defined as the number of n-grams that are the same as in the original word, divided by the maximal number of n-grams for the word. The value therefore ranges from 0 (no likeness between the words at all) to 1 (same word). Despite the name, a higher value means the words are more alike.
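A minimal sketch of this measure, written from the definition above. The names ngrams and ngram_distance are illustrative, and treating "the same" as "occurs anywhere in the original word" (rather than at the same position) is an assumption. With that reading, the sketch gives 0.71 for "setablishment" against "establishment", which agrees with the results listed further down.

```python
from collections import Counter

def ngrams(word, n):
    """All substrings of length n, in order of appearance."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]

def ngram_distance(original, candidate):
    """Shared n-grams divided by the number of n-grams of the original.

    Uses every n from 2 up to len(original) - 1. Returns a value from
    0.0 (nothing in common) to 1.0 (same word).
    """
    shared = 0
    total = 0
    for n in range(2, len(original)):
        orig_counts = Counter(ngrams(original, n))
        cand_counts = Counter(ngrams(candidate, n))
        # Multiset intersection: a repeated n-gram is only credited as
        # often as it occurs in both words.
        shared += sum((orig_counts & cand_counts).values())
        total += sum(orig_counts.values())
    return shared / total if total else 0.0

print(round(ngram_distance("establishment", "setablishment"), 2))  # 0.71
```

For a 13-letter word there are 12 + 11 + ... + 2 = 77 n-grams in total, so one shared n-gram is worth about 0.013.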

Edit distance
The (Levenshtein) edit distance is a common metric for comparing two words, i.e. how similar they are. It does not seem to be a very good measure of how easily we recognize the word (i.e. solve the anagram), but this is only a subjective impression with no scientific backing.
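For reference, a textbook Levenshtein distance can be sketched as below. The function name levenshtein is illustrative and this is not the program's own code. Note that this textbook version counts a swap of two neighbouring letters as two edits, so it need not reproduce the figures listed in the results; the program may use a different variant.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) ca
            ))
        previous = current
    return previous[-1]

print(levenshtein("kitten", "sitting"))  # 3
```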

Shuffling
Please note that the (random) shuffling part may take some time. The number of shuffled words is 500.

See also: A program which uses a different approach to approximately the same problem is Reading scrambled words; and possibly also Generate spelling errors.

Note: This program was announced on my (Swedish) blog: Skapa enkla anagram ("Create simple anagrams"), where there may be some more info.

Result

Word: establishment
Max steps: 1
Random word: no

n-step Method

Result by n-step method. This is for max step = 1. Showing the first 20 words.
The numbers and characters in parentheses are the specific steps. For example,
(5 <-> 6 : f <-> l edit dist: 2 ngram dist: 0.39 pct same positions: 0.79)
means that the characters at positions 5 and 6 (counting from 0) were transposed, which were the letters "f" and "l". The resulting word has edit distance 2 and n-gram distance 0.39 to the source word, and 79% of its characters are in the same positions as in the source word.
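The "pct same positions" figure can be computed as in this small sketch. The function name same_position_fraction is illustrative, not taken from the program. For "setablishment", 11 of the 13 letters are still in place, which gives 0.85.

```python
def same_position_fraction(original, candidate):
    """Fraction of positions where both words have the same letter."""
    if not original:
        return 0.0
    matches = sum(1 for a, b in zip(original, candidate) if a == b)
    return matches / len(original)

print(round(same_position_fraction("establishment", "setablishment"), 2))  # 0.85
```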

setablishment (0 <-> 1 : s <-> e edit dist: 1 ngram dist: 0.71 pct same positions: 0.85)
setablsihment (7 <-> 6 : i <-> s edit dist: 3 ngram dist: 0.21 pct same positions: 0.69)
setablsihment (12 <-> 12 : t <-> t edit dist: 3 ngram dist: 0.21 pct same positions: 0.69)
setbalsihment (4 <-> 3 : a <-> b edit dist: 5 ngram dist: 0.13 pct same positions: 0.54)
setbalsihment (0 <-> 0 : s <-> s edit dist: 5 ngram dist: 0.13 pct same positions: 0.54)
setbalsimhent (8 <-> 9 : m <-> h edit dist: 6 ngram dist: 0.04 pct same positions: 0.38)
setbalsimehnt (9 <-> 10 : e <-> h edit dist: 6 ngram dist: 0.03 pct same positions: 0.31)
sebtalsimehnt (2 <-> 3 : b <-> t edit dist: 5 ngram dist: 0.04 pct same positions: 0.23)
sbetalsimehnt (1 <-> 2 : b <-> e edit dist: 5 ngram dist: 0.04 pct same positions: 0.23)
sbetalsiemhnt (9 <-> 8 : m <-> e edit dist: 6 ngram dist: 0.03 pct same positions: 0.31)
sbetalsimehnt (8 <-> 9 : m <-> e edit dist: 5 ngram dist: 0.04 pct same positions: 0.23)
sbetalsimehtn (12 <-> 11 : n <-> t edit dist: 5 ngram dist: 0.03 pct same positions: 0.08)
sbetalsimehtn (12 <-> 12 : n <-> n edit dist: 5 ngram dist: 0.03 pct same positions: 0.08)
sbetalsimhetn (9 <-> 10 : h <-> e edit dist: 6 ngram dist: 0.01 pct same positions: 0.15)
sbetalsimhetn (0 <-> 0 : s <-> s edit dist: 6 ngram dist: 0.01 pct same positions: 0.15)
bsetalsimhetn (1 <-> 0 : s <-> b edit dist: 6 ngram dist: 0.01 pct same positions: 0.23)
bsetalsimhetn (12 <-> 12 : n <-> n edit dist: 6 ngram dist: 0.01 pct same positions: 0.23)
bsetalsmihetn (7 <-> 8 : m <-> i edit dist: 7 ngram dist: 0.01 pct same positions: 0.23)
bsetalsmihent (12 <-> 11 : t <-> n edit dist: 6 ngram dist: 0.05 pct same positions: 0.38)

Edit distance sort

The words generated by the n-step method, sorted by (Levenshtein) edit distance. Showing the first (best) 20 words, each followed by its edit distance. Also: "step pos" is the position of the word in the n-step sequence, i.e. in which turn that anagram was generated. "ngram dist" is the n-gram distance (see above).

setablishment: 1 (step pos: 1 ngram dist: 0.71 pct same positions: 0.85)
setablsihment: 3 (step pos: 3 ngram dist: 0.21 pct same positions: 0.69)
esnabtlsheitm: 5 (step pos: 274 ngram dist: 0.04 pct same positions: 0.46)
ensabltshetmi: 5 (step pos: 282 ngram dist: 0.06 pct same positions: 0.46)
bstaelshmniet: 5 (step pos: 42 ngram dist: 0.08 pct same positions: 0.38)
sbetalsimehnt: 5 (step pos: 11 ngram dist: 0.04 pct same positions: 0.23)
sebtalsimehnt: 5 (step pos: 8 ngram dist: 0.04 pct same positions: 0.23)
sbetalsimehtn: 5 (step pos: 13 ngram dist: 0.03 pct same positions: 0.08)
batnsestlihme: 5 (step pos: 164 ngram dist: 0.09 pct same positions: 0.08)
setbalsihment: 5 (step pos: 5 ngram dist: 0.13 pct same positions: 0.54)
banstlsmethei: 6 (step pos: 187 ngram dist: 0.03 pct same positions: 0.08)
banstlsmetehi: 6 (step pos: 188 ngram dist: 0.03 pct same positions: 0.15)
ensbatlshetmi: 6 (step pos: 280 ngram dist: 0.03 pct same positions: 0.23)
setbalsimehnt: 6 (step pos: 7 ngram dist: 0.03 pct same positions: 0.31)
bstaelsmihnte: 6 (step pos: 29 ngram dist: 0.05 pct same positions: 0.31)
bastnhestlime: 6 (step pos: 94 ngram dist: 0.06 pct same positions: 0.08)
aetehlisntbsm: 6 (step pos: 444 ngram dist: 0.05 pct same positions: 0.31)
asteblnsetmih: 6 (step pos: 207 ngram dist: 0.03 pct same positions: 0.38)
sbetalsimhetn: 6 (step pos: 15 ngram dist: 0.01 pct same positions: 0.15)
aethelisntbms: 6 (step pos: 414 ngram dist: 0.05 pct same positions: 0.31)
setbalsimhent: 6 (step pos: 6 ngram dist: 0.04 pct same positions: 0.38)
esnabtslheitm: 6 (step pos: 273 ngram dist: 0.03 pct same positions: 0.38)

Ngram distance sort

The words generated by the n-step method, sorted by n-gram distance (see above), best first. Showing the best 20 words, each followed by its n-gram distance.
The edit distance and the position of the word in the n-step sequence are also shown.

setablishment: 0.71 (edit dist: 1 step pos: 1)
setablsihment: 0.21 (edit dist: 3 step pos: 3)
setbalsihment: 0.13 (edit dist: 5 step pos: 5)
batnsestlihme: 0.09 (edit dist: 5 step pos: 164)
bstaelshmniet: 0.08 (edit dist: 5 step pos: 42)
estasbnmltehi: 0.08 (edit dist: 7 step pos: 236)
bastnhestlime: 0.06 (edit dist: 6 step pos: 94)
aentlshebmsti: 0.06 (edit dist: 7 step pos: 317)
bstaeshlmneit: 0.06 (edit dist: 7 step pos: 45)
abshtenstlmei: 0.06 (edit dist: 7 step pos: 84)
atenheslismbt: 0.06 (edit dist: 6 step pos: 359)
seabsntlhmeti: 0.06 (edit dist: 7 step pos: 258)
aentlshesbmti: 0.06 (edit dist: 7 step pos: 320)
ensabltshetmi: 0.06 (edit dist: 5 step pos: 282)
bstaeslhmnite: 0.06 (edit dist: 7 step pos: 40)
bstaeshlmniet: 0.06 (edit dist: 7 step pos: 44)
bstaeslhmniet: 0.06 (edit dist: 7 step pos: 43)
aetehlisntbsm: 0.05 (edit dist: 6 step pos: 444)
aethelisntbms: 0.05 (edit dist: 6 step pos: 414)
abshtenlstemi: 0.05 (edit dist: 7 step pos: 80)
aetehlisntbms: 0.05 (edit dist: 6 step pos: 445)
ensablthsetmi: 0.05 (edit dist: 6 step pos: 283)

Shuffle method

The shuffle method just creates 500 random anagrams (permutations) based on the word. Here they are sorted by edit distance (see above), showing the first (best) 20 words.
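The shuffle-and-sort idea can be sketched like this. The names shuffled_anagrams and best are illustrative. To keep the example short, the score used in the demo is simply the number of letters that are out of place, a stand-in for the edit distance and not the measure the program uses.

```python
import random

def shuffled_anagrams(word, count=500, rng=None):
    """Return `count` random permutations of the letters of `word`."""
    rng = rng or random.Random()
    letters = list(word)
    result = []
    for _ in range(count):
        rng.shuffle(letters)  # in-place shuffle of the letter list
        result.append("".join(letters))
    return result

def best(words, score, limit=20, reverse=False):
    """Drop duplicates, sort by `score`, and keep the first `limit` words.

    Use reverse=True for measures where a higher score is better.
    """
    unique = list(dict.fromkeys(words))  # keeps first-seen order
    return sorted(unique, key=score, reverse=reverse)[:limit]

word = "establishment"
candidates = shuffled_anagrams(word, count=500, rng=random.Random(42))

def misplaced(w):
    # Stand-in score: number of positions that differ from the original.
    return sum(a != b for a, b in zip(word, w))

for w in best(candidates, misplaced, limit=5):
    print(w, misplaced(w))
```

Any other measure can be passed in as the score function in the same way.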

bltessaiemhnt: 6 (ngram distance: 0.04)
bestimlhsante: 6 (ngram distance: 0.05)
eamstbisthlen: 6 (ngram distance: 0.04)
saeteblshnitm: 6 (ngram distance: 0.03)
sblinshtmeate: 6 (ngram distance: 0.06)
tesmehablistn: 6 (ngram distance: 0.17)
amhsestnlietb: 6 (ngram distance: 0.05)
nbtehisalsmet: 6 (ngram distance: 0.03)
atestslmenihb: 6 (ngram distance: 0.08)
hesatisltmenb: 6 (ngram distance: 0.06)
aestlhebtismn: 6 (ngram distance: 0.05)
esiabltemnhts: 6 (ngram distance: 0.05)
estbsetinalmh: 6 (ngram distance: 0.04)
theablsetimns: 6 (ngram distance: 0.04)
abentehlissmt: 6 (ngram distance: 0.09)
hibestselntma: 7 (ngram distance: 0.05)
esmlbhnteaits: 7 (ngram distance: 0.03)
nbsestliathem: 7 (ngram distance: 0.05)
eessblntmhtia: 7 (ngram distance: 0.04)
lbtsenmeaisht: 7 (ngram distance: 0.06)
eestibnamstlh: 7 (ngram distance: 0.04)
ehantislmbets: 7 (ngram distance: 0.03)

The same shuffled words as above, but now sorted by n-gram distance. Showing the first (best) 20 words, each followed by its n-gram distance; the edit distance is also shown.

tesmehablistn: 0.17 (edit distance: 6)
blasetthmenis: 0.10 (edit distance: 7)
blisenemastht: 0.10 (edit distance: 7)
abentehlissmt: 0.09 (edit distance: 6)
esttmensilabh: 0.09 (edit distance: 7)
ltaeenmishstb: 0.08 (edit distance: 8)
hlissemtabnet: 0.08 (edit distance: 7)
athmeiestlsbn: 0.08 (edit distance: 7)
tsemsinhtable: 0.08 (edit distance: 8)
atestslmenihb: 0.08 (edit distance: 6)
smebnhlisttae: 0.08 (edit distance: 8)
sttaehmnelisb: 0.08 (edit distance: 8)
silthesmenbta: 0.06 (edit distance: 8)
lbtsenmeaisht: 0.06 (edit distance: 7)
ibstathmneesl: 0.06 (edit distance: 7)
ibetahmlnests: 0.06 (edit distance: 7)
btsishnmetael: 0.06 (edit distance: 7)
liseatebsnthm: 0.06 (edit distance: 7)
hainteestblms: 0.06 (edit distance: 7)
sblinshtmeate: 0.06 (edit distance: 6)
hmbasentilste: 0.06 (edit distance: 8)
testmeahbsnli: 0.06 (edit distance: 8)

Created by Hakan Kjellerstrand hakank@bonetmail.com
