Friday, October 14, 2011

Amazing Nerdiness

I was listening to Tom Lehrer's "Silent E" (one of my favorites; I'm always happy when that pops up on my iPod), and decided to figure out how many pairs of words differ only by a trailing 'e'.

So I whipped up this nerdstrosity. I felt I had to share:

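perl -e 'my %words = (); my @ewords = ();
# First pass: record every word, and remember the ones ending in "e".
while (<>) {
    chomp();
    $words{$_} = 1;
    push(@ewords, $_) if (/e$/);
}
# Second pass: report each "e" word whose e-less form is also in the list.
foreach my $eword ( @ewords ) {
    (my $noe = $eword) =~ s/e$//;
    if ( defined($words{$noe}) ) {
        print "$noe -> $eword\n";
    }
}' /usr/share/dict/words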

4 comments:

Anonymous said...

Well, how many?

Kevin said...

Too many loops:

my %words = ();
while (<>) {
    chomp();
    if (/e$/) {
        my $eword = $_;
        (my $noe = $_) =~ s/e$//;
        if (defined($words{$noe})) {
            print "$noe -> $_\n";
        }
    }
    $words{$_} = 1;
}

rantingnerd said...

cnoocy: 461 from my Ubuntu dictionary, 348 of them non-capitalized. But not all of them are from silent 'e's.

A bunch were like 'ax -> axe', and a number were latinate plurals (caesura -> caesurae) or like 'franchise -> franchisee'.

I did like 'butt -> butte' and 'rot -> rote'.

rantingnerd said...

Kevin: I didn't want to assume that the word list was already sorted!