[ILUG-BOM] Re: Duplicates in a txt file using perl

Philip S Tellis philip.tellis@[EMAIL-PROTECTED]
Tue Oct 9 00:44:03 IST 2001

Sometime on Oct 8, Vikram Ojha assembled some asciibets to say:

> i have attended regex seminar with Dinesh shah and other collegues
> there u said abt finding duplicates in a text file

Ok, before I answer your question, I must ask you to please not include
irrelevant mails in your posts.  You included my mail about X crashing,
which is totally irrelevant here.

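Now, to your question.  This should work, using grep with extended
regexes (-E, so that + means "one or more") to find a word repeated on
one line:

grep -E "\<([[:alpha:]]+)\>[^[:alpha:]]+\1\>"

The trailing \> stops a pair like "the then" from matching.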
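
To see that kind of pattern in action, here is a small sketch.  It
assumes GNU grep (backreferences and \< \> under -E are GNU
extensions), and find_doubled_words is just a name made up for this
example, not anything from the seminar:

```sh
# find_doubled_words: hypothetical helper name, used only for illustration.
# Prints every input line that contains the same word twice in a row.
# Assumes GNU grep: \< \> and backreferences with -E are GNU extensions.
find_doubled_words() {
    grep -E '\<([[:alpha:]]+)\>[^[:alpha:]]+\1\>' "$@"
}

# Three made-up sample lines; only the first has a doubled word.
printf '%s\n' 'this is is a test' 'the then case' 'nothing doubled here' |
    find_doubled_words
# prints: this is is a test
```

The match is case sensitive, so "The the" would slip through as it
stands.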

If you need to check for words across lines, then use sed (put your
own pattern in place of regex):

sed -ne "h;n;x;G;/regex/p;x;"

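As written, this joins the lines in pairs (1 and 2, 3 and 4, ...), so
a word repeated across lines 2 and 3 would be missed.  It may require
some additions, but I'll have to actually try it to know for sure.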
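
One possible variation, sketched here rather than taken from the
command above, is the usual N/P/D sliding window, which looks at every
pair of neighbouring lines.  It assumes GNU sed (for \< and \>), and
find_doubled_across_lines is a made-up name:

```sh
# find_doubled_across_lines: hypothetical helper name, for illustration only.
# Prints each pair of adjacent lines where the last word of the first line
# is repeated as the first word of the second. Assumes GNU sed (\< and \>).
find_doubled_across_lines() {
    # $!N  append the next line (unless we are already on the last one)
    # /../p print the two-line window if a word straddles the newline
    # D    drop the first line of the window and start over with the rest
    sed -n '$!N; /\<\([[:alpha:]]\{1,\}\)[^[:alpha:]]*\n[^[:alpha:]]*\1\>/p; D' "$@"
}

# Made-up sample input: "the" ends line 1 and starts line 2.
printf '%s\n' 'paris in the' 'the spring' 'is here' |
    find_doubled_across_lines
```

With that sample input it should print the first two lines and skip
the third.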

In Perl, just slurp the entire file ($/=undef), and do a single-line
match (m//s).


