09:33 am, 26 Nov 04
removing duplicates using formail
An offlineimap hiccup from a long time ago put two copies of every message in my inbox. formail has a duplicate filter (-D): it stores every Message-ID in a cache file, and when it sees a Message-ID that is already in the cache, it returns success. If you add the -s (split mail) flag, it outputs each message only once and drops the duplicates.
So, with 10000000 as the maximum cache size and cache as the name of the cache file:
formail -D 10000000 cache -s cat < mbox > mbox2
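Before swapping mbox2 into place, a quick sanity check can help. The following is only a rough sketch, not part of formail: count_dup_ids is a hypothetical helper name, and it assumes each Message-ID header fits on one line. Header lines quoted inside message bodies (in bounces, say) are counted too, so treat the number as an estimate.

```shell
# Rough count of Message-ID values that appear more than once in an mbox file.
# Assumption: Message-ID headers are not folded across lines.
count_dup_ids() {
    grep -i '^Message-ID:' "$1" |
        sed 's/^[^:]*:[[:space:]]*//' |  # keep only the id itself
        sort |
        uniq -d |                        # one line per repeated id
        wc -l |
        tr -d ' '                        # some wc builds pad with spaces
}
```

Running count_dup_ids mbox should print something above zero for a doubled mailbox, and count_dup_ids mbox2 should print 0 if the filter did its job.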
Then, to keep offlineimap from copying all of your local duplicates back up to the server, rm your local copy of this mailbox and every mention of it in ~/.offlineimap and its subdirectories. The next run will do a full download of that mailbox.
(No work today, so it's mail-cleaning time: adding more data to the spamassassin bayes learner, removing duplicates, and next I'll try that bounce-filtering scheme y'all suggested.)
I never did get bayes working. I think I'll go read up on that again.
i get a few "returned mail: user unknown" messages a day, from stupid mail servers that think the spam with my address forged in the From: header is actually mail from me.
the tactic for fixing it is to add an extra header to all outgoing mail, and then procmail away all bounce messages that don't contain that header. (bounce messages contain all the headers of the original message, y'know?)
ugh, i hate mail.
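The header-tagging tactic described above could be sketched roughly like this in shell. Everything named here is an assumption for illustration: the X-Outgoing-Token header and its value are made up, is_forged_bounce is a hypothetical helper, and spotting bounces by a MAILER-DAEMON sender is only a heuristic. A real setup would do the equivalent test inside the mail filter itself.

```shell
# Hypothetical marker; assumes outgoing mail is configured to add this exact line.
MARKER='X-Outgoing-Token: s3cret-example'

# Exit 0 when the message in file "$1" looks like a bounce
# but does not quote the marker header anywhere.
is_forged_bounce() {
    # The header block is everything up to the first empty line.
    sed '/^$/q' "$1" | grep -qi '^From:.*MAILER-DAEMON' || return 1
    # A bounce of mail we really sent should quote our headers, marker included.
    if grep -qF -- "$MARKER" "$1"; then
        return 1
    fi
    return 0
}
```

A message for which is_forged_bounce succeeds is a candidate for the junk folder; anything else is left alone.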
You must get scads of spam. Do you rely solely on the bayes filter, or do you have static rules as well? Any non-standard rulesets? I'm really interested in how you've got your mail server set up.