Evan Martin (evan) wrote in evan_tech,

removing duplicates using formail

An offlineimap hiccup a long time ago left two copies of every message in my inbox. formail has a duplicate filter (-D): it stores every Message-ID in a cache file, and when it sees a Message-ID that is already in the cache, it returns success. If you add the -s (split mail) flag, it instead outputs each message only once. -D takes two arguments: the approximate maximum size of the cache file and the name of the cache file.
So:
formail -D 10000000 cache -s cat < mbox > mbox2
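
A rough way to sanity-check the result is to compare message counts before and after. The helpers below are only a sketch and are not part of formail: the function names are made up for illustration, and they assume a plain mbox file where each message starts with a "From " separator line and has its Message-ID on a single header line. A body line that happens to start with "Message-ID:" would be counted too, so treat the numbers as approximate.

```shell
#!/bin/sh
# Hypothetical helpers for a before/after check (not part of formail).

# Count messages by counting mbox "From " separator lines.
count_messages() {
    grep -c '^From ' "$1"
}

# Count distinct Message-ID values. The header name is matched
# case-insensitively; the value is taken as the second field.
count_unique_ids() {
    awk 'tolower($1) == "message-id:" && !seen[$2]++ { n++ }
         END { print n + 0 }' "$1"
}
```

With the file names from the command above, `count_messages mbox` should be about twice `count_unique_ids mbox`, and `count_messages mbox2` should roughly match the unique count.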

Then, to keep offlineimap from copying all of your local duplicates back up to the server, rm your local copy of this mailbox and every mention of it in ~/.offlineimap and its subdirectories. The next run will then download that mailbox in full.
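
Before deleting anything, it may help to list what would be affected. This is a minimal sketch, assuming the mailbox name appears in the file or directory names under ~/.offlineimap; the function name is hypothetical, and it only prints matches rather than removing them.

```shell
#!/bin/sh
# Hypothetical helper: list (do not delete) entries under the offlineimap
# state directory whose names contain the given mailbox name.
# The second argument is optional and defaults to ~/.offlineimap.
list_mailbox_mentions() {
    mailbox=$1
    state_dir=${2:-"$HOME/.offlineimap"}
    find "$state_dir" -name "*${mailbox}*" -print
}
```

For example, `list_mailbox_mentions INBOX` prints the candidate paths; review them by hand before running rm.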

(No work today, so it's mail-cleaning time: adding more data to the spamassassin bayes learner, removing duplicates, and next I'll try that bounce-filtering scheme y'all suggested.)