Removing duplicate mails
To sync my mail between computers I use offlineimap on a secure filesystem. Today I mistakenly ran offlineimap before mounting the secure filesystem, which caused it to duplicate all emails. Not wanting to do any manual work to fix this, I wrote a small Python 3 program that repaired the damage.
import glob
import mailbox
import os.path


def dedupe(maildir):
    '''Removes duplicates from the given dir'''
    box = mailbox.Maildir(maildir, create=False)
    box.lock()
    try:
        seen = set()    # Set of Message IDs we have seen
        delete = set()  # Set of message keys to delete

        # Search for messages to delete
        for key, message in box.iteritems():
            mid = message['Message-Id']
            # Messages without a Message ID cannot be compared,
            # so never treat them as duplicates of each other
            if mid is None:
                continue
            # If we have seen this Message ID before,
            # remember it for deletion
            if mid in seen:
                delete.add(key)
            seen.add(mid)

        # Delete the messages
        for key in delete:
            box.remove(key)
    finally:
        # Flush and unlock even if something went wrong
        box.close()


# Iterate over all subdirectories as maildirs
for path in glob.glob('*'):
    if not os.path.isdir(path):
        continue
    dedupe(path)
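Since the program deletes mail, a dry run may be worth doing first. The following is a minimal sketch of that idea and was not part of what I originally ran: `find_duplicates` is a hypothetical helper name, and it assumes, like the program above, that two messages with the same Message-Id header are copies of each other. It only reads the maildir and reports which keys share an ID; it removes nothing.

```python
import mailbox


def find_duplicates(maildir):
    """Return {message_id: [keys]} for every Message ID seen more than once.

    Hypothetical read-only helper: nothing is deleted from the maildir.
    """
    box = mailbox.Maildir(maildir, create=False)
    by_id = {}
    try:
        for key, message in box.iteritems():
            mid = message['Message-Id']
            # Skip messages without a Message ID; they cannot be compared
            if mid is None:
                continue
            # str() so that unusual header objects can still be dict keys
            by_id.setdefault(str(mid), []).append(key)
    finally:
        box.close()
    # Keep only the IDs that occur on more than one message
    return {mid: keys for mid, keys in by_id.items() if len(keys) > 1}
```

Calling `find_duplicates('INBOX')` (the folder name is just an example) and printing the result should show how many messages would go before anything is actually removed.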
I just put this here so I wouldn’t lose it. Perhaps you will find some use for it too.