Recovering from a bad Kerio Mailserver crash


2012/07/24

Recently a large (over 900 users) Kerio Connect customer suffered a very bad RAID failure. The initial symptom was that the server began running very slowly, but then we started noticing data corruption as well. For example, there were empty .eml files in the queue, and some necessary files were missing. There were log entries about problems in users' mailboxes; the system was obviously failing. The decision was made to bring up a new server.

However, because of the failing RAID, they couldn't do what you'd normally do: quickly transfer the Store directory, or restore it from backup with kmsrecover. Instead, they decided to bring up an "empty" server - that is, just transfer the configuration files and point them at an empty Store directory.

That would at least restore basic functionality. Users would be able to receive and send email, though their old mail would be (temporarily) unavailable.

Recovery

Of course, eventually they would want to recover the old store. It was Friday afternoon when they put up the new server; they expected to have the old store data available by early Monday morning.

This creates a bit of a problem, though. If the new Store is (for example) at E:/NEWSTORE, you can't plop the old store down on top of that without losing all the email that came in after Friday. Nor is it feasible to repoint the server at the recovered old store and then just copy .eml files from the new store into the matching folders, because individual messages could have name clashes: you could easily have a "00000001.eml" file in a user's INBOX in both the old store and the new, and copying would overwrite one of them.

Kerio does provide a simple solution for this problem. It's not perfect, but it is workable. Basically, you can copy .eml files to a user's root directory in the store.

So, given a user "tony":

copy E:/NEWSTORE/mail/xyz.org/tony/INBOX/#msgs/*    F:/STORE/mail/xyz.org/tony/
 

Note that you are NOT copying to the old inbox - the server will notice the messages in F:/STORE/mail/xyz.org/tony/ and merge them in correctly.

You'd do that AFTER the mail server was happily pointed at its recovered Store. There's no need to stop the server.
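For more than a handful of users you would probably script that copy. Here is a minimal Python sketch of the same idea. The function name is made up for illustration, limiting the copy to *.eml files is my assumption (the copy command above takes everything in #msgs), and the refusal to overwrite is just a safety check, not something Kerio requires.

```python
import shutil
from pathlib import Path


def drop_messages_into_user_root(new_inbox_msgs, recovered_user_root):
    """Copy .eml files from the temporary store's INBOX/#msgs directory into
    the user's root directory in the recovered store.

    Returns the list of file names copied, in sorted order.
    """
    new_inbox_msgs = Path(new_inbox_msgs)
    recovered_user_root = Path(recovered_user_root)
    copied = []
    for src in sorted(new_inbox_msgs.glob("*.eml")):
        dest = recovered_user_root / src.name
        if dest.exists():
            # Illustrative safety check (an assumption, not a Kerio rule):
            # never overwrite a file already waiting in the user's root.
            raise FileExistsError(str(dest))
        shutil.copy2(src, dest)  # copy2 also carries over file timestamps
        copied.append(src.name)
    return copied
```

For the example above you would call it with "E:/NEWSTORE/mail/xyz.org/tony/INBOX/#msgs" and "F:/STORE/mail/xyz.org/tony", once per user.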

Other issues

However, if users have done work during the recovery period, you have a more difficult job. For example, a user may want copies of their Sent Items.

For that, you'd need to copy the new .eml files into the Sent Items folder yourself, watching out for overwrites caused by file name conflicts. You'd then need to tell the server to rebuild the user's index.
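The "watching out for overwrites" part is easy to get wrong by hand. The sketch below copies only the files whose names are free in the destination and reports the rest, so you can decide what to do with the clashes. The function name is hypothetical, and it deliberately does not rename anything or touch the index; the rebuild is still a separate step.

```python
import shutil
from pathlib import Path


def copy_without_overwrite(src_msgs, dest_msgs):
    """Copy .eml files from src_msgs to dest_msgs, skipping any file whose
    name already exists in the destination.

    Returns a tuple (copied, clashes) of sorted file-name lists.
    """
    src_msgs = Path(src_msgs)
    dest_msgs = Path(dest_msgs)
    copied, clashes = [], []
    for src in sorted(src_msgs.glob("*.eml")):
        dest = dest_msgs / src.name
        if dest.exists():
            # Name conflict: leave both files untouched and report it.
            clashes.append(src.name)
            continue
        shutil.copy2(src, dest)  # keeps the original file timestamps
        copied.append(src.name)
    return copied, clashes
```

Anything that comes back in the clashes list exists under the same name in both stores and needs to be dealt with by hand.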

It is best that nothing is accessing the mailboxes (smartphones, Outlook, webmail, etc.) when you do this. Ideally you'd stop the server before doing any of it, but if you get the particular user completely logged out and then rebuild the indexes, it should be OK to do while the server is running. The potential problem here is caching and synchronization confusion.

Caveats

Something to keep in mind if you have to copy a Store directory elsewhere: ownerships, permissions and time stamps need to be retained. Additionally, there are a number of "." files - files whose names begin with a period. Under Linux and Mac OS X, you won't see those files unless you use "ls -a", but they need to be copied too. With tools like rsync, use the "-a" (archive) flag and copy by directory, not by wildcard, because the shell's "*" does not match dot files. Do "rsync -a thisdir thatplace", not "rsync -a thisdir/* thatplace/thisdir".
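After a copy like that, it is worth checking that nothing got lost. The following is a rough Python sketch of such a check, not a Kerio tool: the function name is invented, and it only looks at presence, mode, owner and modification time (to the second). Ownership will only match if the copy was made by a user allowed to set it, which normally means root.

```python
import os
from pathlib import Path


def compare_trees(original, copy):
    """Walk `original` and report entries that are missing from `copy` or
    whose mode, owner/group, or modification time differ.

    Dot files are included, because os.walk does not hide them.
    Returns a list of human-readable problem strings (empty if all is well).
    """
    original = Path(original)
    copy = Path(copy)
    problems = []
    for root, dirs, files in os.walk(original):
        for name in dirs + files:
            src = Path(root) / name
            rel = src.relative_to(original)
            dst = copy / rel
            if not os.path.lexists(dst):
                problems.append(f"missing: {rel}")
                continue
            a, b = src.lstat(), dst.lstat()
            if a.st_mode != b.st_mode:
                problems.append(f"mode differs: {rel}")
            if (a.st_uid, a.st_gid) != (b.st_uid, b.st_gid):
                problems.append(f"owner differs: {rel}")
            if int(a.st_mtime) != int(b.st_mtime):
                problems.append(f"mtime differs: {rel}")
    return problems
```

An empty list means every file and directory in the original, hidden ones included, has a counterpart in the copy with the same mode, owner and time stamp.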

There is also a "settings" directory that can contain important files. This is just under the directory that contains your config files: /opt/kerio/mailserver/settings on Linux. One thing that could be there is a .pop3.db file that maintains knowledge of pop3 downloads (if you use that). Settings is likely empty, but check.





2 comments




© Anthony Lawrence







Mon Sep 17 04:46:03 2012: 11324   Rick



An "Oops! Look what happened there!" effect of dropping mail into a user's top-level folder for Kerio to sort is that all of that mail will then have the date and time of the "drop" (or more precisely, the sort) and not the original timestamp. If that isn't an issue, though, this is a marvelous technique.





Mon Sep 17 09:31:07 2012: 11325   TonyLawrence



Yes, that is true. In the case of a mess like the one described above, it was acceptable.
