[BBLISA] rsync vs dump for in-use files?
Tom Metro
tmetro+bblisa at vl.com
Sun Sep 30 13:33:25 EDT 2007
Edward Ned Harvey wrote:
> I suppose you could attempt LVM snapshots, but that's crappy at best.
I haven't gone looking for one yet, but I've yet to run across an account of
anyone actually using LVM snapshots. I'm sure it's happening, but it
doesn't seem to be popular.
> ...or some other device with a file system more intelligent than
> EXT3, which is able to do filesystem snapshots...
Supposedly XFS supports snapshots, though I haven't looked into how they
are implemented.
ZFS has been recently discussed on this list. I recently read about
Btrfs[1], which is a ZFS-like file system developed by Oracle that's
being contributed to Linux, but neither is ready for production use on
Linux.
The previously mentioned Netapp is your best option if you need properly
implemented snapshots and can afford it. Solaris or FreeBSD using ZFS
would be the low cost route.
1. http://www.sdtimes.com/article/LatestNews-20070801-43.html
> Imagine you have a program, such as mysqld, which opens files read-write,
> and keeps them that way through the entire operation of the process. At no
> time is a complete file ever written, and at no time is the file ever
> closed. It is 100% impossible to backup that file, any more recently than
> it was opened. But filesystems that do snapshotting (ZFS, Netapp, some
> others) can at least allow you to backup the file as it was, just before
> the most recent time it was opened.
Are you sure that's how it works?
As far as a single file is concerned, inconsistencies come about due to
buffers not being flushed to disk, so you might have a transaction half
written to disk and half in the buffer. But it doesn't matter whether
the file is left open - you'll still be able to get what has been
written to disk since the file was opened.
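To illustrate the point, here is a minimal Python sketch (not taken from any real database; the function names and the sample bytes are made up for the example). It keeps a file open for writing the whole time and checks what a separate reader, standing in for a backup tool, can see before and after the writer flushes its buffer. It assumes the write is small enough to fit in Python's default user-space buffer.

```python
import os
import tempfile


def visible_bytes(path):
    """Return what an independent reader (e.g. a backup tool) sees in the file."""
    with open(path, "rb") as reader:
        return reader.read()


def demo_flush_visibility():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    writer = open(path, "wb")  # opened once, stays open during both "backups"
    try:
        writer.write(b"BEGIN;row1;")   # small write: held in the process's buffer
        before = visible_bytes(path)   # a backup taken now misses the write
        writer.flush()                 # hand the buffered bytes to the OS
        after = visible_bytes(path)    # a backup taken now sees them
        return before, after
    finally:
        writer.close()
        os.remove(path)


if __name__ == "__main__":
    print(demo_flush_visibility())
```

The file is never closed between the two reads; only the flush changes what the reader gets.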
The bigger problem that leads to the desire for snapshotting is
consistency among multiple files. It's easy to see how a database might
be writing related information to multiple files (like data to the
database file and transaction logging to another file), and in a
traditional backup, time will have passed between when the first
file is copied and when the last file in the set is copied.
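The skew is easy to reproduce in a small simulation. The sketch below is illustrative only: the file names data.db and txn.log and the record format are hypothetical, and the "application" is just two appends that happen between the two copy steps of a file-by-file backup.

```python
import os
import shutil
import tempfile


def append(path, text):
    with open(path, "a") as f:
        f.write(text)


def read(path):
    with open(path) as f:
        return f.read()


def demo_skewed_backup():
    src = tempfile.mkdtemp()
    dst = tempfile.mkdtemp()
    data = os.path.join(src, "data.db")  # hypothetical database file
    log = os.path.join(src, "txn.log")   # hypothetical transaction log
    try:
        append(data, "row1\n")
        append(log, "commit row1\n")

        shutil.copy(data, dst)           # backup copies the first file...
        append(data, "row2\n")           # ...the application commits meanwhile...
        append(log, "commit row2\n")
        shutil.copy(log, dst)            # ...then the backup copies the second file

        return (read(os.path.join(dst, "data.db")),
                read(os.path.join(dst, "txn.log")))
    finally:
        shutil.rmtree(src)
        shutil.rmtree(dst)


if __name__ == "__main__":
    print(demo_skewed_backup())
```

The backed-up log mentions row2 while the backed-up data file does not contain it. A snapshot captures both files at the same instant, so this particular mismatch cannot occur.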
You can address single-file inconsistency, whether you have snapshotting
or not, with the cooperation of the application - having it flush its
buffers or momentarily close its files. Snapshotting helps by permitting
this application downtime to be kept to a minimum.
I believe for well-designed applications, like a database with a binary
transaction log, if the log and DB file are captured simultaneously, it
doesn't matter if the buffers have been fully flushed on the DB file, as
the transaction log (which gets flushed frequently by design) will
correctly reflect the transactions that have been successfully written
to disk. So in this case snapshotting permits you to avoid any downtime,
even the few seconds it takes to make the snapshot.
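As a rough sketch of why that works, here is a toy log replay in Python. It is not how MySQL or any particular database implements its log: the JSON-lines format, the function names, and the in-memory dict standing in for the DB file are all assumptions for illustration. Each commit is fsynced to the log, and recovery replays every complete record onto a stale copy of the data, ignoring a final record that was cut off mid-write.

```python
import json
import os
import tempfile


def commit(log_path, key, value):
    """Append one transaction to the log and force it to disk before returning."""
    with open(log_path, "a") as log:
        log.write(json.dumps({"key": key, "value": value}) + "\n")
        log.flush()
        os.fsync(log.fileno())


def recover(data, log_path):
    """Replay every complete log record onto a possibly stale copy of the data."""
    restored = dict(data)
    with open(log_path) as log:
        for line in log:
            if not line.endswith("\n"):
                break  # torn final record: that transaction never committed
            record = json.loads(line)
            restored[record["key"]] = record["value"]
    return restored


def demo_replay():
    fd, log_path = tempfile.mkstemp()
    os.close(fd)
    try:
        commit(log_path, "alice", 10)
        commit(log_path, "bob", 20)
        with open(log_path, "a") as log:
            log.write('{"key": "carol"')  # record cut off at snapshot time
        stale_data = {"alice": 10}        # DB state captured before "bob" was flushed
        return recover(stale_data, log_path)
    finally:
        os.remove(log_path)


if __name__ == "__main__":
    print(demo_replay())
```

Because replaying a record just sets a key to a value, applying a record the data copy already reflects is harmless, which is what lets the data file lag behind the log.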
> No matter what you do, if people (or processes) keep their files open
> indefinitely and never close, the most recent changes to that file are at
> risk.
What's still in memory, yes.
Fortunately UNIX-like systems don't have to jump through hoops to get
around file locking, which tends to thwart backups on Windows.
-Tom
--
Tom Metro
Venture Logic, Newton, MA, USA
"Enterprise solutions through open source."
Professional Profile: http://tmetro.venturelogic.com/