Discussion:
Swap on ZFS
David Noel
2014-05-06 19:21:43 UTC
Is swap on ZFS still ill-advised? All the forum, list, and blog posts
I find say it's a no-go. Is this still the case? The idea behind it
not working is that ZFS needs memory to write to disk, so when you
need to swap (are low on memory) ZFS won't be able to write.

I found some talk of having a tunable added as a workaround that would
reserve a certain amount of memory for ZFS so this wouldn't be a
problem, but have no idea if anyone's made any progress towards
implementing it.
Daniel Staal
2014-05-06 23:01:13 UTC
Post by David Noel
Is swap on ZFS still ill-advised? All the forum, list, and blog posts
I find say it's a no-go. Is this still the case? The idea behind it
not working is that ZFS needs memory to write to disk, so when you
need to swap (are low on memory) ZFS won't be able to write.
I found some talk of having a tunable added as a workaround that would
reserve a certain amount of memory for ZFS so this wouldn't be a
problem, but have no idea if anyone's made any progress towards
implementing it.
--As for the rest, it is mine.

I haven't seen anything official. It has only been a couple of months
since I had a problem that locked up my machine for a day before I could
fix it and give it some *other* swap, and I don't think I've seen any major
updates since then that didn't have to do with OpenSSL. So I'll say: Yes.

That said, if you don't use swap heavily it works: I didn't have trouble
for several years before I started adding to the box's load and ran it out
of RAM. Just make sure you have enough RAM to cover all normal
requirements, and that extreme cases aren't very common or very extreme.
Otherwise, give it a small dedicated swap drive/partition. (I put it on
the same drive as the ZIL - the ZIL has a maximum size that depends on
your RAM, so there was plenty of space on the SSD that would never be used.)

Daniel T. Staal

---------------------------------------------------------------
This email copyright the author. Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes. This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---------------------------------------------------------------
Rolf Nielsen
2014-05-07 00:25:47 UTC
Post by David Noel
Is swap on ZFS still ill-advised? All the forum, list, and blog
posts I find say it's a no-go. Is this still the case? The idea
behind it not working is that ZFS needs memory to write to disk, so
when you need to swap (are low on memory) ZFS won't be able to
write.
I found some talk of having a tunable added as a workaround that
would reserve a certain amount of memory for ZFS so this wouldn't
be a problem, but have no idea if anyone's made any progress
towards implementing it.
Just out of curiosity, why do you want it?

To get swap on ZFS, you first need to create a ZFS filesystem on one
or more devices, then you create a dedicated volume inside that
filesystem and use that dedicated volume as swap. To me that seems to
add unnecessary complexity, similar to using a file backed md device
as swap. Please don't take this as criticism; you may very well have
good reasons for wanting to do this. I'm just curious about those reasons.

Rolf Nielsen
Daniel Staal
2014-05-07 00:56:48 UTC
Post by Rolf Nielsen
Post by David Noel
Is swap on ZFS still ill-advised? All the forum, list, and blog
posts I find say it's a no-go. Is this still the case? The idea
behind it not working is that ZFS needs memory to write to disk, so
when you need to swap (are low on memory) ZFS won't be able to
write.
I found some talk of having a tunable added as a workaround that
would reserve a certain amount of memory for ZFS so this wouldn't
be a problem, but have no idea if anyone's made any progress
towards implementing it.
Just out of curiosity, why do you want it?
To get swap on ZFS, you first need to create a ZFS filesystem on one
or more devices, then you create a dedicated volume inside that
filesystem and use that dedicated volume as swap. To me that seems to
add unnecessary complexity, similar to using a file backed md device
as swap. Please don't take this as criticism; you may very well have
good reasons for wanting to do this. I'm just curious about those reasons.
--As for the rest, it is mine.

Because it's actually simpler than the alternative, in many cases.
Creating the ZFS filesystem is 'free' here - you are only planning on
doing this if you are already running a ZFS-based system, so you're already
creating the filesystem. That leaves creating the dedicated volume inside
it and using it as swap - which is as easy as or easier than formatting and
using a dedicated disk as swap.

So, really using swap on ZFS is no harder or easier than using a dedicated
swap disk, and no more complex. And if you *aren't* planning on a
dedicated swap disk, it starts adding complexity: If you aren't using a
dedicated swap disk, then you're probably sharing it with a disk that
you'll be using in the ZFS filesystem - which means you now need to format
and partition that disk, which you didn't need to do before. You also have
to monitor and remember that the disk is partitioned, if you ever have to
replace it. (Which otherwise ZFS would make easy - just swap in a new one,
and tell ZFS to use it to replace the failed disk.)

So your steps are:
1. Create ZFS filesystem.
2. Create swap inside filesystem.
3. Configure FreeBSD to use swap.

vs.
1. Partition Disks.
2. Set up Swap partition.
3. Configure FreeBSD to use swap.
4. Create ZFS filesystem on other partition.
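
As a rough illustration of the two step lists above, here is a minimal
Python sketch that only builds the command lines for each approach and does
not run anything. The pool name `tank`, the disk `ada0`, and the 4G size are
assumptions made up for the example; the commands are the usual FreeBSD
`zpool`, `zfs`, `gpart`, and `swapon` invocations, but check the man pages
for your release before relying on them.

```python
import shlex

# Sketch only: builds the commands, never executes them.
# "tank", "ada0", and "4G" are placeholder values, not from the thread.


def swap_on_zvol(pool="tank", disk="ada0", size="4G"):
    """Whole disk goes to ZFS; swap is a volume (zvol) inside the pool."""
    vol = f"{pool}/swap"
    return [
        ["zpool", "create", pool, disk],     # 1. create ZFS pool on the disk
        ["zfs", "create", "-V", size, vol],  # 2. create the swap volume
        ["swapon", f"/dev/zvol/{vol}"],      # 3. tell FreeBSD to use it
    ]


def swap_on_partition(pool="tank", disk="ada0", size="4G"):
    """Disk is partitioned; swap and ZFS each get their own partition."""
    return [
        ["gpart", "create", "-s", "gpt", disk],                     # 1. partition
        ["gpart", "add", "-t", "freebsd-swap", "-s", size, disk],   # 2. swap part
        ["swapon", f"/dev/{disk}p1"],                               # 3. use swap
        ["gpart", "add", "-t", "freebsd-zfs", disk],                # 4. ZFS part
        ["zpool", "create", pool, f"{disk}p2"],                     #    and pool
    ]


if __name__ == "__main__":
    for plan in (swap_on_zvol(), swap_on_partition()):
        print("\n".join(shlex.join(cmd) for cmd in plan))
        print()
```

Neither list makes the swap device persistent across reboots; that part
(for example an /etc/fstab entry) is left out of the sketch.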

Note of course that one of the points of using ZFS is the ease and
flexibility of creating volumes inside it - a ZFS user is probably creating
several at setup, and the swap volume isn't all that different to create.
And again, with a partitioned disk you're giving up the ability to use ZFS
to manage the device on the fly, which is one of ZFS's best benefits.

Daniel T. Staal

Rolf Nielsen
2014-05-07 04:20:49 UTC
Post by Daniel Staal
Post by Rolf Nielsen
Post by David Noel
Is swap on ZFS still ill-advised? All the forum, list, and
blog posts I find say it's a no-go. Is this still the case? The
idea behind it not working is that ZFS needs memory to write to
disk, so when you need to swap (are low on memory) ZFS won't be
able to write.
I found some talk of having a tunable added as a workaround
that would reserve a certain amount of memory for ZFS so this
wouldn't be a problem, but have no idea if anyone's made any
progress towards implementing it.
Just out of curiosity, why do you want it?
To get swap on ZFS, you first need to create a ZFS filesystem on
one or more devices, then you create a dedicated volume inside
that filesystem and use that dedicated volume as swap. To me that
seems to add unnecessary complexity, similar to using a file
backed md device as swap. Please don't take this as criticism;
you may very well have good reasons for wanting to do this. I'm
just curious about those reasons.
--As for the rest, it is mine.
Because it's actually simpler than the alternative, in many cases.
Creating the ZFS filesystem is 'free' here - you are only
planning on doing this if you are already running a ZFS-based
system, so you're already creating the filesystem. That leaves
creating the dedicated volume inside that and using it as swap -
which is as easy or easier than formatting and using a dedicated
disk as swap.
So, really using swap on ZFS is no harder or easier than using a
dedicated swap disk, and no more complex. And if you *aren't*
planning on a dedicated swap disk, it starts adding complexity: If
you aren't using a dedicated swap disk, then you're probably
sharing it with a disk that you'll be using in the ZFS filesystem -
which means you now need to format and partition that disk, which
you didn't need to do before. You also have to monitor and
remember that the disk is partitioned, if you ever have to replace
it. (Which otherwise ZFS would make easy - just swap in a new one,
and tell ZFS to use it to replace the failed disk.)
So your steps are: 1. Create ZFS filesystem. 2. Create swap inside
filesystem. 3. Configure FreeBSD to use swap.
vs. 1. Partition Disks. 2. Set up Swap partition. 3. Configure
FreeBSD to use swap. 4. Create ZFS filesystem on other partition.
Note of course that one of the points of using ZFS is the ease and
flexibility of creating volumes inside it - a ZFS user is probably
creating multiple at setup, and the swap volume isn't all that
different to create. And again, you're giving up the ability to use
ZFS to manage the device on the fly, which is one of ZFS's best
benefits.
Daniel T. Staal
I'm not referring to the complexity of doing the setup, but to the
added layer (a volume on a filesystem on a disk vs. a partition on a
disk). I use ZFS for data storage, but I don't use any zvols. And my /
is on UFS on a "dangerously dedicated" 40GB SSD that also has the swap
partition.

If I want to talk to my mother, I call my mother and talk to her. I
don't call my sister and have her call my mother and relay everything.
And for the same reason, I don't see why I should put a filesystem or
swap on a volume on a filesystem.

Rolf
Andrew Berg
2014-05-07 04:37:55 UTC
Post by Rolf Nielsen
If I want to talk to my mother, I call my mother and talk to her. I
don't call my sister and have her call my mother and relay everything.
And for the same reason, I don't see why I should put a filesystem or
swap on a volume on a filesystem.
By that logic, you should talk to her in person and not relay your voice
over the phone.

zvols are far more flexible than partitions and have the added benefit
of COW (cheap snapshots and clones anyone?) and checksums
underneath. Instantly getting more space in your zpool by cutting
down unneeded swap would be quite nice.
Rolf Nielsen
2014-05-07 05:24:06 UTC
Post by Andrew Berg
Post by Rolf Nielsen
If I want to talk to my mother, I call my mother and talk to her.
I don't call my sister and have her call my mother and relay
everything. And for the same reason, I don't see why I should put
a filesystem or swap on a volume on a filesystem.
By that logic, you should talk to her in person and not relay your
voice over the phone.
Unfortunately she lives 450 kilometres away.
Post by Andrew Berg
zvols are far more flexible than partitions and have the added
benefit of COW (cheap snapshots and clones anyone?) and checksums
underneath. Instantly getting more space in your zpool by cutting
down unneeded swap would be quite nice.
What would be the point of COW, snapshots and clones of swap?
Andrew Berg
2014-05-07 05:27:16 UTC
Post by Rolf Nielsen
What would be the point of COW, snapshots and clones of swap?
None, but the flexibility with respect to size is what people want. Start
big, track usage, and resize as desired.
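
A minimal sketch of the "track usage, then resize" idea, assuming the usual
column layout of `swapinfo -k` output (device, 1K-blocks, used, avail,
capacity). The sample text, the device name, the 2x headroom factor, and the
1 GiB floor are all invented for illustration; only the parsing and the
arithmetic are shown, nothing is resized.

```python
# Sketch only: parses text shaped like `swapinfo -k` output and suggests a
# new volume size. Sample data, headroom, and floor are assumptions.

SAMPLE = """\
Device          1K-blocks     Used    Avail Capacity
/dev/zvol/tank/swap   8388608   524288  7864320     6%
"""


def parse_swapinfo(text):
    """Return a list of (device, total_kib, used_kib) tuples."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip the header line and any "Total" summary line.
        if len(fields) < 5 or not fields[0].startswith("/dev/"):
            continue
        rows.append((fields[0], int(fields[1]), int(fields[2])))
    return rows


def suggest_volsize_kib(peak_used_kib, headroom=2.0, floor_kib=1024 * 1024):
    """Peak usage times a headroom factor, never below the floor."""
    return max(int(peak_used_kib * headroom), floor_kib)


if __name__ == "__main__":
    for device, total, used in parse_swapinfo(SAMPLE):
        print(device, total, used, suggest_volsize_kib(used))
```

The suggested figure would then be applied by hand, typically by turning the
swap device off, changing the volume's `volsize` property with `zfs set`,
and turning swap back on.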
krad
2014-05-07 09:49:46 UTC
The checksumming of swap is quite a big deal in my opinion. You have your
nice big server with ECC RAM, but you do need to page some stuff in and
out. Now you go and get some corruption on your disks, which you then feed
back into your RAM, making the ECC irrelevant.
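
To make the point concrete, here is a toy Python sketch of what a checksum
buys on page-in: the page is stored together with a checksum, and a read
that no longer matches is rejected instead of being handed back to RAM.
SHA-256 from `hashlib` is only a stand-in chosen for the example; ZFS has
its own checksum algorithms and on-disk layout, which this does not model.

```python
import hashlib

# Sketch only: a toy model of checksummed page-out / page-in.
# SHA-256 is a stand-in, not what a swap zvol necessarily uses.


def write_page(page: bytes):
    """'Page out': return the stored bytes plus their checksum."""
    return page, hashlib.sha256(page).digest()


def read_page(stored: bytes, checksum: bytes) -> bytes:
    """'Page in': verify before handing the data back to memory."""
    if hashlib.sha256(stored).digest() != checksum:
        raise IOError("checksum mismatch: refusing to return corrupted page")
    return stored


page = bytes(4096)                  # one zero-filled 4 KiB page
stored, csum = write_page(page)

corrupted = bytearray(stored)
corrupted[100] ^= 0x01              # simulate a single flipped bit on disk
```

Without the check, the flipped bit would be read back silently, which is
exactly the case where ECC RAM cannot help.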
Post by Andrew Berg
Post by Rolf Nielsen
What would be the point of COW, snapshots and clones of swap?
None, but the flexibility with respect to size is what people want. Start big,
track usage, and resize as desired.
David Noel
2014-05-12 12:01:17 UTC
Post by krad
The checksumming of swap is quite a big deal in my opinion. You have your
nice big server with ECC RAM, but you do need to page some stuff in and
out. Now you go and get some corruption on your disks, which you then feed
back into your RAM, making the ECC irrelevant.
Now that, I think, is a very interesting point, and one I hadn't
considered. I see a great deal of value in having checksumming enabled
on a swap volume/partition. What is the default behavior when I create
a freebsd-swap partition with gpart, then hand it to the OS with
swapon? Is there any sort of error correction mechanism in place? That
alone would make a very strong case for swap on ZFS.

However, I don't see any way around the rumored issue of the system
hanging every time all my RAM is in use and I need to hit swap. Is
this still the default behavior? The wiki says yes. I suppose I could
test it myself fairly easily. But what would the solution be? If I
test it and it hangs I'd like to be able to suggest a solution in the
PR. Would adding a tunable that allowed me to reserve x MB for ZFS be
the solution? Out of curiosity, what would that value need to be set
to? How much memory does ZFS need available to write to disk?
Daniel Staal
2014-05-16 16:35:31 UTC
Post by David Noel
However, I don't see any way around the rumored issue of the system
hanging every time all my RAM is in use and I need to hit swap. Is
this still the default behavior? The wiki says yes. I suppose I could
test it myself fairly easily. But what would the solution be? If I
test it and it hangs I'd like to be able to suggest a solution in the
PR. Would adding a tunable that allowed me to reserve x MB for ZFS be
the solution? Out of curiosity, what would that value need to be set
to? How much memory does ZFS need available to write to disk?
--As for the rest, it is mine.

Having hit it a couple of times, I'd say it's a bit more complicated than
that description, but yeah, that's still the behavior from my experience.
Failure can be more complicated than 'hang' though, and triggering it can
be a bit hit-or-miss.

A tunable like that sounds like a good solution. I'm not sure what the
value would need to be set at though. I'm not even sure if it's a constant
or dependent on blocksize and other factors. I'm pretty sure the memory
needed per block is a defined amount someplace, but ZFS likes to group IO
where possible, so a 'safe' amount might vary by the amount of writes it
can have queued at any particular moment. (Which is a tunable.)

Daniel T. Staal


Daniel Staal
2014-05-07 20:35:49 UTC
Post by Andrew Berg
Post by Rolf Nielsen
If I want to talk to my mother, I call my mother and talk to her. I
don't call my sister and have her call my mother and relay everything.
And for the same reason, I don't see why I should put a filesystem or
swap on a volume on a filesystem.
By that logic, you should talk to her in person and not relay your voice
over the phone.
zvols are far more flexible than partitions and have the added benefit
of COW (cheap snapshots and clones anyone?) and checksums
underneath. Instantly getting more space in your zpool by cutting
down unneeded swap would be quite nice.
--As for the rest, it is mine.

ZFS also adds resiliency, in most use cases. Yes, you can set up normal
swap to use redundant disks, but I couldn't tell you how off the top of my
head, whereas redundancy is one of the base features of ZFS. I don't know
what happens when a swap disk fails in use, but I suspect it isn't pretty.
With ZFS you may not even have to shut the machine down to replace the disk.

(And of course you can go the *other* way: Add swap for a particular
situation, if needed. Even just for running a single job.)
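
A small sketch of that "swap for a single job" lifecycle, again only
building the command lines rather than running them. The pool `tank`, the
volume name `jobswap`, the 8G size, and the example job are hypothetical;
the `zfs`, `swapon`, and `swapoff` invocations are the standard ones, but
treat the whole thing as an outline rather than a tested procedure.

```python
# Sketch only: returns the ordered commands for temporary swap around a job.
# Pool, volume name, and size are placeholder values.


def temporary_swap_plan(job, pool="tank", name="jobswap", size="8G"):
    vol = f"{pool}/{name}"
    dev = f"/dev/zvol/{vol}"
    return [
        ["zfs", "create", "-V", size, vol],  # carve out the extra swap volume
        ["swapon", dev],                     # add it to the running system
        list(job),                           # run the memory-hungry job
        ["swapoff", dev],                    # take the swap away again
        ["zfs", "destroy", vol],             # give the space back to the pool
    ]


if __name__ == "__main__":
    for cmd in temporary_swap_plan(["make", "buildworld"]):
        print(" ".join(cmd))
```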

Daniel T. Staal
