Discussion:
[Linux-cluster] DLM user API for lock value block
Jean-Marc Saffroy
2016-12-06 16:18:44 UTC
Hi,

I am trying to use the DLM userland API (libdlm3), and while I was able to
do plain lock acquisitions and conversions, I am stuck trying to update
and then read the lock value block.

Does anyone have working examples of this? I did look at the rhdlmbook
doc, but couldn't find one.

Attached is a messy test I wrote. It fails because up-converting a lock
with LKF_VALBLK set doesn't seem to overwrite the buffer I provide for the
lock value block (and with strace it looks like the kernel device returns
the LVB on a down-conversion! weird). Example output below.
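
For reference, here is a minimal sketch of the behaviour the test expects, written as a plain C model rather than against libdlm. It assumes the usual simplified rule for value blocks: with the value-block flag set, a down-conversion from PW or EX writes the holder's buffer to the resource, and an up-conversion copies the resource's LVB into the holder's buffer. The names `struct resource`, `struct holder`, `convert_valblk`, `lvb_demo`, and `LVB_LEN` are hypothetical and exist only for this illustration; the real kernel logic has more cases.

```c
#include <assert.h>
#include <string.h>

#define LVB_LEN 32  /* assumed LVB size for this model */

/* Lock modes in increasing strength (NL weakest, EX strongest). */
enum mode { NL, CR, CW, PR, PW, EX };

/* Hypothetical model types; these are NOT libdlm structures. */
struct resource { char lvb[LVB_LEN]; };
struct holder   { enum mode granted; char lvb[LVB_LEN]; };

/*
 * Convert a lock with the value-block flag set (simplified rule):
 *  - down-conversion from PW or EX: holder's buffer is written to the resource
 *  - up-conversion: the resource's LVB is copied into the holder's buffer
 * Returns 'W' for a write, 'R' for a read, '-' when the LVB is untouched.
 */
static char convert_valblk(struct resource *r, struct holder *h, enum mode to)
{
    char op = '-';

    assert(to >= NL && to <= EX);
    if (to < h->granted && h->granted >= PW) {
        memcpy(r->lvb, h->lvb, LVB_LEN);
        op = 'W';
    } else if (to > h->granted) {
        memcpy(h->lvb, r->lvb, LVB_LEN);
        op = 'R';
    }
    h->granted = to;
    return op;
}

/* Scenario: holder A publishes "52", then holder B up-converts and reads it. */
void lvb_demo(char out[LVB_LEN])
{
    struct resource res = { .lvb = "" };
    struct holder a = { .granted = NL, .lvb = "" };
    struct holder b = { .granted = NL, .lvb = "" };

    convert_valblk(&res, &a, PW);   /* A: NL -> PW reads the (empty) LVB */
    strcpy(a.lvb, "52");            /* A updates its local copy */
    convert_valblk(&res, &a, CR);   /* A: PW -> CR writes "52" to the resource */
    convert_valblk(&res, &b, PW);   /* B: NL -> PW copies "52" into B's buffer */
    memcpy(out, b.lvb, LVB_LEN);
}
```

Calling `lvb_demo(buf)` from a `main` and printing `buf` gives `52` in this model. The failing run above corresponds to the read step on up-conversion not happening.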


Cheers,
Jean-Marc
--
***@gmail.com

$ make D=1
gcc -D_REENTRANT -Wall -Werror -O0 -g locklvb.c -pthread -ldlm -lpthread -o locklvb

$ ./locklvb
dlm_kernel_version 6.0.1
create_lockspace
create_lockspace: Operation not permitted
open_lockspace
dlm_pthread_init
acquiring NL on MyLock...
LOCK mode -> NL convert 0
read_lvb 0 write_lvb 0
completion ast
entering loop on lock #1
count 0
LOCK mode -> PW convert 1
read_lvb 0 write_lvb 0
completion ast
init lvb => 51
lvb cache => 52
LOCK mode -> CR convert 1
read_lvb 0 write_lvb 1
completion ast
count 1
LOCK mode -> PW convert 1
read_lvb 1 write_lvb 0
completion ast
read lvb -1
locklvb: locklvb.c:177: do_lock: Assertion `lvb_lock.val >= 0' failed.
Aborted (core dumped)
David Teigland
2016-12-06 16:50:45 UTC
Post by Jean-Marc Saffroy
Hi,
I am trying to use the DLM userland API (libdlm3), and while I was able to
do plain lock acquisitions and conversions, I am stuck trying to update
and then read the lock value block.
Does anyone have working examples of this? I did look at the rhdlmbook
doc, but couldn't find one.
I haven't looked at your test to check if you're actually seeing this bug,
but you'll want this fix in any case:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/fs/dlm/user.c?id=b96f465035f9fae83c1d8de3e80eecfe6877608c

In the following lvmlockd code, you can see an example of working around
that bug if you don't have immediate access to a newer kernel:

https://git.fedorahosted.org/cgit/lvm2.git/tree/daemons/lvmlockd/lvmlockd-dlm.c

There are some other random userland tests here that use lvbs:

https://fedorapeople.org/cgit/teigland/public_git/dct-stuff.git/tree/dlm

Dave
--
Linux-cluster mailing list
Linux-***@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
Jean-Marc Saffroy
2016-12-06 18:02:58 UTC
Post by David Teigland
I haven't looked at your test to check if you're actually seeing this bug,
but you'll want this fix in any case:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/fs/dlm/user.c?id=b96f465035f9fae83c1d8de3e80eecfe6877608c
That's definitely the issue I have.
Post by David Teigland
In the following lvmlockd code, you can see an example of working around
that bug if you don't have immediate access to a newer kernel:
https://git.fedorahosted.org/cgit/lvm2.git/tree/daemons/lvmlockd/lvmlockd-dlm.c
Indeed, I have to work with not-so-recent distributions and their kernels,
so a workaround is much needed.

Adding a similar workaround in my test does help! But only with a single
process, because with more I quickly get a conversion deadlock error. :( I
will need to think more about this.
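
To make "conversion deadlock" concrete, here is a small self-contained C model. It is only an illustration: the thread does not say which modes were involved in my test, and this does not call libdlm. It uses the standard DLM mode compatibility matrix and flags the case where two holders each request a mode that conflicts with what the other already holds, so neither conversion can ever be granted. The names `compat`, `conversion_deadlock`, and `conversion_deadlock_examples` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Lock modes: null, concurrent read/write, protected read/write, exclusive. */
enum mode { NL, CR, CW, PR, PW, EX };

/* Standard DLM mode compatibility: compat[held][requested]. */
static const bool compat[6][6] = {
    /*          NL CR CW PR PW EX */
    /* NL */ {  1, 1, 1, 1, 1, 1 },
    /* CR */ {  1, 1, 1, 1, 1, 0 },
    /* CW */ {  1, 1, 1, 0, 0, 0 },
    /* PR */ {  1, 1, 0, 1, 0, 0 },
    /* PW */ {  1, 1, 0, 0, 0, 0 },
    /* EX */ {  1, 0, 0, 0, 0, 0 },
};

/*
 * Holder 1 has g1 granted and requests r1; holder 2 has g2 and requests r2.
 * The pair is stuck when each request conflicts with the other's granted
 * mode: neither conversion can proceed until the other holder backs off.
 */
bool conversion_deadlock(enum mode g1, enum mode r1,
                         enum mode g2, enum mode r2)
{
    return !compat[g2][r1] && !compat[g1][r2];
}

/* Usage: two PR holders both asking for EX is the textbook case. */
void conversion_deadlock_examples(void)
{
    assert(conversion_deadlock(PR, EX, PR, EX));
    /* CR is compatible with PW, so one of these conversions can be granted. */
    assert(!conversion_deadlock(CR, PW, CR, PW));
}
```

In this model, one way out is for one holder to down-convert (for example to NL) and retry, which is what makes the other request grantable.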

Thanks a lot for the pointers!


Cheers,
JM
--
***@gmail.com