I’m currently having a look into the debugging possibilities of my SLES11SP3 setup with pacemaker and ocfs2. The problem I ran into is that the manpage of ocfs2 mentions in the “DLM Debugging” section the sysfs path “/sys/kernel/debug/o2dlm/<domain>/dlm_state” as useful, but the folder “o2dlm” is missing on my nodes.
Furthermore, when using debugfs.ocfs2, the “dlm_locks” command shows the following output (the filesystem is mounted!):
debugfs: dlm_locks <M0000000000000000ac......
Could not open debug state for "C63DB913D6......".
Perhaps that OCFS2 file system is not mounted?
To me this looks somehow related, though I’m not sure.
Does anyone have an idea how to enable this o2dlm in sysfs?
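For completeness: debugfs itself is mounted on my nodes (the o2dlm folder would live under debugfs, even though the path starts with /sys). The check I used, assuming the standard mount point:

    mount | grep debugfs
    # if it were missing, it could be mounted with:
    mount -t debugfs none /sys/kernel/debug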
[QUOTE=jhaemmer]I’m currently having a look into the debugging possibilities of my SLES11SP3 setup with pacemaker and ocfs2. The problem I ran into is that the manpage of ocfs2 mentions in the “DLM Debugging” section the sysfs path “/sys/kernel/debug/o2dlm/<domain>/dlm_state” as useful, but the folder “o2dlm” is missing on my nodes.[/QUOTE]
Shouldn’t that be “/sys/kernel/debug/o2dlm/<domain>/dlm_state”?
But the root cause could be that you’re not using o2dlm at all - I assume you’ve configured the cluster glue to use Pacemaker’s DLM, the typical scenario when running OCFS2 together with Pacemaker.
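A quick way to check which stack is actually in use (module and path names below are the usual defaults; adjust if your kernel differs):

    lsmod | egrep 'ocfs2_dlm|^dlm '           # o2dlm vs. fs/dlm kernel module
    ls /sys/kernel/debug/o2dlm 2>/dev/null    # only exists with the o2cb stack
    ls /sys/kernel/debug/dlm 2>/dev/null      # fs/dlm, used with Pacemaker/dlm_controld

If only the last directory exists, you’re on Pacemaker’s DLM and the o2dlm folder will never appear.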
Thanks for this hint. Somehow I didn’t realize that these are separate implementations. I’ll stick with my current “/sys/kernel/debug/dlm” for debugging then.
Are you aware of any documentation about the structure/data shown in the files of this folder and its subfolders (*_locks, *_all, *_waiters, …)? I was able to identify some fields (pid, nodeid, lockres), but have no idea what the rest means.
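For reference, what I’m looking at (one set of files appears per lockspace; the exact set may vary with the kernel version):

    ls /sys/kernel/debug/dlm/
    # per lockspace: <name>, <name>_all, <name>_locks, <name>_waiters
    # (and, depending on kernel version, <name>_toss)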
No, unfortunately I’ve not seen any documentation yet - but I’ll ask around to see if I can find any pointers. (But don’t hold your breath…) Probably most of the details are only documented in “C”.
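If you do end up reading the “C”, a starting point might be fs/dlm/debug_fs.c in the kernel tree, where the seq_file handlers that print those lines live. A quick way to find the format strings, assuming the kernel-source package is installed under /usr/src/linux:

    grep -n 'seq_printf' /usr/src/linux/fs/dlm/debug_fs.c | head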
That’s what I feared…
But maybe there’s another way to achieve what I want.
I’m looking for a way to query the ocfs2/dlm for all current locks, including the nodes/PIDs which hold them and, if applicable, the queue of nodes lined up for the next available lock.
Where I’m currently stuck is that the locking information is spread across all nodes of the cluster (and across origins: debugfs / sysfs / …), and it seems I have to collect it by hand. What I naively expected was that there is one place in the debug output / debugfs I could simply query for this information. I mean, it has to be available somewhere for all nodes…
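What I do by hand at the moment looks roughly like this (a rough sketch only - the node names are placeholders, and it assumes passwordless ssh between the nodes):

    # Collect the per-lockspace lock dumps from every node.
    for node in node1 node2 node3; do
        # grep -H prefixes each line with the lockspace file it came from;
        # the sed call tags it with the node name, so the dumps can be merged.
        ssh "$node" 'grep -H . /sys/kernel/debug/dlm/*_locks' | sed "s/^/$node: /"
    done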
On Thu, 26 Jun 2014 07:14:02 +0000, jhaemmer wrote:
> Are you aware of any documentation about the structure/data shown in
> the files of this folder and its subfolders (*_locks, *_all, *_waiters,
> …)? I was able to identify some fields (pid, nodeid, lockres), but have
> no idea what the rest means.
Several of the HA developers hang out on this mailing list: http://lists.linux-ha.org/mailman/listinfo/linux-ha
[QUOTE=jhaemmer;22219]I’m looking for a way to query the ocfs2/dlm for all current locks, including the nodes/PIDs which hold them and, if applicable, the queue of nodes lined up for the next available lock.
Where I’m currently stuck is that the locking information is spread across all nodes of the cluster (and across origins: debugfs / sysfs / …), and it seems I have to collect it by hand. What I naively expected was that there is one place in the debug output / debugfs I could simply query for this information. I mean, it has to be available somewhere for all nodes…
Any hints on where to start in this case?[/QUOTE]
I’ve received a pointer to the relevant mailing list (http://lists.linux-ha.org/mailman/listinfo/linux-ha); you might try placing your question there, because that’s where the developers hang out. Another path would be to open a service request with SUSE, if you have a matching subscription. There are very HAE-knowledgeable engineers in the SUSE team who ought to be able to help with such specific questions.