On Fri, Aug 26, 2005 at 10:27:04AM -0700, Hua Zhong wrote:
> 1. There is no lock timeout implemented (at least from an API point of
> view). Is this something you plan to add? Is there any draft of the
> design?

Yes, we will probably add timeouts (or take patches adding them). We
haven't thought much about the API, but one starting point would be the
first version of the dlm, which had timeouts:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/?cvsroot=cluster

> 2. With no lock timeout in DLM, how do upper-layer applications
> (like GFS) implement such a lock timeout (or does GFS also have no
> timeout)? One thought is that it could have its own timer and, when it
> expires, just cancel the pending lock.

GFS does not use timeouts.

> 3. How much does DLM do wrt deadlock detection? Especially when it
> doesn't have a timeout. Is it solely the application's responsibility to
> detect it?

GFS avoids deadlock in general by ordering its lock requests, but it
cannot avoid conversion deadlock. GFS depends on a special feature of
the dlm to resolve conversion deadlocks: the flag DLM_LKF_CONVDEADLK
permits the dlm to detect a conversion deadlock and resolve it by
demoting one of the locks involved. GFS is notified when this happens
by the DLM_SBF_DEMOTED flag that is returned with the lock result.

> 4. This is not a technical question. I'm trying to convince our
> management to use DLM, so I'd like to know how stable it is and on
> what kind of scale the Red Hat clustering solution is being used by the
> enterprise. Is it stable enough in production (data like average
> uptime, etc)?

The dlm on linux-kernel (and in -mm releases) is new code and is still
somewhat in a development stage; it has not been thoroughly tested yet.
The first version of the dlm (url above) is currently a part of the
RHEL4 gfs/cluster product.

Dave
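
The conversion-deadlock handling described in the answer to question 3 can be illustrated with a small toy model. This is only a sketch and not the dlm's implementation: the `Lock` class, the helper functions, the reduced three-mode compatibility table, and the numeric flag values are all hypothetical. Only the flag names DLM_LKF_CONVDEADLK and DLM_SBF_DEMOTED come from the message. The model also assumes that "demoting" means dropping the granted mode to NL while the conversion request stays queued.

```python
from dataclasses import dataclass
from typing import Optional

# Three lock modes only, to keep the model small (assumption for this sketch).
NL, PR, EX = "NL", "PR", "EX"

# COMPATIBLE[m] is the set of modes another holder may have while m is granted.
COMPATIBLE = {NL: {NL, PR, EX}, PR: {NL, PR}, EX: {NL}}

# Flag names are from the message; these values are placeholders for the
# toy model and are NOT copied from the dlm headers.
DLM_LKF_CONVDEADLK = 0x1
DLM_SBF_DEMOTED = 0x1


@dataclass(eq=False)
class Lock:
    owner: str
    granted: str                     # mode currently held
    requested: Optional[str] = None  # pending conversion target, if any
    flags: int = 0                   # request flags (DLM_LKF_*)
    sbflags: int = 0                 # status flags reported to the owner


def blocked_by(lock, locks):
    """Locks whose granted mode conflicts with lock's pending conversion."""
    return [o for o in locks
            if o is not lock and o.granted not in COMPATIBLE[lock.requested]]


def grant_pending(locks):
    """Grant every conversion that no longer conflicts; repeat until stable."""
    progress = True
    while progress:
        progress = False
        for lk in locks:
            if lk.requested and not blocked_by(lk, locks):
                lk.granted, lk.requested = lk.requested, None
                progress = True


def resolve_conversion_deadlock(locks):
    """If two converting locks block each other, demote one that permits it."""
    for lk in locks:
        if not lk.requested:
            continue
        for other in blocked_by(lk, locks):
            if other.requested and lk in blocked_by(other, locks):
                # Mutual blocking between two converters: conversion deadlock.
                victim = next((c for c in (lk, other)
                               if c.flags & DLM_LKF_CONVDEADLK), None)
                if victim is None:
                    return None          # nobody opted in; deadlock remains
                victim.granted = NL      # assumed meaning of "demoting"
                victim.sbflags |= DLM_SBF_DEMOTED
                return victim
    return None


# Two nodes hold PR on the same resource and both ask to convert to EX.
a = Lock("node-a", PR, requested=EX, flags=DLM_LKF_CONVDEADLK)
b = Lock("node-b", PR, requested=EX)
locks = [a, b]

grant_pending(locks)                        # no progress: each blocks the other
victim = resolve_conversion_deadlock(locks) # node-a is demoted to NL
grant_pending(locks)                        # node-b's conversion is now granted
# node-a holds NL with DLM_SBF_DEMOTED set and is still waiting for EX;
# node-b holds EX.
```

In this model the demoted owner no longer holds PR while it waits, so anything it cached under the old lock would need to be revalidated once its conversion is finally granted. That is the reason a caller would check for the demoted flag in the lock result.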