Re: Problems with SAMBA server on CentOS 5.1 virtual xen guest with iSCSI SAN

Paolo Marini <paolom@xxxxxxxxxxxxx> · Tue, 08 Apr 2008 14:51:01 +0200

After some investigation, it seems that the problem is really related to 
samba and not to the cluster infrastructure, which is working quite well.

Here are some postings on the issue with samba, which surfaced with the 
upgrade to 3.0.25 included in the RH 5.1 update:

http://bugs.contribs.org/show_bug.cgi?id=3762
http://www.centos.org/modules/newbb/viewtopic.php?post_id=39829&topic_id=12152
https://bugzilla.redhat.com/show_bug.cgi?id=426244

What I did to solve the problem was to get the latest samba sources (3.0.28a) and rebuild the package after updating the spec file. I commented out the patches from 115 onwards, as they are already included in the samba 3.0.28a tarball.
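
The spec-file change described above could be scripted roughly as follows. This is only a sketch, not the exact procedure used: it assumes the spec declares patches as "PatchN: ..." and applies them as "%patchN ..." lines, and the function name and the sample patch file names are hypothetical. Other spellings, such as "%patch -P N", are not handled.

```python
import re

# Matches "Patch115: foo.patch" declarations and "%patch115 -p1" applications.
PATCH_LINE = re.compile(r"^\s*(?:Patch|%patch)(\d+)\b", re.IGNORECASE)


def comment_out_patches(spec_text, first_patch=115):
    """Prefix every Patch/%patch line numbered >= first_patch with '#'."""
    out = []
    for line in spec_text.splitlines():
        m = PATCH_LINE.match(line)
        if m and int(m.group(1)) >= first_patch:
            line = "#" + line
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical spec excerpt; the patch file names are made up.
    sample = (
        "Patch114: example-a.patch\n"
        "Patch115: example-b.patch\n"
        "%patch114 -p1\n"
        "%patch115 -p1\n"
    )
    print(comment_out_patches(sample), end="")
```

The Version tag and the source tarball name in the spec still have to be updated by hand before rebuilding.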

After the upgrade, none of the problems I described, or those reported in the links above, happened again.

Hope this helps other folks solve the same problem, and also convinces RH people to upgrade the samba package.

Paolo

John Ruemker wrote:
Paolo Marini wrote:
I have implemented a cluster of a few xen guests with a shared GFS 
filesystem residing on a SAN built with openfiler to support iSCSI 
storage.

The physical servers are 3 machines implementing a physical cluster, each 
one equipped with quad xeon and 4 GB RAM. The network interface is 
based on channel bonding with LACP (on the physical hosts), giving an 
aggregate of 2 gigabit ethernet per physical host; the switch 
supports LACP and has been configured accordingly.

The virtual servers are xen nodes running on top of the physical servers, 
with shared storage on iSCSI and GFS.

The networking is based on a cluster private network (for cluster 
heartbeat and cluster communication + iSCSI) and an ethernet alias 
for the LAN to which the users are connected.

One of the cluster xen nodes is used for implementing a samba PDC (no 
failover of the service, plain samba, single samba server on the LAN) 
plus ldap server; samba works with ldap for user authentication. 
Storage for the samba server is on the SAN.

I continue to receive complaints from my users because copying files 
sometimes generates errors, plus problems related to office usage 
(we still use the old Office 97 on some machines). The samba 
configuration is more or less the same as the one that worked 
correctly on the previous physical machine, on which those problems 
were not present.

The problems generate these log entries in /var/log/samba/smbd:

[2008/04/02 19:00:50, 0] lib/util_sock.c:get_peer_addr(1232)
 getpeername failed. Error was Transport endpoint is not connected
[2008/04/02 19:05:32, 0] lib/util_sock.c:get_peer_addr(1232)
 getpeername failed. Error was Transport endpoint is not connected
[2008/04/02 19:05:32, 0] lib/util_sock.c:get_peer_addr(1232)
 getpeername failed. Error was Transport endpoint is not connected

And in the client machine log, also in /var/log/samba:

[2008/04/02 19:04:34, 0] lib/util_sock.c:read_data(534)
 read_data: read failure for 4 bytes to client 192.168.13.240. Error = Connection reset by peer
[2008/04/02 19:04:34, 1] smbd/service.c:close_cnum(1230)
 amhwq53p (192.168.13.240) closed connection to service tmp
[2008/04/02 19:04:34, 1] smbd/service.c:close_cnum(1230)
 amhwq53p (192.168.13.240) closed connection to service stock
[2008/04/02 19:04:34, 0] lib/util_sock.c:write_data(562)
 write_data: write failure in writing to client 192.168.13.240. Error Broken pipe
[2008/04/02 19:04:34, 0] lib/util_sock.c:send_smb(769)
 Error writing 75 bytes to client. -1. (Broken pipe)
[2008/04/02 19:04:34, 1] smbd/service.c:make_connection_snum(1033)

They look like errors caused by poor connectivity or a problem 
in the network; however, these problems are new and were never seen 
before switching to the clustered architecture. Also, no problems have 
been found so far on the other xen nodes serving the same GFS 
filesystem (different dirs!) for NFS or other services.
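
To see whether errors like the ones above concentrate in particular code paths, the log entries can be tallied by source file and function. The following is a minimal sketch, assuming the header format shown in the excerpts ("[date time, level] file:function(line)"); the function name tally_errors is hypothetical, and only entries at debug level 0 are counted by default.

```python
import re
from collections import Counter

# Header lines look like:
# [2008/04/02 19:00:50, 0] lib/util_sock.c:get_peer_addr(1232)
HEADER = re.compile(r"^\[(\S+ \S+),\s*(\d+)\]\s+(\S+):(\w+)\((\d+)\)")


def tally_errors(log_text, max_level=0):
    """Count entries at debug level <= max_level, keyed by (file, function)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = HEADER.match(line)
        if m and int(m.group(2)) <= max_level:
            counts[(m.group(3), m.group(4))] += 1
    return counts


if __name__ == "__main__":
    # The log path is an assumption; adjust it to the local setup.
    with open("/var/log/samba/smbd") as fh:
        for (src, func), n in tally_errors(fh.read()).most_common():
            print(f"{n:6d}  {src}:{func}")
```

Running the same tally on the per-client log files makes it easier to compare which clients hit the read and write failures most often.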

Also putting the option

posix locking = no

in the smb.conf file did not help.

Any ideas from someone else facing the same problems?

thanks, Paolo

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster 
Those errors are explained in

    http://kbase.redhat.com/faq/FAQ_45_5274.shtm

John
