Re: Questions about Exofs

On Tue, May 15, 2012 at 6:06 PM, Boaz Harrosh <bharrosh@xxxxxxxxxxx> wrote:
> On 05/15/2012 05:22 PM, Idan Kedar wrote:
>
>> On Tue, May 15, 2012 at 4:42 PM, Boaz Harrosh <bharrosh@xxxxxxxxxxx> wrote:
> <snip>
>
>>> This should not be needed: exofs depends on ore, which depends on raid456
>>
>
>> True, but when setting up the environment my goal was to have a
>> working environment. If a bug were introduced into this dependency
>> system and the module weren't loaded, I would have to reluctantly spend
>> time finding the problem, when I can actually live with such a bug if
>> the workaround is simply loading the module manually. So as a policy,
>> I always explicitly load the modules I need when setting up a pNFS
>> environment.
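
A minimal sketch of the "check explicitly" policy described above. The helper name `missing_modules` is hypothetical; it only reports which of the named modules have no entry in a file laid out like /proc/modules. Anything it reports could then be loaded by hand with modprobe.

```shell
#!/bin/sh
# missing_modules FILE NAME...: print each NAME that has no entry in FILE.
# FILE is expected to use the /proc/modules layout (module name in field 1).
# The function name and interface are illustrative, not part of any tool.
missing_modules() {
    list=$1; shift
    for m in "$@"; do
        # /proc/modules lists dashes in module names as underscores
        name=$(printf '%s' "$m" | tr '-' '_')
        if ! awk -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$list"; then
            printf '%s\n' "$m"
        fi
    done
}

# Report which of the modules named in this thread are not loaded yet.
if [ -r /proc/modules ]; then
    missing_modules /proc/modules exofs raid456
fi
```

An empty result means both modules are already loaded, whether by the dependency chain or by hand.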
>
>
> I'm sorry about that, that was my mess. My intention was to make all these
> dependencies totally transparent: an implementation detail du jour, not
> user visible.
>
> The reason I had it in the script for a while was that raid456.ko had a
> bug in UML, and probing it manually gave a 20% better chance of success.
> (The tree below has a patch that fixes that on UML.)
>
> Please note that some versions of exofs need it and some don't. I have
> removed it from my scripts, now.
>
> <snip>
>
>>> Are you sure you have an exofs FS at /mnt/pnfs? Please do a
>>> # df -h /mnt/pnfs
>>> I want to see.
>
>>
>
>> # df -hT /mnt/pnfs/
>> Filesystem    Type    Size  Used Avail Use% Mounted on
>> /dev/osd0    exofs     55G   12G   44G  21% /mnt/pnfs
>>
>
>
> OK, I understand that now: the weird single-device case. I don't
> promise it will continue to work in the future. As a rule, all devices
> must have a network-unique OSD_NAME.
No problem, I will change my scripts accordingly.
>
>
>>>
>>> Does it actually work? OK you know maybe it does. I can see this now.
>>> If you have a single device it might work, I didn't realize this. I thought
>>> the mkfs.exofs would not let you.
>>>
>>> For sure if you have more than one device (pnfs right?) then it will not
>>> let you, because the devices have strict order in the device table. Device
>>> names are not reliable and may change from login to login. You need some
>>> kind of device-id
>> Indeed it is a pNFS setup. And indeed I've had trouble when using more
>> than one OSD.
>
>
> I now use a better script system for the cluster case, which I never
> pushed, that makes all this easier.
>
> For one, not setting osdname= at --format would explain the above.
>
>>
>> By the way, several weeks ago I tried setting up a RAID 5 environment
>> with 8 OSDs, 1 mirror and RAID nesting. I then tried cloning and
>> compiling the kernel tree over this pNFS-OSD-RAID. The result was that
>> otgtd died, and I don't know why. It didn't dump core anywhere I could
>> find and the only "log" it has - stdout - didn't give any useful info.
>> I was going to inquire about this in a couple of weeks when I need to
>> get this environment working, but since this issue came up, maybe we
>> can somehow resolve it sooner.
>
>
> 8 OSDs with a mirror? What was the mkfs.exofs command line you used?
Something along the lines of
# LD_LIBRARY_PATH=lib ./usr/mkfs.exofs --pid=0x10000 --format
--mirrors=1 --group_width=2 --group_depth=2 --dev=/dev/osd0
--osdname=$(uuid) --dev=/dev/osd1 --osdname=$(uuid) --dev=/dev/osd2
--osdname=$(uuid) ...

I don't remember exactly at the moment, but I will bump this thread
when I start using RAID again.
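
A rough sketch of how a command line like the one above could be assembled for N devices, giving each device its own --osdname. It only prints the command and does not run mkfs.exofs. The flags and their order are copied from the command above and not checked against mkfs.exofs itself; `gen_name` and `build_mkfs_cmd` are hypothetical helpers, and `gen_name` stands in for the $(uuid) call.

```shell
#!/bin/sh
# gen_name INDEX: print a unique name for one device.
# Hypothetical stand-in for $(uuid); uses uuidgen when it is installed.
gen_name() {
    if command -v uuidgen >/dev/null 2>&1; then
        uuidgen
    else
        # Fallback: not a real UUID, only distinct per index for a dry run
        echo "osd-$(date +%s)-$$-$1"
    fi
}

# build_mkfs_cmd N: print (do not execute) a mkfs.exofs command line
# covering /dev/osd0 .. /dev/osd(N-1), one --osdname per --dev.
build_mkfs_cmd() {
    ndev=$1
    cmd="./usr/mkfs.exofs --pid=0x10000 --format --mirrors=1 --group_width=2 --group_depth=2"
    i=0
    while [ "$i" -lt "$ndev" ]; do
        cmd="$cmd --dev=/dev/osd$i --osdname=$(gen_name "$i")"
        i=$((i + 1))
    done
    printf '%s\n' "$cmd"
}

# Dry run for the 8-device case discussed here.
build_mkfs_cmd 8
```

Printing rather than executing makes it easy to paste the exact command line into a follow-up mail.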
>
> And did you use one otgtd with 8 targets, or 8 targets (8 IP addresses)
> with one target each, or a combination?
one target with 8 LUNs
>
> What is the otgtd platform? What file system? What HW and HD environment?
osc-osd over ext4, 64 bit VirtualBox VM over x86_64.
>
> And yes otgtd has some instabilities.
>
> There are two I can think of:
> * Over xfs the --format command crashes the otgtd (aborted exit, no
>  crash dump). Debugging welcome.
>
> * When lots of pnfs clients do heavy writing to the same otgtd, it
>  times-out and disconnects.
It was a single client performing git-clone of the kernel tree.
>  At Panasas we have a watch-dog that reloads it in a loop.
>  I have only seen this on FreeBSD, in Linux it never happened
>  to me.
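
The watch-dog idea mentioned above could look roughly like the sketch below. This is not the Panasas watch-dog, whose details are not given in this thread; `watchdog` and `WATCHDOG_DELAY` are hypothetical names, and the restart cap exists only so the loop terminates.

```shell
#!/bin/sh
# watchdog MAX CMD...: run CMD, and start it again each time it exits,
# for at most MAX runs. Prints the number of runs on stdout when done.
# Hypothetical helper; WATCHDOG_DELAY (seconds between runs) defaults to 1.
watchdog() {
    max=$1; shift
    n=0
    while [ "$n" -lt "$max" ]; do
        "$@"
        status=$?
        n=$((n + 1))
        echo "run $n exited with status $status" >&2
        sleep "${WATCHDOG_DELAY:-1}"
    done
    echo "$n"
}

# Demo with a command that exits immediately; replace "false" with
# whatever command starts otgtd in your setup.
WATCHDOG_DELAY=0 watchdog 3 false
```

The exit status logged on each run is the only trace such a loop leaves, which is at least more than the empty stdout described in this thread.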
>
> Please give me more details on what you did before it exited
> like that.
Nothing special, just git-clone. At some point it hung (at a
different place every time), and when I investigated a bit I saw that
otgtd was dead.
>
>
> Anyway, I pushed the tree I tested with to:
>        git://git.open-osd.org/linux-open-osd.git
>
> Check out the *merge_and_compile-3.3* branch. But in principle they are the
> same:
>        fs/exofs                - Added autologin support
>        fs/nfs/objlayout        - Added autologin support
>        fs/nfsd                 - Same
>        fs/nfs                  - Few fixes that are in benny's tree are not in linux-open-osd
Thanks, I will try it soon.
>
> So it should all be the same. For a proper cluster setup you will probably
> need my do-ect scripts, which take a cluster descriptor file and do
> generic loops on everything.
Please note that I didn't try a cluster setup, just a single DS with 8
LUNs, single MDS, and single pNFS client, all 3 different VMs on the
same host.
>
> Thanks
> Boaz



-- 
Idan Kedar
Tonian
