Re: SSD - TRIM command


 



yeah =)
a question...
if I send a TRIM to a sector and then read it back, what do I get?
all zeros (0x00000000000000000000000000000000000)?
if yes, we could translate TRIM into a WRITE of zeros on devices without
TRIM (hard disks), just to get the same READ information
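The translation idea could be sketched like this (a toy model in Python, not kernel code; the `Device` class, `emulated_trim` and the all-zeros read-back are assumptions for illustration — real SSDs only guarantee zero reads after TRIM when they report deterministic "read zero after TRIM"):

```python
SECTOR_SIZE = 512

class Device:
    """Toy block device: a dict of LBA -> bytes standing in for the media."""
    def __init__(self, supports_trim):
        self.supports_trim = supports_trim
        self.sectors = {}

    def write(self, lba, data):
        self.sectors[lba] = data

    def read(self, lba):
        # Unwritten (or trimmed) sectors read back as all zeros here;
        # real SSDs only promise this with deterministic
        # read-zero-after-TRIM behaviour.
        return self.sectors.get(lba, b"\x00" * SECTOR_SIZE)

    def trim(self, lba):
        # Real TRIM: the device may simply drop the mapping.
        self.sectors.pop(lba, None)

def emulated_trim(dev, lba):
    """Send a real TRIM if the device supports it; otherwise write
    zeros so a later READ returns the same data either way."""
    if dev.supports_trim:
        dev.trim(lba)
    else:
        dev.write(lba, b"\x00" * SECTOR_SIZE)

ssd = Device(supports_trim=True)
hdd = Device(supports_trim=False)
for dev in (ssd, hdd):
    dev.write(7, b"\xff" * SECTOR_SIZE)
    emulated_trim(dev, 7)

# Both devices now read back identical (all-zero) data for sector 7.
assert ssd.read(7) == hdd.read(7) == b"\x00" * SECTOR_SIZE
```

in the kernel this would of course live wherever discard requests are handled; the point is only that the zero-write fallback makes reads deterministic across device types.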

2011/2/9 Piergiorgio Sartor <piergiorgio.sartor@xxxxxxxx>:
>> it's just a discussion, right? no implementation yet, right?
>
> Of course...
>
>> what I think...
>> if the device accepts TRIM, we can use TRIM.
>> if not, we must translate TRIM into something similar (maybe several
>> WRITEs?), so that when we READ from the disk we get the same information
>
> TRIM is not about writing at all. TRIM tells the
> device that the addressed block is no longer used,
> so it (the SSD) can do whatever it wants with it.
>
> The only software layer with the same "knowledge"
> is the filesystem; the other layers have no
> decisional power over block allocation.
> Except for metadata, of course.
>
> So, IMHO, a software TRIM can only be in the FS.
>
> bye,
>
> pg
>
>> the translation could be done by the kernel (not md), maybe as options in
>> libata, the nbd device...
>> another option is to do it in md, with an internal (md) TRIM translation function
>>
>> who sends the TRIM?
>> internal md information: md can generate it (if necessary, maybe it's
>> not...) for parity disks (not data disks)
>> filesystem or another upper-layer program (a database with direct device
>> access): we could accept TRIM from the filesystem/database and send it to
>> the disks/mirrors, translating it when necessary (internal or kernel
>> translation function)
>>
>>
>> 2011/2/9 Piergiorgio Sartor <piergiorgio.sartor@xxxxxxxx>:
>> > On Wed, Feb 09, 2011 at 04:30:15PM -0200, Roberto Spadim wrote:
>> >> nice =)
>> >> but note that the parity block is RAID information, not filesystem information
>> >> for RAID we could issue TRIM when possible (like swap)
>> >> and handle a TRIM received from the filesystem by sending it to all
>> >> disks (if it's a RAID1 with mirrors, we should send it to all mirrors)
>> >
>> > To all disk also in case of RAID-5?
>> >
>> > What if the TRIM belongs only to a single SSD block
>> > belonging to a single chunk of a stripe?
>> > That is, a *single* SSD of the RAID-5.
>> >
>> > Should md re-read the block and re-write (not TRIM)
>> > the parity?
>> >
>> > I think anything that has to do with checking &
>> > repairing must be carefully considered...
>> >
>> > bye,
>> >
>> > pg
>> >
>> >> I don't know exactly what TRIM does, but I think of it as a very big
>> >> write of a fixed pattern. for example:
>> >> set sector1='00000000000000000000000000000000000000000000000000'
>> >> could be replaced by:
>> >> trim sector1
>> >> it's faster over the SATA link, and it's useful information for the
>> >> drive: it can record a single flag for the sector and know that the
>> >> whole sector is zero. a read can then be served from internal memory
>> >> (without touching the media), and a later write just stores the new
>> >> data. but that's an internal function of the hard disk/SSD, not a
>> >> problem for md raid... md raid just needs to know how to use and
>> >> optimize it =] )
>> >>
>> >> 2011/2/9 Piergiorgio Sartor <piergiorgio.sartor@xxxxxxxx>:
>> >> >> ext4 sends TRIM commands to the device (disk/md raid/nbd)
>> >> >> kernel swap sends these commands (when possible) to the device too
>> >> >> for the internal RAID5 parity disk this could be done by md; for data
>> >> >> disks it should be done by ext4
>> >> >
>> >> > That's an interesting point.
>> >> >
>> >> > On which basis should a parity "block" get a TRIM?
>> >> >
>> >> > If you ask me, I think the complete TRIM story is, at
>> >> > best, a temporary patch.
>> >> >
>> >> > IMHO the wear levelling should be handled by the filesystem
>> >> > and, with awareness of this, by the underlying device drivers.
>> >> > The reason is that the FS knows best what's going on with the
>> >> > blocks and what will happen.
>> >> >
>> >> > bye,
>> >> >
>> >> > pg
>> >> >
>> >> >>
>> >> >> the other question... about resync writing only what is different:
>> >> >> this is very good, since write and read speeds can differ on an SSD
>> >> >> (HDs don't have this 'problem')
>> >> >> but I'm sure that writing only the diff is better than writing everything
>> >> >> (SSD life will be longer; for HDs maybe... I think it will be longer too)
>> >> >>
>> >> >>
>> >> >> 2011/2/9 Eric D. Mudama <edmudama@xxxxxxxxxxxxxxxx>:
>> >> >> > On Wed, Feb  9 at 11:28, Scott E. Armitage wrote:
>> >> >> >>
>> >> >> >> Who sends this command? If md can assume that determinate mode is
>> >> >> >> always set, then RAID 1 at least would remain consistent. For RAID 5,
>> >> >> >> consistency of the parity information depends on the determinate
>> >> >> >> pattern used and the number of disks. If you used determinate
>> >> >> >> all-zero, then parity information would always be consistent, but this
>> >> >> >> is probably not preferable since every TRIM command would incur an
>> >> >> >> extra write for each bit in each page of the block.
>> >> >> >
>> >> >> > True, and there are several solutions.  Maybe track space used via
>> >> >> > some mechanism, such that when you trim you're only trimming the
>> >> >> > entire stripe width so no parity is required for the trimmed regions.
>> >> >> > Or, trust the drive's wear leveling and endurance rating, combined
>> >> >> > with SMART data, to indicate when you need to replace the device
>> >> >> > preemptive to eventual failure.
>> >> >> >
>> >> >> > It's not an unsolvable issue.  If the RAID5 used distributed parity,
>> >> >> > you could expect wear leveling to wear all the devices evenly, since
>> >> >> > on average, the # of writes to all devices will be the same.  Only a
>> >> >> > RAID4 setup would see a lopsided amount of writes to a single device.
>> >> >> >
>> >> >> > --eric
>> >> >> >
>> >> >> > --
>> >> >> > Eric D. Mudama
>> >> >> > edmudama@xxxxxxxxxxxxxxxx
>> >> >> >
>> >> >> > --
>> >> >> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> >> >> > the body of a message to majordomo@xxxxxxxxxxxxxxx
>> >> >> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Roberto Spadim
>> >> >> Spadim Technology / SPAEmpresarial
>> >> >
>> >> > --
>> >> >
>> >> > piergiorgio
>> >> >
>> >>
>> >>
>> >>
>> >
>> >
>>
>>
>>
>
>



-- 
Roberto Spadim
Spadim Technology / SPAEmpresarial

