Thanks Jeff. It makes sense now.

I did a test with DBT2, turning "full_page_writes" on and off. The
arguments were set to "-d 200 -w 1 -c 10" for a short test.

There is a roughly 7 times difference in the number of pages written.
When the option is on, 1066 pages are written;
when the option is off, 158 pages are written.

I agree with you that the name "full_page_writes" is a little bit misleading.

- Tian

On Wed, May 25, 2011 at 5:59 PM, Jeff Davis <pgsql@xxxxxxxxxxx> wrote:
> On Wed, 2011-05-04 at 00:17 -0400, Tian Luo wrote:
>> So, "nbytes" should always be multiples of XLOG_BLCKSZ, which in the
>> default case, is 8192.
>>
>> My question is, if it always writes full pages no matter
>> "full_page_writes" is on or off, what is the difference?
>
> Most I/O systems and filesystems can end up writing part of a page (in
> this case, 8192 bytes) in the event of a power failure, which is called
> a "torn page". That can cause problems for postgresql, because the page
> will be a mix of old and new data, which is corrupt.
>
> The solution is "full page writes", which means that when a data page is
> modified for the first time after a checkpoint, it logs the entire
> contents of the page (except the free space) to WAL, and can use that as
> a starting point during recovery. This results in extra WAL data for
> safety, but it's unnecessary if your filesystem + IO system guarantee
> that there will be no torn pages (and that's the only safe time to turn
> it off).
>
> So, to answer your question, the difference is that full_page_writes=off
> means less total WAL data, which means fewer 8192-byte writes in the
> long run (you have to test long enough to go through a checkpoint to see
> this difference, however). PostgreSQL will never issue write() calls
> with 17 bytes, or some other odd number, regardless of the
> full_page_writes setting.
>
> I can see how the name is slightly misleading, but it has to do with
> whether to write this extra information to WAL (where "extra
> information" happens to be "full data pages" in this case); not whether
> to write the WAL itself in full pages.
>
> Regards,
>     Jeff Davis
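
For anyone reading this thread later, the behaviour Jeff describes can be sketched as a toy model. This is only an illustration under stated assumptions, not PostgreSQL's actual code: the function names, the fixed 100-byte record size, and the way checkpoints are represented are all made up here. It also ignores the fact that the real page image omits free space. It shows why WAL volume depends on full_page_writes, while the number of bytes written is rounded up to whole XLOG_BLCKSZ blocks either way.

```python
BLCKSZ = 8192        # data page size quoted in the thread
XLOG_BLCKSZ = 8192   # WAL block size quoted in the thread (default)


def wal_bytes(modified_pages, checkpoints, full_page_writes, record_size=100):
    """Toy estimate of WAL bytes generated by a sequence of page modifications.

    modified_pages: page ids, in the order they are modified.
    checkpoints: set of indices i; a checkpoint happens just before
        modification i (hypothetical representation).
    record_size: assumed size of an ordinary WAL record (hypothetical).
    """
    touched_since_checkpoint = set()
    total = 0
    for i, page in enumerate(modified_pages):
        if i in checkpoints:
            # After a checkpoint every page counts as "not yet modified".
            touched_since_checkpoint.clear()
        total += record_size
        if full_page_writes and page not in touched_since_checkpoint:
            # First modification since the checkpoint: log the whole page too.
            total += BLCKSZ
        touched_since_checkpoint.add(page)
    return total


def wal_blocks(nbytes, block_size=XLOG_BLCKSZ):
    """Number of whole WAL blocks needed to hold nbytes (ceiling division)."""
    return -(-nbytes // block_size)


if __name__ == "__main__":
    mods = [1, 2, 1, 3, 1, 2]
    for setting in (True, False):
        nbytes = wal_bytes(mods, checkpoints={3}, full_page_writes=setting)
        print(setting, nbytes, wal_blocks(nbytes))
```

In this model the "on" case logs a full page image for five of the six modifications, because the checkpoint resets which pages count as already modified. The "off" case logs only the small records. In both cases the result is counted in whole 8192-byte blocks.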