On Thu, Aug 23, 2007 at 04:20:41PM -0500, Eric Sandeen wrote:
> I did a quick experiment, and it seems that by and large, gcc in f8 (gcc
> version 4.1.2 20070821 (Red Hat 4.1.2-19)) is using slightly more stack
> when building a kernel than is gcc on rhel5 (gcc version 4.1.1 20070105
> (Red Hat 4.1.1-52)) There are a few functions using less, too, though.
>
> After much checkstack output frobbing:
>
> function [module] : old : new : delta
> --------------------------------------
> acpi_add_single_object [vmlinux]: 156 160 : 4
> acpi_cpufreq_cpu_init [acpi-cpufreq]: 136 140 : 4
> adi_connect [adi]: 144 148 : 4
> aes_decrypt [aes]: 140 144 : 4
> ahc_pci_config [aic7xxx]: 152 148 : -4
> ata_do_eh [libata]: 292 304 : 12
> aty128_probe [aty128fb]: 272 276 : 4
> balance_internal [reiserfs]: 148 160 : 12
> bond_arp_send_all [bonding]: 132 136 : 4
> capidtmf_recv_block [divacapi]: 188 192 : 4
> cciss_update_non_disk_devices [cciss]: 548 552 : 4
> cfb_copyarea [vmlinux]: 148 144 : -4
> check_balance [reiserfs]: 268 272 : 4
> copy_to_user_tmpl [vmlinux]: 400 404 : 4
> ctnetlink_new_expect [nf_conntrack_netlink]: 232 236 : 4
> decode_rs16 [reed_solomon]: 220 224 : 4
> diNewExt [jfs]: 112 108 : -4
> diva_add_card [divacapi]: 144 156 : 12
> do_balance [reiserfs]: 308 320 : 12
> do_cciss_request [cciss]: 544 548 : 4
> do_con_write [vmlinux]: 160 156 : -4
> do_tx [eni]: 120 124 : 4
> ea_dealloc_unstuffed [gfs2]: 116 120 : 4
> ehci_urb_enqueue [ehci-hcd]: 196 212 : 16
> ext4_expand_extra_isize_ea [ext4dev]: 128 132 : 4
> ext4_ext_insert_extent [ext4dev]: 144 148 : 4
> ext4_ext_remove_space [ext4dev]: 120 124 : 4
> facility_req [divacapi]: 396 392 : -4
> __fat_readdir [fat]: 252 256 : 4
> fat_search_long [fat]: 408 412 : 4
> fetch_frame [cpia]: 184 196 : 12
> ftdi_elan_status_work [ftdi-elan]: 252 248 : -4
> gdth_detect [gdth]: 440 436 : -4
> get_far_parent [reiserfs]: 140 152 : 12
> hfcpci_interrupt [hisax]: 356 360 : 4
> hptiop_probe [hptiop]: 152 156 : 4
> huft_build [vmlinux]: 152 144 : -8
> ieee80211_master_start_xmit [mac80211]: 176 180 : 4
> ieee80211_sta_work [mac80211]: 464 476 : 12
> i_ipmi_request [ipmi_msghandler]: 108 100 : -8
> inftl_scan_bbt [diskonchip]: 196 200 : 4
> ip_route_input [vmlinux]: 196 192 : -4
> ip_setsockopt [vmlinux]: 448 452 : 4
> isdn_tty_write [isdn]: 160 164 : 4
> ivtv_process_vbi_data [ivtv]: 116 120 : 4
> jfs_readdir [jfs]: 348 340 : -8
> key_schedule [cast5]: 608 596 : -12
> matroxfb_dh_set_par [matroxfb_crtc2]: 116 120 : 4
> matroxfb_ioctl [matroxfb_base]: 136 140 : 4
> mmc_blk_issue_rq [mmc_block]: 360 356 : -4
> module_verify_signature [vmlinux]: 124 120 : -4
> myri10ge_xmit [myri10ge]: 112 116 : 4
> nv_tx_timeout [forcedeth]: 156 144 : -12
> ocfs2_write_cluster_by_desc [ocfs2]: 112 108 : -4
> os_scsi_tape_open [osst]: 180 184 : 4
> paging32_page_fault [kvm]: 116 108 : -8
> parse_audio_unit [snd-usb-audio]: 132 144 : 12
> patch_cmi9880 [snd-hda-intel]: 108 124 : 16
> pbus_size_mem [vmlinux]: 124 128 : 4
> pkt_open [pktcdvd]: 556 560 : 4
> prism2_plx_probe [hostap_plx]: 116 136 : 20
> qla1280_nvram_config [qla1280]: 128 160 : 32
> r300_do_cp_cmdbuf [radeon]: 580 572 : -8
> radeon_check_modes [radeonfb]: 204 212 : 8
> radeon_get_pllinfo [radeonfb]: 176 192 : 16
> s2io_add_isr [s2io]: 176 184 : 8
> savage_dispatch_draw [savage]: 304 308 : 4
> sd_revalidate_disk [sd_mod]: 156 116 : -40
> search_by_key [reiserfs]: 256 272 : 16
> send_s870 [atp870u]: 100 104 : 4
> service_interrupt [atmel]: 188 192 : 4
> _snd_emu10k1_audigy_init_efx [snd-emu10k1]: 160 164 : 4
> snd_emu10k1_init_efx [snd-emu10k1]: 136 172 : 36
> snd_intel8x0_probe [snd-intel8x0]: 156 148 : -8
> snd_mixart_hw_params [snd-mixart]: 268 272 : 4
> snd_pcm_common_ioctl1 [snd-pcm]: 272 276 : 4
> snd_pcm_hw_refine [snd-pcm]: 160 168 : 8
> snd_pcm_oss_change_params [snd-pcm-oss]: 168 176 : 8
> snd_usb_create_midi_interface [snd-usb-lib]: 112 116 : 4
> sr_probe [sr_mod]: 140 144 : 4
> start_preview [saa7134]: 212 232 : 20
> st_ioctl [st]: 128 144 : 16
> stv680_newframe [stv680]: 176 192 : 16
> svcauth_gss_accept [auth_rpcgss]: 216 208 : -8
> sys_copyarea [syscopyarea]: 144 140 : -4
> sys_init_module [vmlinux]: 232 236 : 4
> tcp_sendmsg [vmlinux]: 128 120 : -8
> tcp_v4_do_rcv [vmlinux]: 104 108 : 4
> tulip_init_one [tulip]: 136 140 : 4
> tveeprom_hauppauge_analog [tveeprom]: 156 164 : 8
> txCommit [jfs]: 168 180 : 12
> udf_get_block [udf]: 412 416 : 4
> udf_get_filename [udf]: 584 580 : -4
> write_filehandle [nfsd]: 208 204 : -4
> xfs_alloc_delrec [xfs]: 132 144 : 12
> xfs_bmap_btalloc [xfs]: 172 180 : 8
> xfs_bmbt_delrec [xfs]: 212 216 : 4
> xfs_bmbt_newroot [xfs]: 132 140 : 8
> xfs_da_do_buf [xfs]: 188 192 : 4
> xfs_inobt_delrec [xfs]: 148 152 : 4
> xfs_inobt_newroot [xfs]: 172 176 : 4
> xtSearch [jfs]: 104 116 : 12
> xtUpdate [jfs]: 404 424 : 20
> zd1201_usbrx [zd1201]: 100 108 : 8
> ------------------------
> Functions increased: 79
> Functions decreased: 25
> Net change: +432 bytes
>
>
> that starts with checkstack output, so it's only checking functions
> using > 100 bytes of stack, and then the above is only showing functions
> with changed stack usage... but from a spot-check, smaller stack-users
> are affected as well.
>
> With 4KSTACKS on x86, I'm afraid this could add up to more problems.
>
> Any idea what might be causing this?

At least for the 4 testcases you've mailed me, there are two different
causes:
1) http://gcc.gnu.org/PR30364
2) http://gcc.gnu.org/PR30931
E.g. the snd_emu10k1_init_efx growth is caused by 1), while the
qla1280_nvram_config growth is caused by 2); the other two I believe
are caused by both together.

PR30364 is a fix for the case when -fwrapv isn't used (that option, on
the other hand, can pessimize loops): it is unsafe to reassociate
additions/subtractions if the type doesn't have defined overflow
behavior. Say, with int a, b, (a - 20) + (b - 20) is unsafe to
reassociate into a + b - 40, because for certain values of a and b the
former wouldn't overflow while the latter will.
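
A minimal sketch of the overflow argument above, assuming int is narrower than long long (true on x86). The function names are made up for illustration and are not taken from gcc or the kernel. Each function widens to long long and reports whether the corresponding int evaluation would leave the int range at any step.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* True if v is representable as an int. */
static bool fits_int(long long v)
{
    return v >= INT_MIN && v <= INT_MAX;
}

/* (a - 20) + (b - 20), evaluated in the order written:
 * reports whether any intermediate or final value overflows int. */
bool grouped_form_overflows(int a, int b)
{
    long long x = (long long)a - 20;
    long long y = (long long)b - 20;
    return !fits_int(x) || !fits_int(y) || !fits_int(x + y);
}

/* The reassociated form a + b - 40:
 * reports whether a + b or the final result overflows int. */
bool reassociated_form_overflows(int a, int b)
{
    long long sum = (long long)a + b;
    return !fits_int(sum) || !fits_int(sum - 40);
}

/* One pair of values for which only the reassociated form overflows. */
void reassociation_demo(void)
{
    assert(!grouped_form_overflows(INT_MAX, 20));
    assert(reassociated_form_overflows(INT_MAX, 20));
}
```

With a = INT_MAX and b = 20, the grouped form computes INT_MAX - 20 and 0 and stays in range, while the reassociated form has to compute INT_MAX + 20 first.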

In the qla1280_nvram_config case this is with pointers, which, at least
as richi wrote the patch, are also considered to have undefined overflow
behavior. But gcc 4.1.x/4.2.x (unlike the trunk) still reassociate e.g.

struct A { unsigned int a, b, c; };
struct B { struct A d[12]; };

struct A *
foo (struct B *x, int y)
{
  return &x->d[y - 8];
}

	Jakub