On Wed, Jun 04, 2014 at 12:44:07PM +0200, Giuseppe Scrivano wrote: > "Daniel P. Berrange" <berrange@xxxxxxxxxx> writes: > > >> This series was applied to the libosinfo source. > >> > >> function total time spent in the function > >> > >> osinfo_loader_get_device 32.83% > >> osinfo_db_get_device 2.96% > >> osinfo_db_add_device 4.50% > >> osinfo_db_new 24.97% > >> > >> so creating a new GObject is very slow. I thought about creating just a > >> model object and then clone it instead of going trough the gobject > >> constructors for every instance. Does it sound like a good idea? (I am > >> not still sure if g_type_create_instance will be enough to achieve this, > >> haven't looked into details). > > > > What's the actual wallclock time this demo takes to run for you ? > > > > For me it is a fraction of a second, which seems pretty fast and shouldn't > > be noticable by the user. > > > > $ time ./demo > > > > real 0m0.238s > > user 0m0.218s > > sys 0m0.014s > > It takes approximately the same time for me as well. Indeed, it is not > much time but it seems noticeable in the upstream version of > virt-manager, where the UI seems a bit slower compared to older versions > without libosinfo when the data is accessed for the first time and I > preferred to optimize this in libosinfo instead of a workaround in > virt-manager. > Anyway, this series cuts enough wallclock time that this effect is much > less noticeable now that shouldn't be a problem for virt-manager users > too. FYI before applying this series, oprofile reports 489780 100.000 demo CPU_CLK_UNHALT...| samples| %| ------------------ 173292 35.3816 libc-2.18.so 94375 19.2689 libxml2.so.2.9.1 90366 18.4503 libglib-2.0.so.0.4000.0 50390 10.2883 libgobject-2.0.so.0.4000.0 29642 6.0521 libosinfo-1.0.so.0.2.10 24293 4.9600 no-vmlinux 16796 3.4293 libgio-2.0.so.0.4000.0 10554 2.1548 libpthread-2.18.so 37 0.0076 ld-2.18.so 35 0.0071 [vdso] (tgid:23223 range:0x7fff939fe000-0x7fff939fffff) After applying it we get 355391 100.000 demo CPU_CLK_UNHALT...| samples| %| ------------------ 106098 29.8539 libc-2.18.so 88024 24.7682 libglib-2.0.so.0.4000.0 51930 14.6121 libgobject-2.0.so.0.4000.0 29712 8.3604 libosinfo-1.0.so.0.2.10 26496 7.4555 libxml2.so.2.9.1 25416 7.1516 no-vmlinux 17085 4.8074 libgio-2.0.so.0.4000.0 10546 2.9674 libpthread-2.18.so 45 0.0127 [vdso] (tgid:31109 range:0x7ffffb3e9000-0x7ffffb3eafff) 37 0.0104 ld-2.18.so 1 2.8e-04 demo 1 2.8e-04 libpng16.so.16.6.0 So we've clearly improved the XML parsing by a large degree. Looking at the per-symbol breakdown now we see Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 90000 samples % image name symbol name 25416 7.2170 no-vmlinux /no-vmlinux 24880 7.0648 libc-2.18.so _int_malloc 24609 6.9878 libc-2.18.so malloc 16643 4.7258 libc-2.18.so _int_free 12653 3.5929 libglib-2.0.so.0.4000.0 g_mutex_get_impl 12036 3.4177 libosinfo-1.0.so.0.2.10 osinfo_entity_get_type 10886 3.0911 libglib-2.0.so.0.4000.0 g_private_get_impl 9844 2.7952 libc-2.18.so vfprintf 9441 2.6808 libgobject-2.0.so.0.4000.0 g_type_value_table_peek 6839 1.9420 libglib-2.0.so.0.4000.0 g_hash_table_lookup 6695 1.9011 libglib-2.0.so.0.4000.0 g_str_hash 6213 1.7642 libglib-2.0.so.0.4000.0 g_hash_table_insert_internal 6193 1.7585 libgobject-2.0.so.0.4000.0 g_type_check_instance_is_a 5657 1.6063 libglib-2.0.so.0.4000.0 g_slice_alloc 5229 1.4848 libglib-2.0.so.0.4000.0 g_pointer_bit_unlock 4754 1.3499 libgio-2.0.so.0.4000.0 scan_for_newline 4685 1.3303 libglib-2.0.so.0.4000.0 g_datalist_id_set_data_full 4502 1.2784 libgobject-2.0.so.0.4000.0 g_type_check_instance_cast 3922 1.1137 libglib-2.0.so.0.4000.0 slab_allocator_alloc_chunk 3848 1.0927 libgio-2.0.so.0.4000.0 g_input_stream_get_type 3831 1.0878 libpthread-2.18.so pthread_mutex_lock 3671 1.0424 libpthread-2.18.so pthread_mutex_unlock 3520 0.9995 libosinfo-1.0.so.0.2.10 osinfo_device_get_type 3016 0.8564 libc-2.18.so malloc_consolidate 2773 0.7874 libglib-2.0.so.0.4000.0 g_pointer_bit_lock 2763 0.7846 libpthread-2.18.so pthread_getspecific 2654 0.7536 libc-2.18.so calloc 2593 0.7363 libc-2.18.so _IO_default_xsputn 2459 0.6982 libxml2.so.2.9.1 xmlXPathCompOpEval.part.59 2422 0.6877 libgobject-2.0.so.0.4000.0 g_type_class_ref 2350 0.6673 libc-2.18.so strlen 2269 0.6443 libc-2.18.so __memcpy_sse2_unaligned 2257 0.6409 libglib-2.0.so.0.4000.0 g_hash_table_insert_node 2115 0.6006 libc-2.18.so __GI___strcmp_ssse3 2091 0.5937 libglib-2.0.so.0.4000.0 g_datalist_id_dup_data 2029 0.5761 libgobject-2.0.so.0.4000.0 g_type_class_peek_static 1995 0.5665 libosinfo-1.0.so.0.2.10 osinfo_list_get_type 1967 0.5585 libosinfo-1.0.so.0.2.10 osinfo_db_get_type I believe that much of the libc malloc overhead comes from XML parsing still, but the g_str_hash function appearence does suggest it might be worth optimizing our hash table usage to use g_intern_string or GQuark so we can do plain pointer comparisons instead of strcmp. I'm surprised that osinfo_entity_get_type is soo high up there too. Perhaps we do need to optimize that so we don't call it so frequently during parsing. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| _______________________________________________ Libosinfo mailing list Libosinfo@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libosinfo