[PATCH RFCv2 00/27] memory optimizations

this patch series aims to save memory allocations and some system calls
related to PA's client/server protocol implementation

v2 incorporates David's and Alexander's suggestions; no functional change to
the first 17 patches; v2 then adds some cleanup and two optimizations


patches 1 to 5 ('tagstruct:') introduce a new tagstruct type _APPENDED
which can hold tagstruct data up to a certain size; tagstructs are now 
kept in a specific free-list -- this typically replaces two malloc()/free()s
with one flist push()/pop()
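
a minimal sketch of the free-list idea, not the actual flist code: the payload
lives inside a fixed-size object, and objects are recycled through a small
stack. All names here (item, item_new(), item_free(), ITEM_INLINE_SIZE,
POOL_MAX) are hypothetical, and unlike PA's flist this version is
single-threaded only.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sizes; the real limits are defined by the patches. */
#define ITEM_INLINE_SIZE 128
#define POOL_MAX 16

/* Header and payload share one allocation (the "appended" idea). */
typedef struct item {
    size_t length;
    unsigned char data[ITEM_INLINE_SIZE];
} item;

/* Single-threaded stack of recycled items (assumption: no locking needed). */
static item *pool[POOL_MAX];
static size_t pool_n = 0;

/* Pop an item from the pool; fall back to malloc() when the pool is empty. */
static item *item_new(void) {
    item *it = pool_n > 0 ? pool[--pool_n] : malloc(sizeof(item));
    if (it)
        it->length = 0;
    return it;
}

/* Push an item back to the pool; fall back to free() when the pool is full. */
static void item_free(item *it) {
    if (!it)
        return;
    if (pool_n < POOL_MAX)
        pool[pool_n++] = it;
    else
        free(it);
}
```

once the pool is warm, an item_new()/item_free() pair touches only the pool
array, which is the effect the tagstruct and packet patches aim for.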

patches 6 to 8 ('packet:') make packets fixed-size (typically); packets are
kept in a specific free-list -- this replaces one malloc()/free() with one
flist push()/pop()

patches 9 to 14 ('pstream:') allow sending tagstructs directly to a pstream
without encapsulation in a packet -- this saves one flist push()/pop()

patches 15 and 16 ('pstream:') often save a read() call by reading more than
just the descriptor (up to 40 bytes, e.g. descriptor (20 bytes) + shm
info (16 bytes)); the idea is similar to b4342845d, "Optimize write
of smaller packages", but for read -- this trades some extra memcpy() for
a read()
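
the read-ahead can be sketched as follows, assuming a hypothetical reader
struct with a 40-byte minibuffer; the sizes come from the description above,
while reader, reader_fill() and reader_take() are illustrative names and not
the functions added by the patches.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define DESCRIPTOR_SIZE 20 /* size quoted in the description above */
#define MINIBUF_SIZE 40    /* upper bound quoted in the description above */

typedef struct reader {
    unsigned char minibuf[MINIBUF_SIZE];
    size_t fill; /* bytes currently held in minibuf */
    size_t pos;  /* bytes already handed out */
} reader;

/* Refill the minibuffer with a single read(); returns read()'s result. */
static ssize_t reader_fill(int fd, reader *r) {
    ssize_t n = read(fd, r->minibuf, sizeof(r->minibuf));
    r->pos = 0;
    r->fill = n > 0 ? (size_t) n : 0;
    return n;
}

/* Copy up to n buffered bytes to dst; returns the number of bytes copied. */
static size_t reader_take(reader *r, void *dst, size_t n) {
    size_t avail = r->fill - r->pos;
    if (n > avail)
        n = avail;
    memcpy(dst, r->minibuf + r->pos, n);
    r->pos += n;
    return n;
}
```

a caller would take DESCRIPTOR_SIZE bytes first and then serve the following
small payload from the same buffer, paying one memcpy() instead of a second
read().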

patch 17 ('iochannel') fixes a strange behaviour in iochannel/mainloop that
deleted the input_event with every read which caused a rebuild of the pollfds
for every read()!

v2 material:

patches 18 to 20 ('queue', 'pstream') aim to combine two write items into one
minibuffer by peeking ahead in the send queue
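
peeking can be illustrated with a plain linked-list queue; this is a sketch
under the assumption that peek returns the front element without removing it.
The names queue, queue_push(), queue_pop() and queue_peek() are hypothetical
and do not reflect the signatures in the patches.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct node {
    void *data;
    struct node *next;
} node;

typedef struct queue {
    node *front, *back;
} queue;

/* Append data at the back; returns 0 on success, -1 if malloc() fails. */
static int queue_push(queue *q, void *data) {
    node *n = malloc(sizeof(*n));
    if (!n)
        return -1;
    n->data = data;
    n->next = NULL;
    if (q->back)
        q->back->next = n;
    else
        q->front = n;
    q->back = n;
    return 0;
}

/* Remove and return the front element, or NULL if the queue is empty. */
static void *queue_pop(queue *q) {
    node *n = q->front;
    if (!n)
        return NULL;
    void *data = n->data;
    q->front = n->next;
    if (!q->front)
        q->back = NULL;
    free(n);
    return data;
}

/* Return the front element without removing it, or NULL if empty. */
static void *queue_peek(const queue *q) {
    return q->front ? q->front->data : NULL;
}
```

a sender could pop the current item, peek at the next one, and pop that as
well only if both fit into the minibuffer together.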

patch 21 inlines pa_run_once() as this function came out high in profiling

patch 25 ('mainloop') only clears the wakeup pipe when poll() indicates that
the pipe is readable; if the only ready file descriptor is the wakeup pipe,
searching io_events can be avoided
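
a rough sketch of that check, using plain poll(); clear_wakeup_if_ready() and
need_io_dispatch() are hypothetical helper names and the real mainloop code is
structured differently.

```c
#include <poll.h>
#include <unistd.h>

/* Read from the wakeup pipe only if poll() reported it readable.
 * Returns 1 if bytes were consumed, 0 if the pipe was left alone.
 * A single read() of up to 64 bytes is assumed to be enough here. */
static int clear_wakeup_if_ready(const struct pollfd *wakeup) {
    char buf[64];
    if (!(wakeup->revents & POLLIN))
        return 0; /* nothing pending: skip the read() system call */
    ssize_t n = read(wakeup->fd, buf, sizeof(buf));
    return n > 0 ? 1 : 0;
}

/* nready is poll()'s return value. If the wakeup pipe accounts for the
 * only ready descriptor, there is no need to walk the io_events list. */
static int need_io_dispatch(int nready, const struct pollfd *wakeup) {
    int wakeup_ready = (wakeup->revents & POLLIN) ? 1 : 0;
    return nready - wakeup_ready > 0;
}
```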

patches 26 and 27 ('flist') remove the volatile annotation and make flist_elem
attributes non-atomic -- needed?


with these patches typical playback (i.e. after setup) runs without any malloc()/free()
thanks to the use of free-lists; the number of memory management operations is reduced

no benchmarking yet; on i7 64-bit, 'paplay --latency-msec=10 48KHz.wav' improved from
around 12% CPU to around 10% CPU for me; I plan to benchmark this on ARM soonish



Peter Meerwald (27):
  tagstruct: Distinguish pa_tagstruct_new() use cases
  tagstruct: Replace dynamic flag with type
  tagstruct: Get rid of pa_tagstruct_free_data()
  tagstruct: Add type _APPENDED
  tagstruct: Use flist to potentially save calls to malloc()/free()
  packet: Hide internals of pa_packet, introduce pa_packet_data()
  packet: Make pa_packet_new() create fixed-size packets
  packet: Introduce pa_packet_new_data() to copy data into a newly
    created packet
  packet: Use flist to save calls to malloc()/free()
  pstream: Unionize item_info
  pstream: Add pa_pstream_send_tagstruct()
  pstream: #define PA_PSTREAM_SHM_SIZE
  pstream: Duplicate assignment, write.data is always NULL
  pstream: Only reset memchunk if it has been used
  pstream: Split up do_read()
  pstream: Use small minibuffer to combine several read()s if possible
  iochannel: Fix channel enable
  queue: Add pa_queue_peek() function
  pstream: Add helper functions reset_descriptor(), shm_descriptor()
  pstream: Peek into next item on send queue to see if it can be put
    into minibuffer together with current item
  once: Inline functions
  rtpoll: typo
  rtpoll: Fix condition for DEBUG_TIMING output
  rtpoll: Drop extra wait_op argument to pa_rtpoll_run()
  mainloop: Clear wakeup pipe only when necessary
  flist: Don't use atomic operations to manipulate ptr, next
  flist: Don't make flist volatile

 src/modules/alsa/alsa-sink.c                 |   2 +-
 src/modules/alsa/alsa-source.c               |   2 +-
 src/modules/module-card-restore.c            |   4 +-
 src/modules/module-combine-sink.c            |   2 +-
 src/modules/module-device-manager.c          |  12 +-
 src/modules/module-device-restore.c          |  16 +-
 src/modules/module-esound-sink.c             |   2 +-
 src/modules/module-null-sink.c               |   2 +-
 src/modules/module-null-source.c             |   2 +-
 src/modules/module-pipe-sink.c               |   2 +-
 src/modules/module-pipe-source.c             |   2 +-
 src/modules/module-sine-source.c             |   2 +-
 src/modules/module-stream-restore.c          |  12 +-
 src/modules/module-tunnel.c                  |  54 +-
 src/modules/oss/module-oss.c                 |   2 +-
 src/modules/raop/module-raop-sink.c          |   2 +-
 src/pulse/context.c                          |  26 +-
 src/pulse/ext-device-manager.c               |  14 +-
 src/pulse/ext-device-restore.c               |  10 +-
 src/pulse/ext-stream-restore.c               |  10 +-
 src/pulse/introspect.c                       |  82 +--
 src/pulse/mainloop.c                         |  24 +-
 src/pulse/scache.c                           |  10 +-
 src/pulse/stream.c                           |  22 +-
 src/pulse/subscribe.c                        |   2 +-
 src/pulsecore/flist.c                        |  14 +-
 src/pulsecore/flist.h                        |   2 +-
 src/pulsecore/iochannel.c                    |  35 +-
 src/pulsecore/once.c                         |  18 +-
 src/pulsecore/once.h                         |  27 +-
 src/pulsecore/packet.c                       |  55 +-
 src/pulsecore/packet.h                       |  20 +-
 src/pulsecore/pdispatch.c                    |   9 +-
 src/pulsecore/protocol-native.c              |  94 ++--
 src/pulsecore/pstream-util.c                 |  33 +-
 src/pulsecore/pstream-util.h                 |   2 -
 src/pulsecore/pstream.c                      | 728 +++++++++++++++++----------
 src/pulsecore/pstream.h                      |   2 +
 src/pulsecore/queue.c                        |  11 +
 src/pulsecore/queue.h                        |   3 +
 src/pulsecore/rtpoll.c                       |  14 +-
 src/pulsecore/rtpoll.h                       |   7 +-
 src/pulsecore/tagstruct.c                    |  67 ++-
 src/pulsecore/tagstruct.h                    |   4 +-
 src/tests/rtpoll-test.c                      |   4 +-
 src/tests/srbchannel-test.c                  |  21 +-
 47 files changed, 919 insertions(+), 616 deletions(-)

-- 
1.9.1


