Michael, thank you for the reply! My comments are below.
Michael Blizek wrote:
The problem happens at the line at c026023c:
if (unlikely(buf->offset + length > chan->subbuf_size))
c026023c: 8b 55 08 mov 0x8(%ebp),%edx
c026023f: 01 da add %ebx,%edx
c0260241: 3b 50 04 cmp 0x4(%eax),%edx
c0260244: 76 0b jbe c0260251 <_ipfix_send_msg+0x62>
...
The error is in relay_write which is inside _ipfix_send_msg in the assembly
due to inlining.
Yes, I'm aware of that.
static inline void relay_write(struct rchan *chan,
                               const void *data,
                               size_t length)
{
        unsigned long flags;
        struct rchan_buf *buf;

        local_irq_save(flags);
        buf = chan->buf[smp_processor_id()];
        if (unlikely(buf->offset + length > chan->subbuf_size))
                length = relay_switch_subbuf(buf, length);
Here it is:
register states after the crash:
eax = ee5d4a00
edx = 00000001
ebp = 0000332e
buf = chan->buf[smp_processor_id()];
c0260231: 64 8b 15 04 60 3e c0 mov %fs:0xc03e6004,%edx
load smp_processor_id() into edx (result value is 1, meaning it is the second
cpu, because counting starts at 0)
c0260238: 8b 6c 90 20 mov 0x20(%eax,%edx,4),%ebp
eax stores chan
The instruction means dereference what is in eax + 20(hex) + edx*4 and store
it in ebp. ebp then contains buf (20 is probably the offset of buf). ebp
contains 0000332e afterwards, which does not look like a valid address.
if (unlikely(buf->offset + length > chan->subbuf_size))
c026023c: 8b 55 08 mov 0x8(%ebp),%edx
This line means dereference ebp + 8 (8 is probably the offset of "offset") and
store it in edx. Here it crashes, because ebp does not contain a valid address.
==> You probably have not initialised all chan->buf entries or made
chan->buf too small.
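
The address arithmetic in the walk-through above can be reproduced in user
space. This is only an illustrative sketch: the helper names
(effective_address, buf_slot_address, offset_field_address) are made up, and
the 0x20 and 0x8 displacements are taken from the disassembly, where they
are presumed (not confirmed) to be the offsets of "buf" and "offset".

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute a 32-bit x86 effective address of the form disp(base,index,scale),
 * i.e. base + disp + index * scale, wrapping at 32 bits.
 */
static uint32_t effective_address(uint32_t base, uint32_t disp,
                                  uint32_t index, uint32_t scale)
{
        /* x86 only allows scale factors of 1, 2, 4 or 8 */
        assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
        return base + disp + index * scale;
}

/* mov 0x20(%eax,%edx,4),%ebp : address of the slot read to obtain buf */
static uint32_t buf_slot_address(uint32_t chan, uint32_t cpu)
{
        return effective_address(chan, 0x20, cpu, 4);
}

/* mov 0x8(%ebp),%edx : address of the field read at buf + 8 (the fault) */
static uint32_t offset_field_address(uint32_t buf)
{
        return effective_address(buf, 0x8, 0, 1);
}
```

With the register values from the crash, buf_slot_address(0xee5d4a00, 1)
gives 0xee5d4a24, and offset_field_address(0x0000332e) gives 0x00003336,
which is far too low to be a kernel pointer and is consistent with the fault.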
BTW: Linux has a built in per-cpu "library": http://lwn.net/Articles/258238/
Very detailed explanation, thank you. I'm not that good with asm, so this is
very valuable information for me. As far as I can tell, I use the relay
subsystem correctly (as documented in Documentation/filesystems/relay.txt):
it works for some time and then crashes (only on an SMP system). Can I safely
assume this is a relay bug and report it?
I initialize it with:
#define RELAY_BUF_SZ 65536
#define NR_RELAY_BUF 16
nf_pool.rchan = relay_open("netflow", NULL, RELAY_BUF_SZ,
                           NR_RELAY_BUF,
                           &nf_relay_callbacks, NULL);
The nf_relay_callbacks are taken from the documentation as well:
static struct rchan_callbacks nf_relay_callbacks = {
        .subbuf_start = nf_subbuf_start_callback,
        .create_buf_file = nf_create_buf_file_callack,
        .remove_buf_file = nf_remove_buf_file_callback,
};
static int nf_subbuf_start_callback(struct rchan_buf *buf, void *subbuf,
                                    void *prev_subbuf, size_t prev_padding)
{
        struct nf_netlink_msg m;

        memset(&m, 0, sizeof(m));
        /* Store the padding of the finished sub-buffer in its reserved header */
        if (prev_subbuf)
                *((unsigned *)prev_subbuf) = prev_padding;
        /* Buffer full: data is lost, going to overwrite */
        if (relay_buf_full(buf))
                atomic_inc(&nf_pool.lost_rec);
        /* Reserve room at the start of the new sub-buffer for its padding */
        subbuf_start_reserve(buf, sizeof(unsigned int));
        m.cpu = buf->cpu;
        if (likely(buf->subbufs_produced > 0))
                nf_send_msg(&m);
        return 1;
}
static struct dentry *nf_create_buf_file_callack(const char *filename,
                                                 struct dentry *parent,
                                                 int mode,
                                                 struct rchan_buf *buf,
                                                 int *is_global)
{
        return debugfs_create_file(filename, mode, parent, buf,
                                   &relay_file_operations);
}
static int nf_remove_buf_file_callback(struct dentry *dentry)
{
        debugfs_remove(dentry);
        return 0;
}
Then I just call relay_write() when I need to write something.
relay_write(nf_pool.rchan, buf->buf, buf->buflen);
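
For completeness, here is roughly how the reader side can use the header
that the subbuf_start callback reserves. This is a user-space sketch under
assumptions: the helper name nf_subbuf_payload is hypothetical, and it
assumes the reader is handed one complete, already finished sub-buffer whose
first sizeof(unsigned int) bytes hold the padding count written by the
callback.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical reader-side helper: given one finished sub-buffer, return a
 * pointer to its payload and store the payload length in *len.
 * Layout assumed: [padding count][payload][padding bytes].
 * Returns NULL if the stored padding count cannot be valid.
 */
static const unsigned char *nf_subbuf_payload(const unsigned char *subbuf,
                                              size_t subbuf_size, size_t *len)
{
        unsigned int padding;

        if (subbuf_size < sizeof(padding))
                return NULL;

        /* header reserved by subbuf_start_reserve(), filled with prev_padding */
        memcpy(&padding, subbuf, sizeof(padding));
        if (padding > subbuf_size - sizeof(padding))
                return NULL;

        *len = subbuf_size - sizeof(padding) - padding;
        return subbuf + sizeof(padding);
}
```

Note that the padding count is only written when the next sub-buffer starts,
so this does not apply to the sub-buffer that is still being filled.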
Thanks,
-- Alexey.