Author: Date: Subject:
Ingo Molnar
2008-07-18 11:09:51
Re: [bug, netconsole, SLUB] BUG skbuff_head_cache: Poison
overwritten
* Pekka Enberg <[email protected]> wrote:
> On Fri, Jul 18, 2008 at 8:46 AM, Evgeniy Polyakov <[email protected]> wrote:
> > Does SLUB have a debug check at freeing time? If so, how does it work
> > and why didn't it caught use after free there?
>
> You can't detect use after free before the object is actually free'd
> ;-)
yeah, we want to check use-after free at the next allocation point -
i.e. as late as possible to gather all corruptions that happened
meanwhile.
We could in theory have a SLUB debug mode where a SCHED_IDLE kernel
thread would periodically check all free objects (of that CPU) in the
background to ensure their integrity. That would catch corruptions
sooner, with a possibly still meaningful context to print out. [right
after the IRQ or process that corrupts them finishes running]
It could also be hooked into ftrace to print out the last few hundred
kernel function calls executed prior any corruption. ftrace/slub-debug
plugin perhaps?
Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Author: Date: Subject:
Pekka Enberg
2008-07-18 12:15:26
Re: [bug, netconsole, SLUB] BUG skbuff_head_cache: Poison overwritten
Hi Ingo,
On Fri, Jul 18, 2008 at 12:09 PM, Ingo Molnar <[email protected]> wrote:
> yeah, we want to check use-after free at the next allocation point -
> i.e. as late as possible to gather all corruptions that happened
> meanwhile.
>
> We could in theory have a SLUB debug mode where a SCHED_IDLE kernel
> thread would periodically check all free objects (of that CPU) in the
> background to ensure their integrity. That would catch corruptions
> sooner, with a possibly still meaningful context to print out. [right
> after the IRQ or process that corrupts them finishes running]
>
> It could also be hooked into ftrace to print out the last few hundred
> kernel function calls executed prior any corruption. ftrace/slub-debug
> plugin perhaps?
Well, there's this Norwegian guy, Vegard, who has written a small
piece of code that can detect use-after-free _as it happens_. I think
he calls the thing kmemcheck :-).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Author: Date: Subject:
Evgeniy Polyakov
2008-07-18 14:16:24
Re: [bug, netconsole, SLUB] BUG skbuff_head_cache: Poison overwritten
Hi Pekka.
On Fri, Jul 18, 2008 at 12:02:26PM +0300, Pekka Enberg ([email protected]) wrote:
> > Out of curiosity, why does it scream at allocation time?
>
> Because it's checking for use-after-free errors. The object is
> poisoned with POISON_FREE when it's free'd and we verify the poison
> values at allocation time.
Does it also scream on double free event? Just to closer guilty
circles... 0x9c offset is somewhere at the very end of the skbuff
structure, likely skb->users.
Can you also check in some kind of this patch to catch freed skb freeing
for testing?
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 3666216..dda96bf 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -419,6 +419,14 @@ void kfree_skb(struct sk_buff *skb)
{
if (unlikely(!skb))
return;
+
+ {
+ u8 *ptr = (u8 *)(&skb->users);
+
+ if (*ptr == POISON_FREE || *ptr == POISON_INUSE || *ptr == POISON_END)
+ BUG();
+ }
+
if (likely(atomic_read(&skb->users) == 1))
smp_rmb();
else if (likely(!atomic_dec_and_test(&skb->users)))
--
Evgeniy Polyakov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Author: Date: Subject:
Pekka Enberg
2008-07-18 17:44:44
Re: [bug, netconsole, SLUB] BUG skbuff_head_cache: Poison overwritten
Hi Evgeniy,
On Fri, Jul 18, 2008 at 12:02:26PM +0300, Pekka Enberg
([email protected]) wrote:
>> > Out of curiosity, why does it scream at allocation time?
>>
>> Because it's checking for use-after-free errors. The object is
>> poisoned with POISON_FREE when it's free'd and we verify the poison
>> values at allocation time.
On Fri, Jul 18, 2008 at 1:16 PM, Evgeniy Polyakov <[email protected]> wrote:
> Does it also scream on double free event? Just to closer guilty
> circles... 0x9c offset is somewhere at the very end of the skbuff
> structure, likely skb->users.
Yeah. See the free_debug_processing() function in mm/slub.c for
details (the on_freelist() part). However, if you look at slab_free()
you can see that in the SLUB fast-path we don't do any of these
debugging checks. So you can end up with slab corruption without a
nice error message.
Pekka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Author: Date: Subject:
Christoph Lameter
2008-07-18 09:48:21
Re: [bug, netconsole, SLUB] BUG skbuff_head_cache: Poison overwritten
Pekka Enberg wrote:
> Yeah. See the free_debug_processing() function in mm/slub.c for
> details (the on_freelist() part). However, if you look at slab_free()
> you can see that in the SLUB fast-path we don't do any of these
> debugging checks. So you can end up with slab corruption without a
> nice error message.
The slub fastpath is not used when debugging is enabled.
Without debugging on double frees will typically corrupt the freepointer. So you get an invalid pointer reference in __slab_free.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Author: Date: Subject:
Evgeniy Polyakov
2008-07-18 20:07:47
Re: [bug, netconsole, SLUB] BUG skbuff_head_cache: Poison overwritten
Hi Pekka.
On Fri, Jul 18, 2008 at 05:44:44PM +0300, Pekka Enberg ([email protected]) wrote:
> Yeah. See the free_debug_processing() function in mm/slub.c for
> details (the on_freelist() part). However, if you look at slab_free()
> you can see that in the SLUB fast-path we don't do any of these
> debugging checks. So you can end up with slab corruption without a
> nice error message.
Yup, I see.
Actually with the network there will be no duuble free as long as uses
use kfree_skb(), since first freeing (with skb->users being equal to 1,
so user owns this skb) will fill skb with 0x6b, so any subsequent
kfree_skb() will try to decrement particulary huge reference counter
(0x6b6b6b6b) and actually will never reach 1 again, so things will look
quite clear, but if the same slub area will be reused for the next skb
before it is freed second time, very hard-to-debug data corruption will
happen, so I belive this kind of checks should be done into kfree_skb()
(of course under special debug compile option). I posted that patch just
for the reference and that Ingo would be able to detect where skb is
used after already being freed.
--
Evgeniy Polyakov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Author: Date: Subject:
Pekka J Enberg
2008-07-18 12:00:33
Re: [bug, netconsole, SLUB] BUG skbuff_head_cache: Poison overwritten
Hi Ingo,
On Thu, 17 Jul 2008, Ingo Molnar wrote:
> A regression to v2.6.26:
>
> I started getting this skb-head corruption message today, on a T60
> laptop with e1000:
[snip]
On Thu, 17 Jul 2008, Ingo Molnar wrote:
> Perhaps SLUB debugging got smarter?
Nope.
On Thu, 17 Jul 2008, Ingo Molnar wrote:
> PM: Removing info for No Bus:vcs11
> device: 'vcs11': device_create_release
> =============================================================================
> BUG skbuff_head_cache: Poison overwritten
> -----------------------------------------------------------------------------
>
> INFO: 0xf658ae9c-0xf658ae9c. First byte 0x6a instead of 0x6b
0x6b is POISON_FREE so 0x6a is one bit corruption.
> INFO: Allocated in __alloc_skb+0x2c/0x110 age=0 cpu=0 pid=5098
> INFO: Freed in __kfree_skb+0x31/0x80 age=0 cpu=1 pid=4440
> INFO: Slab 0xc16cc140 objects=16 used=1 fp=0xf658ae00 flags=0x400000c3
> INFO: Object 0xf658ae00 @offset=3584 fp=0xf658af00
>
> Bytes b4 0xf658adf0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
> Object 0xf658ae00: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
> Object 0xf658ae10: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
> Object 0xf658ae20: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
> Object 0xf658ae30: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
> Object 0xf658ae40: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
> Object 0xf658ae50: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
> Object 0xf658ae60: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
> Object 0xf658ae70: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
It's bit unfortunate that we don't see full dump of the corruption here
because SLUB limits the output to 128 bytes. Ingo, you might want to try
this patch so that we can see all of it:
diff --git a/mm/slub.c b/mm/slub.c
index 5f6e2c4..f69d181 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -492,7 +492,7 @@ static void print_trailer(struct kmem_cache *s, struct page *page, u8 *p)
if (p > addr + 16)
print_section("Bytes b4", p - 16, 16);
- print_section("Object", p, min(s->objsize, 128));
+ print_section("Object", p, s->objsize);
if (s->flags & SLAB_RED_ZONE)
print_section("Redzone", p + s->objsize,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
|