
Tsecer's Echo Island (Tsecer's blog)

Revisiting pthread_cancel: how signal handler stack frames are recognized

2015-09-12 09:37:35 | Category: Glibc analysis

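1. Starting again from the automatic destruction of C++ locals at thread exit
In an earlier post I mentioned (really, I only noticed) that when a thread is cancelled with pthread_cancel, the destructors of its local objects still run. At the time I skipped all of the implementation details (and this post will not reveal all of them either). The most important and most obvious detail is that pthread_cancel notifies the target thread with a signal, so the target thread has to perform its stack unwinding from inside a signal handler. As we know, the stack frame of a signal handler is built by the operating system, so there is a gap between it and the thread's "natural" stack. To unwind the stack from within a signal handler, the first basic problem is to recognize and cross that gap, and at first glance this looks fairly tricky.
In glibc's pthread implementation, the cancellation signal handler is installed by __pthread_initialize_minimal_internal in glibc-2.7\nptl\init.c: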
  /* Install the cancellation signal handler.  If for some reason we
     cannot install the handler we do not abort.  Maybe we should, but
     it is only asynchronous cancellation which is affected.  */
  struct sigaction sa;
  sa.sa_sigaction = sigcancel_handler;
  sa.sa_flags = SA_SIGINFO;
  __sigemptyset (&sa.sa_mask);

  (void) __libc_sigaction (SIGCANCEL, &sa, NULL);
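For readers who want to try the same installation pattern outside glibc, here is a minimal sketch that uses only the public sigaction API. It is an illustration under assumptions, not glibc's code: SIGUSR1 stands in for the internal SIGCANCEL, and the names demo_handler, g_seen_signal and install_and_raise are made up for this example.

```cpp
#include <signal.h>

// Hypothetical names for illustration only; none of this is glibc code.
static volatile sig_atomic_t g_seen_signal = 0;

// Three-argument handler, the form selected by SA_SIGINFO.
static void demo_handler(int sig, siginfo_t* /*info*/, void* /*ucontext*/)
{
    g_seen_signal = sig;
}

// Installs the handler for SIGUSR1 (a stand-in for SIGCANCEL), raises the
// signal in the calling thread and reports whether the handler ran.
bool install_and_raise()
{
    struct sigaction sa = {};
    sa.sa_sigaction = demo_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGUSR1, &sa, nullptr) != 0)
        return false;
    if (raise(SIGUSR1) != 0)
        return false;
    return g_seen_signal == SIGUSR1;
}
```

The third handler argument is the ucontext that the kernel saved in the signal frame, which is the same data the unwinder later needs in order to step across that frame.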
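Using the same code as in the previous post, let's step through the signal handling in a debugger. To flesh out this rather thin post, here is a complete copy of that test program: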
[tsecer@Harry pthreadunwind.cpp]$ cat pthreadunwind.cpp 
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct dest {
~dest(){printf("in %s\n",__FUNCTION__);}
}
;
void * worker(void * arg)
{
    dest mylocal;
    printf("willing sleep \n");
    sleep(1000);
    printf("after sleep 1000\n");
    return NULL;
}
int main()
{
    pthread_t thread;
    pthread_create(&thread,NULL,worker,NULL);
    sleep(2);
    printf("killing worker\n");
    pthread_cancel(thread);
    sleep(20000);
}

[tsecer@Harry pthreadunwind.cpp]$ g++ pthreadunwind.cpp -g -lpthread
[tsecer@Harry pthreadunwind.cpp]$ gdb ./a.out 
GNU gdb (GDB) Fedora (7.0-3.fc12)
……
(gdb) b sigcancel_handler
Function "sigcancel_handler" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 1 (sigcancel_handler) pending.
(gdb) r
Starting program: /home/tsecer/CodeTest/pthreadunwind.cpp/a.out 
[Thread debugging using libthread_db enabled]
[New Thread 0xb7fe7b70 (LWP 14739)]
willing sleep 
killing worker
[Switching to Thread 0xb7fe7b70 (LWP 14739)]

Breakpoint 1, 0x003903e3 in sigcancel_handler () from /lib/libpthread.so.0
(gdb) bt
#0  0x003903e3 in sigcancel_handler () from /lib/libpthread.so.0
#1  <signal handler called>
#2  0x00462424 in __kernel_vsyscall ()
#3  0x002a7356 in nanosleep () from /lib/libc.so.6
#4  0x002a7180 in sleep () from /lib/libc.so.6
#5  0x08048614 in worker (arg=0x0) at pthreadunwind.cpp:14
#6  0x00391925 in start_thread () from /lib/libpthread.so.0
#7  0x002e707e in clone () from /lib/libc.so.6
(gdb) 
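2. How glibc performs stack unwinding
Judging from the code, glibc, after some wrapping and indirection, ultimately hands the actual work over to gcc (its unwinder runtime). The handoff happens in __pthread_unwind in glibc-2.7\nptl\unwind.c: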
#ifdef HAVE_FORCED_UNWIND
  /* This is not a catchable exception, so don't provide any details about
     the exception type.  We do need to initialize the field though.  */
  THREAD_SETMEM (self, exc.exception_class, 0);
  THREAD_SETMEM (self, exc.exception_cleanup, unwind_cleanup);

  _Unwind_ForcedUnwind (&self->exc, unwind_stop, ibuf);
#else
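Once we are inside gcc the code gets more complicated. After some disassembly and debugging (I omit the lengthy details here), it turns out that gcc's stack unwinder really does handle the signal frame as a special case. At first glance this seems incredible, rather like writing a party's designated successor into the party charter and the constitution, but that is exactly what gcc does. As long as a feature works in most cases, the implementation does not have to be elegant, and one certainly should not refuse to implement a feature for the sake of so-called elegance. Back to the point: gcc's handling of the signal frame is implemented in gcc-4.1.0\gcc\config\i386\linux-unwind.h: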
static _Unwind_Reason_Code
x86_fallback_frame_state (struct _Unwind_Context *context,
                          _Unwind_FrameState *fs)
{
  unsigned char *pc = context->ra;
  struct sigcontext *sc;
  long new_cfa;

  /* popl %eax ; movl $__NR_sigreturn,%eax ; int $0x80  */
  if (*(unsigned short *)(pc+0) == 0xb858
      && *(unsigned int *)(pc+2) == 119
      && *(unsigned short *)(pc+6) == 0x80cd)
    sc = context->cfa + 4;
  /* movl $__NR_rt_sigreturn,%eax ; int $0x80  */
  else if (*(unsigned char *)(pc+0) == 0xb8
           && *(unsigned int *)(pc+1) == 173
           && *(unsigned short *)(pc+5) == 0x80cd)
    {
      struct rt_sigframe {
        int sig;
        struct siginfo *pinfo;
        void *puc;
        struct siginfo info;
        struct ucontext uc;
      } *rt_ = context->cfa;
      /* The void * cast is necessary to avoid an aliasing warning.
         The aliasing warning is correct, but should not be a problem
         because it does not alias anything.  */
      sc = (struct sigcontext *) (void *) &rt_->uc.uc_mcontext;
    }
  else
    return _URC_END_OF_STACK;
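In other words, gcc checks whether the return address of the frame points at the instruction sequence that issues the sigreturn (or rt_sigreturn) system call. That sequence is exactly the return path the operating system sets up for a signal handler, so finding it identifies the frame as a signal frame, and the saved register context can then be read from the stack at a known offset.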
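The two byte patterns tested in the excerpt can be written out as a small standalone matcher. This is only a sketch of the comparison, not gcc's code: the enum and function names are hypothetical, the bytes are compared one by one so the result does not depend on host endianness, and the caller is assumed to pass a pointer to at least 8 readable bytes.

```cpp
#include <cstring>

// Hypothetical names for illustration; gcc's real code is the excerpt above.
enum class Trampoline { None, Sigreturn, RtSigreturn };

// Classifies the bytes at pc. The caller must guarantee that at least
// 8 bytes are readable at pc.
Trampoline classify_trampoline(const unsigned char* pc)
{
    // popl %eax ; movl $119,%eax ; int $0x80
    static const unsigned char sigreturn_code[] =
        {0x58, 0xb8, 0x77, 0x00, 0x00, 0x00, 0xcd, 0x80};
    // movl $173,%eax ; int $0x80
    static const unsigned char rt_sigreturn_code[] =
        {0xb8, 0xad, 0x00, 0x00, 0x00, 0xcd, 0x80};

    if (std::memcmp(pc, sigreturn_code, sizeof sigreturn_code) == 0)
        return Trampoline::Sigreturn;
    if (std::memcmp(pc, rt_sigreturn_code, sizeof rt_sigreturn_code) == 0)
        return Trampoline::RtSigreturn;
    return Trampoline::None;
}
```

The 16-bit constant 0xb858 in the gcc code is the little-endian view of the bytes 0x58 0xb8, and the system call numbers 119 and 173 appear as 0x77 and 0xad followed by three zero bytes.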
3. How the kernel sets up the signal stack frame
In the setup_frame function of linux-2.6.21\arch\i386\kernel\signal.c:
/*
* This is popl %eax ; movl $,%eax ; int $0x80
*
* WE DO NOT USE IT ANY MORE! It's only left here for historical
* reasons and because gdb uses it as a signature to notice
* signal handler stack frames.
*/
err |= __put_user(0xb858, (short __user *)(frame->retcode+0));
err |= __put_user(__NR_sigreturn, (int __user *)(frame->retcode+2));
err |= __put_user(0x80cd, (short __user *)(frame->retcode+6));
Here __NR_sigreturn is defined in linux-2.6.21\include\asm-i386\unistd.h as:
#define __NR_fsync 118
#define __NR_sigreturn 119
#define __NR_clone 120
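To see that the three __put_user stores produce the byte sequence the unwinders look for, the stores can be replayed in user space. The sketch below is an assumption-laden illustration: put_le16, put_le32 and build_retcode are hypothetical helpers, and the little-endian layout of i386 is spelled out explicitly so that the result is the same on any host.

```cpp
#include <array>
#include <cstdint>

// Value taken from the unistd.h excerpt above.
constexpr std::uint32_t kNrSigreturn = 119;

// Hypothetical helpers: store a value in little-endian order, as i386 does.
static void put_le16(unsigned char* p, std::uint16_t v)
{
    p[0] = static_cast<unsigned char>(v & 0xff);
    p[1] = static_cast<unsigned char>(v >> 8);
}

static void put_le32(unsigned char* p, std::uint32_t v)
{
    for (int i = 0; i < 4; ++i)
        p[i] = static_cast<unsigned char>((v >> (8 * i)) & 0xff);
}

// Replays the three stores from setup_frame into an 8-byte buffer.
std::array<unsigned char, 8> build_retcode()
{
    std::array<unsigned char, 8> code{};
    put_le16(&code[0], 0xb858);        // popl %eax ; opcode byte of movl
    put_le32(&code[2], kNrSigreturn);  // immediate operand of movl
    put_le16(&code[6], 0x80cd);        // int $0x80
    return code;
}
```

The resulting buffer is 58 b8 77 00 00 00 cd 80, the same eight bytes that the gcc check above tests for.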
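4. How gdb recognizes signal stack frames
Judging from the backtrace shown earlier, gdb also recognized the signal handler's frame correctly. Because the marker is so distinctive (the string "signal handler called" appears only once in the gdb sources), finding the signal frame recognition code in gdb is much easier than finding it in gcc. I again skip the details and only list the key code: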
gdb-6.0\gdb\i386-linux-tdep.c
/* Recognizing signal handler frames.  */

/* GNU/Linux has two flavors of signals.  Normal signal handlers, and
   "realtime" (RT) signals.  The RT signals can provide additional
   information to the signal handler if the SA_SIGINFO flag is set
   when establishing a signal handler using `sigaction'.  It is not
   unlikely that future versions of GNU/Linux will support SA_SIGINFO
   for normal signals too.  */

/* When the i386 Linux kernel calls a signal handler and the
   SA_RESTORER flag isn't set, the return address points to a bit of
   code on the stack.  This function returns whether the PC appears to
   be within this bit of code.

   The instruction sequence for normal signals is
       pop    %eax
       mov    $0x77, %eax
       int    $0x80
   or 0x58 0xb8 0x77 0x00 0x00 0x00 0xcd 0x80.

   Checking for the code sequence should be somewhat reliable, because
   the effect is to call the system call sigreturn.  This is unlikely
   to occur anywhere other than a signal trampoline.

   It kind of sucks that we have to read memory from the process in
   order to identify a signal trampoline, but there doesn't seem to be
   any other way.  The PC_IN_SIGTRAMP macro in tm-linux.h arranges to
   only call us if no function name could be identified, which should
   be the case since the code is on the stack.

   Detection of signal trampolines for handlers that set the
   SA_RESTORER flag is in general not possible.  Unfortunately this is
   what the GNU C Library has been doing for quite some time now.
   However, as of version 2.1.2, the GNU C Library uses signal
   trampolines (named __restore and __restore_rt) that are identical
   to the ones used by the kernel.  Therefore, these trampolines are
   supported too.  */

#define LINUX_SIGTRAMP_INSN0 0x58 /* pop %eax */
#define LINUX_SIGTRAMP_OFFSET0 0
#define LINUX_SIGTRAMP_INSN1 0xb8 /* mov $NNNN, %eax */
#define LINUX_SIGTRAMP_OFFSET1 1
#define LINUX_SIGTRAMP_INSN2 0xcd /* int */
#define LINUX_SIGTRAMP_OFFSET2 6

static const unsigned char linux_sigtramp_code[] =
{
  LINUX_SIGTRAMP_INSN0, /* pop %eax */
  LINUX_SIGTRAMP_INSN1, 0x77, 0x00, 0x00, 0x00, /* mov $0x77, %eax */
  LINUX_SIGTRAMP_INSN2, 0x80 /* int $0x80 */
};
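The OFFSET macros suggest that the PC being examined may sit on any of the three instructions of the trampoline, not only on its first byte. The following is a rough sketch of how such a lookup could work. It is not gdb's actual function: it searches a plain byte vector instead of target memory, and find_sigtramp_start is a hypothetical name.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

static const unsigned char kSigtrampCode[8] =
    {0x58, 0xb8, 0x77, 0x00, 0x00, 0x00, 0xcd, 0x80};

// Returns the index in mem where the trampoline starts if pc lies on one of
// its three instruction boundaries, or -1 otherwise.
long find_sigtramp_start(const std::vector<unsigned char>& mem, std::size_t pc)
{
    if (pc >= mem.size())
        return -1;

    // Use the opcode under pc to guess which instruction we are on.
    std::size_t adjust;
    switch (mem[pc]) {
    case 0x58: adjust = 0; break;  // pop %eax       (offset 0)
    case 0xb8: adjust = 1; break;  // mov $NNNN,%eax (offset 1)
    case 0xcd: adjust = 6; break;  // int $0x80      (offset 6)
    default:   return -1;
    }
    if (pc < adjust)
        return -1;

    // Confirm the guess by comparing the whole sequence.
    const std::size_t start = pc - adjust;
    if (mem.size() - start < sizeof kSigtrampCode)
        return -1;
    if (std::memcmp(&mem[start], kSigtrampCode, sizeof kSigtrampCode) != 0)
        return -1;
    return static_cast<long>(start);
}
```

Comparing the full sequence after the adjustment keeps a stray 0xb8 or 0xcd byte elsewhere in memory from being mistaken for a trampoline.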
5. Does pthread_exit perform stack unwinding?
glibc-2.7\nptl\pthread_exit.c
void
__pthread_exit (value)
     void *value;
{
  THREAD_SETMEM (THREAD_SELF, result, value);

  __do_cancel ();
}
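This __do_cancel is the same function that the pthread_cancel signal handler calls to do the real work, so pthread_exit destroys local objects as well. The relevant part of the handler: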
/* For asynchronous cancellation we use a signal.  This is the handler.  */
static void
sigcancel_handler (int sig, siginfo_t *si, void *ctx)
{
……
      if (curval == oldval)
        {
          /* Set the return value.  */
          THREAD_SETMEM (self, result, PTHREAD_CANCELED);

          /* Make sure asynchronous cancellation is still enabled.  */
          if ((newval & CANCELTYPE_BITMASK) != 0)
            /* Run the registered destructors and terminate the thread.  */
            __do_cancel ();

          break;
        }
……
}
The test code is a simple modification of the previous program; please bear with it:
[tsecer@Harry pthreadunwind.cpp]$ cat pthreadexit.cpp 
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct dest {
~dest(){printf("in %s\n",__FUNCTION__);}
}
;
void * worker(void * arg)
{
    dest mylocal;
    printf("exiting \n");
    pthread_exit(NULL);
    sleep(1000);
    printf("after exit 1000\n");
}
int main()
{
    pthread_t thread;
    pthread_create(&thread,NULL,worker,NULL);
    sleep(2);
    printf("killing worker\n");
    pthread_cancel(thread);
    sleep(20000);
}

[tsecer@Harry pthreadunwind.cpp]$ g++ pthreadexit.cpp -g -lpthread -o pthreadexit
[tsecer@Harry pthreadunwind.cpp]$ ./pthreadexit 
exiting 
in ~dest
^C
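The handler excerpt above stores PTHREAD_CANCELED as the thread's result, and that value can be observed from the outside by joining the cancelled thread. Here is a minimal sketch, assuming glibc/NPTL behavior for the destructor part (other C libraries may not run C++ destructors on cancellation); Tracker, sleeper, CancelResult and cancel_and_join are hypothetical names.

```cpp
#include <pthread.h>
#include <unistd.h>
#include <atomic>

// Hypothetical names for illustration only.
static std::atomic<bool> g_destructor_ran{false};

struct Tracker {
    ~Tracker() { g_destructor_ran.store(true); }
};

static void* sleeper(void*)
{
    Tracker local;   // destroyed during the forced unwind on glibc
    sleep(1000);     // sleep() is a cancellation point
    return nullptr;
}

struct CancelResult {
    bool canceled;        // join returned PTHREAD_CANCELED
    bool destructor_ran;  // ~Tracker was observed (glibc behavior)
};

// Starts a worker, cancels it, joins it and reports what happened.
CancelResult cancel_and_join()
{
    CancelResult r = {false, false};
    pthread_t t;
    if (pthread_create(&t, nullptr, sleeper, nullptr) != 0)
        return r;

    pthread_cancel(t);

    void* ret = nullptr;
    if (pthread_join(t, &ret) != 0)
        return r;

    r.canceled = (ret == PTHREAD_CANCELED);
    r.destructor_ran = g_destructor_ran.load();
    return r;
}
```

Joining instead of sleeping in main makes the check deterministic: pthread_join only returns after the worker has finished unwinding. Build it with g++ -pthread.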
