Tsecer's Echo Island

Tsecer's Blog

 
 
 

Reconstructing the scene after a backend CGI dies under Apache

2013-01-27 21:30:58 | Category: Apache analysis

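1. Apache's implementation
In its default configuration Apache does not run CGI programs; "CGI mode" here means that mod_cgid is loaded. If the module is not loaded and you drop a file into the cgi-bin directory, then when a browser requests that file httpd does not execute it and return its output. Interestingly, the browser simply downloads the requested CGI file in its entirety. I am only describing the symptom here; the detailed analysis is fairly convoluted, so I will not go into it for now.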
2. How cgid works
Waiting for requests
(gdb) bt
#0  0x0095a424 in __kernel_vsyscall ()
#1  0x003990d1 in accept () from /lib/libpthread.so.0
#2  0x0016a41b in cgid_server (data=0x80f3ea8) at mod_cgid.c:686
#3  0x0016ad46 in cgid_start (p=0x80cf0a8, main_server=0x80f3ea8,
    procnew=0x80eebe8) at mod_cgid.c:876
#4  0x0016b06c in cgid_init (p=0x80cf0a8, plog=0x812faa0, ptemp=0x8133ab0,
    main_server=0x80f3ea8) at mod_cgid.c:939
#5  0x0808ab6f in ap_run_post_config (pconf=0x80cf0a8, plog=0x812faa0,
    ptemp=0x8133ab0, s=0x80f3ea8) at config.c:105
#6  0x08068e41 in main (argc=2, argv=0xbffff3b4) at main.c:765
(gdb)
  │   │   └─gdb,23680 -p 23597
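  │   │       └─httpd,23903 -X   <-- this is the main process, i.e. the httpd started from the command line
  │   │           ├─httpd,23904 -X   <-- this is the familiar cgid process; the mod_cgid code shows that cgid lives in a process of its own, or, put more abstractly, it is a new process created with fork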
  │   │           │   ├─first.pl,23951 /usr/local/apache2/cgi-bin/first.pl
  │   │           │   ├─first.pl,23959 /usr/local/apache2/cgi-bin/first.pl
  │   │           │   ├─first.pl,23976 /usr/local/apache2/cgi-bin/first.pl
  │   │           │   └─mysleeper,23975

(gdb) bt
#0  0x0095a424 in __kernel_vsyscall ()
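#1  0x003990d1 in accept () from /lib/libpthread.so.0   <-- once cgid is running as its own process, it calls accept on a Unix socket agreed on with the main process and waits for worker processes to connect. A worker that connects sends the request details to cgid, and cgid then forks a child that becomes the executable named by the CGI request. At that point cgid's job as a front door is done: the worker thread and the new CGI process talk to each other directly over the socket, because the connection that cgid accepted from the worker has been handed to the real CGI process. The two ends are connected, and cgid drops out of the exchange.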
#2  0x0016a41b in cgid_server (data=0x80f3ea8) at mod_cgid.c:686
#3  0x0016ad46 in cgid_start (p=0x80cf0a8, main_server=0x80f3ea8,
    procnew=0x80eebe8) at mod_cgid.c:876
#4  0x0016b06c in cgid_init (p=0x80cf0a8, plog=0x812faa0, ptemp=0x8133ab0,
    main_server=0x80f3ea8) at mod_cgid.c:939
#5  0x0808ab6f in ap_run_post_config (pconf=0x80cf0a8, plog=0x812faa0,
    ptemp=0x8133ab0, s=0x80f3ea8) at config.c:105
#6  0x08068e41 in main (argc=2, argv=0xbffff3b4) at main.c:765
(gdb)
(gdb) info thread
* 1 Thread 0xb7fe79c0 (LWP 23904)  0x0095a424 in __kernel_vsyscall ()
(gdb)

Launching the CGI
(gdb) bt
#0  0x0039b514 in fork () from /lib/libpthread.so.0
#1  0x0028bef4 in apr_proc_create (new=0x81aff60,
    progname=0x81b00b0 "/usr/local/apache2/cgi-bin/mysleeper", args=0x81b0a20,
    env=0x81b00f8, attr=0x81b06b0, pool=0x81afd68)
    at threadproc/unix/proc.c:391
#2  0x0016ab3c in cgid_server (data=0x80f3ea8) at mod_cgid.c:817
#3  0x0016ad46 in cgid_start (p=0x80cf0a8, main_server=0x80f3ea8,
    procnew=0x80eebe8) at mod_cgid.c:876
#4  0x0016b06c in cgid_init (p=0x80cf0a8, plog=0x812faa0, ptemp=0x8133ab0,
    main_server=0x80f3ea8) at mod_cgid.c:939
#5  0x0808ab6f in ap_run_post_config (pconf=0x80cf0a8, plog=0x812faa0,
    ptemp=0x8133ab0, s=0x80f3ea8) at config.c:105
#6  0x08068e41 in main (argc=2, argv=0xbffff3b4) at main.c:765
(gdb)
httpd-2.4.2\modules\generators\mod_cgid.c
static apr_status_t handle_exec(include_ctx_t *ctx, ap_filter_t *f,
                                apr_bucket_brigade *bb)
static int cgid_handler(request_rec *r)
{
    int retval, nph, dbpos;
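3. Who calls waitpid on the child after the CGI process exits?
Suppose the CGI dies by accident, say it dumps core. How does the front end cope? "Undefined behavior" is not an answer. The signal fields from the parent's /proc/<pid>/status tell the story: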
SigQ:    0/8192
SigPnd:    0000000000000000
ShdPnd:    0000000000000000
SigBlk:    0000000000000000
SigIgn:    0000000001011000   <-- the parent simply ignores the child's exit here (the mask includes signal 17, SIGCHLD)
SigCgt:    0000000180000001
CapInh:    0000000000000000
CapPrm:    0000000000000000
CapEff:    0000000000000000
CapBnd:    ffffffffffffffff
Cpus_allowed:    1
Cpus_allowed_list:    0
Mems_allowed:    1
Mems_allowed_list:    0
voluntary_ctxt_switches:    199
nonvoluntary_ctxt_switches:    15
[root@Harry ~]# kill -l
 1) SIGHUP     2) SIGINT     3) SIGQUIT     4) SIGILL     5) SIGTRAP
 6) SIGABRT     7) SIGBUS     8) SIGFPE     9) SIGKILL    10) SIGUSR1
11) SIGSEGV    12) SIGUSR2    13) SIGPIPE    14) SIGALRM    15) SIGTERM
16) SIGSTKFLT    17) SIGCHLD    18) SIGCONT    19) SIGSTOP    20) SIGTSTP
21) SIGTTIN    22) SIGTTOU    23) SIGURG    24) SIGXCPU    25) SIGXFSZ
26) SIGVTALRM    27) SIGPROF    28) SIGWINCH    29) SIGIO    30) SIGPWR
31) SIGSYS    34) SIGRTMIN    35) SIGRTMIN+1    36) SIGRTMIN+2    37) SIGRTMIN+3
38) SIGRTMIN+4    39) SIGRTMIN+5    40) SIGRTMIN+6    41) SIGRTMIN+7    42) SIGRTMIN+8
43) SIGRTMIN+9    44) SIGRTMIN+10    45) SIGRTMIN+11    46) SIGRTMIN+12    47) SIGRTMIN+13
48) SIGRTMIN+14    49) SIGRTMIN+15    50) SIGRTMAX-14    51) SIGRTMAX-13    52) SIGRTMAX-12
53) SIGRTMAX-11    54) SIGRTMAX-10    55) SIGRTMAX-9    56) SIGRTMAX-8    57) SIGRTMAX-7
58) SIGRTMAX-6    59) SIGRTMAX-5    60) SIGRTMAX-4    61) SIGRTMAX-3    62) SIGRTMAX-2
63) SIGRTMAX-1    64) SIGRTMAX   
[root@Harry ~]#
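In other words, cgid simply ignores its children's exit, so the kernel reaps the process itself instead of waiting for user space to collect the child. I could of course dig out the relevant kernel code to prove it to you, in the spirit of the line from "Hotel California": "Then she lit up a candle and she showed me the way". But it is too ordinary to be worth excerpting.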
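4. What happens after the CGI unfortunately dumps core
As shown above, nobody cares whether the child lives or dies. The user does, though, because the user is waiting for the CGI's response, and so is the worker thread. How does the waiter get woken? That part is simple: when a process exits, the kernel closes every file it had open, and for a socket this means that anyone blocked in read on the other end is woken with an end-of-file. The devoted waiter therefore wakes up without needing the child's SIGCHLD.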
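Freshly woken, the process must be rather dazed, much as if you were sound asleep in the morning and someone put an iron bucket under your window and set off a string of 10,000 firecrackers inside it. The worker nonetheless goes on to check the CGI's response exactly as it would after a normal wake-up, just as you would still go to work after being jolted awake. The function that directly scans the CGI's output lives in httpd-2.4.2\server\util_script.c: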
AP_DECLARE(int) ap_scan_script_header_err_core_ex(request_rec *r, char *buffer,
                                       int (*getsfunc) (char *, int, void *),
                                       void *getsfunc_data,
                                       int module_index)
    while (1) {

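        int rv = (*getsfunc) (w, MAX_STRING_LEN - 1, getsfunc_data);
        /* The socket has been closed but the CGI produced no output, so the
         * scanned script output has length zero. Execution falls into the
         * rv == 0 branch below and returns HTTP_INTERNAL_SERVER_ERROR, which
         * httpd defines as:
         *   #define HTTP_INTERNAL_SERVER_ERROR         500
         */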

        if (rv == 0) {
            const char *msg = "Premature end of script headers";
            if (first_header)
                msg = "End of script output before headers";
            ap_log_rerror(SCRIPT_LOG_MARK, APLOG_ERR|APLOG_TOCLIENT, 0, r,
                          "%s: %s", msg,
                          apr_filepath_name_get(r->filename));
            return HTTP_INTERNAL_SERVER_ERROR;
        }
        else if (rv == -1) {
            ap_log_rerror(SCRIPT_LOG_MARK, APLOG_ERR|APLOG_TOCLIENT, 0, r,
                          "Script timed out before returning headers: %s",
                          apr_filepath_name_get(r->filename));
            return HTTP_GATEWAY_TIME_OUT;
        }
Apache's server configuration offers a way to customize what happens in this situation, but the directive is commented out by default. It reads:
    # Some examples:
    #ErrorDocument 500 "The server made a boo boo."
    #ErrorDocument 404 /missing.html
    #ErrorDocument 404 "/cgi-bin/missing_handler.pl"

Here is the default error page:

Internal Server Error

The server encountered an internal error or misconfiguration and was unable to complete your request.

Please contact the server administrator at you@example.com to inform them of the time this error occurred, and the actions you performed just before this error.

More information about this error may be available in the server error log.
And here is what the server error log records:
[Sat Jan 26 00:50:33.845682 2013] [cgid:error] [pid 24329:tid 2822667120] [client 192.168.203.1:4076] End of script output before headers: segv
[Sat Jan 26 00:51:08.433898 2013] [cgid:error] [pid 24329:tid 2843646832] [client 192.168.203.1:4091] End of script output before headers: segv
[Sat Jan 26 00:58:40.453014 2013] [cgid:error] [pid 24329:tid 2864626544] [client 192.168.203.1:4254] End of script output before headers: segv


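5. Apache's default error messages
httpd-2.4.2\modules\http\http_protocol.c
I am personally very sensitive to, and fascinated by, errors, so I have copied the whole function here. I hope you like it too. So, enjoy.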

/* construct and return the default error message for a given
 * HTTP defined error code
 */
static const char *get_canned_error_string(int status,
                                           request_rec *r,
                                           const char *location)
{
    apr_pool_t *p = r->pool;
    const char *error_notes, *h1, *s1;

    switch (status) {
    case HTTP_MOVED_PERMANENTLY:
    case HTTP_MOVED_TEMPORARILY:
    case HTTP_TEMPORARY_REDIRECT:
        return(apr_pstrcat(p,
                           "<p>The document has moved <a href=\"",
                           ap_escape_html(r->pool, location),
                           "\">here</a>.</p>\n",
                           NULL));
    case HTTP_SEE_OTHER:
        return(apr_pstrcat(p,
                           "<p>The answer to your request is located "
                           "<a href=\"",
                           ap_escape_html(r->pool, location),
                           "\">here</a>.</p>\n",
                           NULL));
    case HTTP_USE_PROXY:
        return(apr_pstrcat(p,
                           "<p>This resource is only accessible "
                           "through the proxy\n",
                           ap_escape_html(r->pool, location),
                           "<br />\nYou will need to configure "
                           "your client to use that proxy.</p>\n",
                           NULL));
    case HTTP_PROXY_AUTHENTICATION_REQUIRED:
    case HTTP_UNAUTHORIZED:
        return("<p>This server could not verify that you\n"
               "are authorized to access the document\n"
               "requested.  Either you supplied the wrong\n"
               "credentials (e.g., bad password), or your\n"
               "browser doesn't understand how to supply\n"
               "the credentials required.</p>\n");
    case HTTP_BAD_REQUEST:
        return(add_optional_notes(r,
                                  "<p>Your browser sent a request that "
                                  "this server could not understand.<br />\n",
                                  "error-notes",
                                  "</p>\n"));
    case HTTP_FORBIDDEN:
        return(apr_pstrcat(p,
                           "<p>You don't have permission to access ",
                           ap_escape_html(r->pool, r->uri),
                           "\non this server.</p>\n",
                           NULL));
    case HTTP_NOT_FOUND:
        return(apr_pstrcat(p,
                           "<p>The requested URL ",
                           ap_escape_html(r->pool, r->uri),
                           " was not found on this server.</p>\n",
                           NULL));
    case HTTP_METHOD_NOT_ALLOWED:
        return(apr_pstrcat(p,
                           "<p>The requested method ",
                           ap_escape_html(r->pool, r->method),
                           " is not allowed for the URL ",
                           ap_escape_html(r->pool, r->uri),
                           ".</p>\n",
                           NULL));
    case HTTP_NOT_ACCEPTABLE:
        s1 = apr_pstrcat(p,
                         "<p>An appropriate representation of the "
                         "requested resource ",
                         ap_escape_html(r->pool, r->uri),
                         " could not be found on this server.</p>\n",
                         NULL);
        return(add_optional_notes(r, s1, "variant-list", ""));
    case HTTP_MULTIPLE_CHOICES:
        return(add_optional_notes(r, "", "variant-list", ""));
    case HTTP_LENGTH_REQUIRED:
        s1 = apr_pstrcat(p,
                         "<p>A request of the requested method ",
                         ap_escape_html(r->pool, r->method),
                         " requires a valid Content-length.<br />\n",
                         NULL);
        return(add_optional_notes(r, s1, "error-notes", "</p>\n"));
    case HTTP_PRECONDITION_FAILED:
        return(apr_pstrcat(p,
                           "<p>The precondition on the request "
                           "for the URL ",
                           ap_escape_html(r->pool, r->uri),
                           " evaluated to false.</p>\n",
                           NULL));
    case HTTP_NOT_IMPLEMENTED:
        s1 = apr_pstrcat(p,
                         "<p>",
                         ap_escape_html(r->pool, r->method), " to ",
                         ap_escape_html(r->pool, r->uri),
                         " not supported.<br />\n",
                         NULL);
        return(add_optional_notes(r, s1, "error-notes", "</p>\n"));
    case HTTP_BAD_GATEWAY:
        s1 = "<p>The proxy server received an invalid" CRLF
            "response from an upstream server.<br />" CRLF;
        return(add_optional_notes(r, s1, "error-notes", "</p>\n"));
    case HTTP_VARIANT_ALSO_VARIES:
        return(apr_pstrcat(p,
                           "<p>A variant for the requested "
                           "resource\n<pre>\n",
                           ap_escape_html(r->pool, r->uri),
                           "\n</pre>\nis itself a negotiable resource. "
                           "This indicates a configuration error.</p>\n",
                           NULL));
    case HTTP_REQUEST_TIME_OUT:
        return("<p>Server timeout waiting for the HTTP request from the client.</p>\n");
    case HTTP_GONE:
        return(apr_pstrcat(p,
                           "<p>The requested resource<br />",
                           ap_escape_html(r->pool, r->uri),
                           "<br />\nis no longer available on this server "
                           "and there is no forwarding address.\n"
                           "Please remove all references to this "
                           "resource.</p>\n",
                           NULL));
    case HTTP_REQUEST_ENTITY_TOO_LARGE:
        return(apr_pstrcat(p,
                           "The requested resource<br />",
                           ap_escape_html(r->pool, r->uri), "<br />\n",
                           "does not allow request data with ",
                           ap_escape_html(r->pool, r->method),
                           " requests, or the amount of data provided in\n"
                           "the request exceeds the capacity limit.\n",
                           NULL));
    case HTTP_REQUEST_URI_TOO_LARGE:
        s1 = "<p>The requested URL's length exceeds the capacity\n"
             "limit for this server.<br />\n";
        return(add_optional_notes(r, s1, "error-notes", "</p>\n"));
    case HTTP_UNSUPPORTED_MEDIA_TYPE:
        return("<p>The supplied request data is not in a format\n"
               "acceptable for processing by this resource.</p>\n");
    case HTTP_RANGE_NOT_SATISFIABLE:
        return("<p>None of the range-specifier values in the Range\n"
               "request-header field overlap the current extent\n"
               "of the selected resource.</p>\n");
    case HTTP_EXPECTATION_FAILED:
        s1 = apr_table_get(r->headers_in, "Expect");
        if (s1)
            s1 = apr_pstrcat(p,
                     "<p>The expectation given in the Expect request-header\n"
                     "field could not be met by this server.\n"
                     "The client sent<pre>\n    Expect: ",
                     ap_escape_html(r->pool, s1), "\n</pre>\n",
                     NULL);
        else
            s1 = "<p>No expectation was seen, the Expect request-header \n"
                 "field was not presented by the client.\n";
        return add_optional_notes(r, s1, "error-notes", "</p>"
                   "<p>Only the 100-continue expectation is supported.</p>\n");
    case HTTP_UNPROCESSABLE_ENTITY:
        return("<p>The server understands the media type of the\n"
               "request entity, but was unable to process the\n"
               "contained instructions.</p>\n");
    case HTTP_LOCKED:
        return("<p>The requested resource is currently locked.\n"
               "The lock must be released or proper identification\n"
               "given before the method can be applied.</p>\n");
    case HTTP_FAILED_DEPENDENCY:
        return("<p>The method could not be performed on the resource\n"
               "because the requested action depended on another\n"
               "action and that other action failed.</p>\n");
    case HTTP_UPGRADE_REQUIRED:
        return("<p>The requested resource can only be retrieved\n"
               "using SSL.  The server is willing to upgrade the current\n"
               "connection to SSL, but your client doesn't support it.\n"
               "Either upgrade your client, or try requesting the page\n"
               "using https://\n");
    case HTTP_INSUFFICIENT_STORAGE:
        return("<p>The method could not be performed on the resource\n"
               "because the server is unable to store the\n"
               "representation needed to successfully complete the\n"
               "request.  There is insufficient free space left in\n"
               "your storage allocation.</p>\n");
    case HTTP_SERVICE_UNAVAILABLE:
        return("<p>The server is temporarily unable to service your\n"
               "request due to maintenance downtime or capacity\n"
               "problems. Please try again later.</p>\n");
    case HTTP_GATEWAY_TIME_OUT:
        return("<p>The gateway did not receive a timely response\n"
               "from the upstream server or application.</p>\n");
    case HTTP_NOT_EXTENDED:
        return("<p>A mandatory extension policy in the request is not\n"
               "accepted by the server for this resource.</p>\n");
    default:                    /* HTTP_INTERNAL_SERVER_ERROR */
        /*
         * This comparison to expose error-notes could be modified to
         * use a configuration directive and export based on that
         * directive.  For now "*" is used to designate an error-notes
         * that is totally safe for any user to see (ie lacks paths,
         * database passwords, etc.)
         */
        if (((error_notes = apr_table_get(r->notes,
                                          "error-notes")) != NULL)
            && (h1 = apr_table_get(r->notes, "verbose-error-to")) != NULL
            && (strcmp(h1, "*") == 0)) {
            return(apr_pstrcat(p, error_notes, "<p />\n", NULL));
        }
        else {
            return(apr_pstrcat(p,
                               "<p>The server encountered an internal "
                               "error or\n"
                               "misconfiguration and was unable to complete\n"
                               "your request.</p>\n"
                               "<p>Please contact the server "
                               "administrator at \n ",
                               ap_escape_html(r->pool,
                                              r->server->server_admin),
                               " to inform them of the time this "
                               "error occurred,\n"
                               " and the actions you performed just before "
                               "this error.</p>\n"
                               "<p>More information about this error "
                               "may be available\n"
                               "in the server error log.</p>\n",
                               NULL));
        }
        /*
         * It would be nice to give the user the information they need to
         * fix the problem directly since many users don't have access to
         * the error_log (think University sites) even though they can easily
         * get this error by misconfiguring an htaccess file.  However, the
         * e error notes tend to include the real file pathname in this case,
         * which some people consider to be a breach of privacy.  Until we
         * can figure out a way to remove the pathname, leave this commented.
         *
         * if ((error_notes = apr_table_get(r->notes,
         *                                  "error-notes")) != NULL) {
         *     return(apr_pstrcat(p, error_notes, "<p />\n", NULL);
         * }
         * else {
         *     return "";
         * }
         */
    }
}