Strange behavior of clone

This is a fairly simple application that creates a lightweight process (thread) with a clone() call.

#define _GNU_SOURCE

#include


The results are as follows:
Run 1:

➜  LFD401 ./clone
I am main,pid 10974
I am calling clone
I created child with pid: 10975
Main done,pid 10974
I am func,pid 10975

Run 2:

➜  LFD401 ./clone
I am main,pid 10995
I am calling clone
I created child with pid: 10996
I created child with pid: 10996
I am func,pid 10996
Main done,pid 10995

Run 3:

➜  LFD401 ./clone
I am main,pid 11037
I am calling clone
I created child with pid: 11038
I created child with pid: 11038
I am func,pid 11038
I created child with pid: 11038
I am func,pid 11038
Main done,pid 11037

Run 4:

➜  LFD401 ./clone
I am main,pid 11062
I am calling clone
I created child with pid: 11063
Main done,pid 11062
Main done,pid 11062
I am func,pid 11063

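What is happening here? Why is the "I created child" message sometimes printed several times?
I also noticed that adding a delay after the clone call "fixes" the problem.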


Best answer
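You have a race condition: you do not get the thread safety that stdio normally implies.
The problem is worse than it looks: you can also get duplicate "func" messages.
The issue is that clone does not come with the same guarantees as pthread_create, i.e. you do not get the thread-safe variant of printf.
I'm not sure, but IMO the wording about stdio streams and thread safety applies, in practice, only when using pthreads.
So you have to handle your own thread locking.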
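Handling your own locking could look roughly like the sketch below. It is a minimal illustration, not the original program: the `stdio_lock` flag, the `lock_stdio`/`unlock_stdio` helpers, the `run_locked_clone_demo` entry point, the 1 MiB stack and the `CLONE_VM | SIGCHLD` flags are all assumptions, since the question's source is not shown here. Every stdio call is wrapped in a spinlock that parent and child share through `CLONE_VM`, and every message is flushed before the lock is released, so the shared buffer is empty whenever the other side touches it.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)   /* assumed child stack size */

/* Hypothetical lock; visible to both sides because CLONE_VM shares memory. */
static atomic_flag stdio_lock = ATOMIC_FLAG_INIT;

static void lock_stdio(void)
{
    /* Spin until the flag was previously clear, yielding while we wait. */
    while (atomic_flag_test_and_set_explicit(&stdio_lock, memory_order_acquire))
        sched_yield();
}

static void unlock_stdio(void)
{
    atomic_flag_clear_explicit(&stdio_lock, memory_order_release);
}

/* Print one message and flush it while holding the lock. */
static void locked_say(const char *what, pid_t pid)
{
    lock_stdio();
    printf("%s %d\n", what, (int)pid);
    fflush(stdout);
    unlock_stdio();
}

static int func(void *arg)
{
    (void)arg;
    locked_say("I am func,pid", getpid());
    return 0;   /* becomes the child's exit status */
}

/* Returns 0 if the child was created and exited cleanly, -1 otherwise. */
int run_locked_clone_demo(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    locked_say("I am main,pid", getpid());

    /* The stack grows down on common platforms, so pass its top. */
    pid_t pid = clone(func, stack + STACK_SIZE, CLONE_VM | SIGCHLD, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    locked_say("I created child with pid:", pid);

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);

    locked_say("Main done,pid", getpid());
    return (waited == pid && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling `run_locked_clone_demo()` from `main` and returning its result is enough to try it. The sketch only serializes the calls it makes itself; it is not a claim that stdio is generally safe to use from a raw clone child.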
Here is a version of the program recoded to use pthread_create. It seems to run without incident:

#define _GNU_SOURCE

#include 

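Here is the test script I used to check for errors [it's a bit rough, but it should be okay]. Run against your version, it will abort quickly. The pthread_create version seems to pass just fine.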

#!/usr/bin/perl
# clonetest -- clone test
#
# arguments:
#   "-p0" -- suppress check for duplicate parent messages
#   "-c0" -- suppress check for duplicate child messages
#   1 -- base name for program to test (e.g. for xyz.c,use xyz)
#   2 -- [optional] number of test iterations (DEFAULT: 100000)

master(@ARGV);
exit(0);

# master -- master control
sub master
{
    my(@argv) = @_;
    my($arg,$sym);

    while (1) {
        $arg = $argv[0];
        last unless (defined($arg));

        last unless ($arg =~ s/^-(.)//);
        $sym = $1;

        shift(@argv);

        $arg = 1
            if ($arg eq "");

        $arg += 0;
        ${"opt_$sym"} = $arg;
    }

    $opt_p //= 1;
    $opt_c //= 1;
    printf("clonetest: p=%d c=%d\n",$opt_p,$opt_c);

    $xfile = shift(@argv);
    $xfile //= "clone1";
    printf("clonetest: xfile='%s'\n",$xfile);

    $itermax = shift(@argv);
    $itermax //= 100000;
    $itermax += 0;
    printf("clonetest: itermax=%d\n",$itermax);

    system("cc -o $xfile -O2 $xfile.c -lpthread");
    $code = $? >> 8;
    die("master: compile error\n")
        if ($code);

    $logf = "/tmp/log";

    for ($iter = 1;  $iter <= $itermax;  ++$iter) {
        printf("iter: %d\n",$iter)
            if ($opt_v);
        dotest($iter);
    }
}

# dotest -- perform single test
sub dotest
{
    my($iter) = @_;
    my($parcnt,$cldcnt);
    my($xfsrc,$bf);

    system("./$xfile > $logf");

    open($xfsrc,"<$logf") or
        die("dotest: unable to open '$logf' -- $!\n");

    while ($bf = <$xfsrc>) {
        chomp($bf);

        if ($opt_p) {
            while ($bf =~ /created/g) {
                ++$parcnt;
            }
        }

        if ($opt_c) {
            while ($bf =~ /func/g) {
                ++$cldcnt;
            }
        }
    }

    close($xfsrc);

    if (($parcnt > 1) or ($cldcnt > 1)) {
        printf("dotest: fail on %d -- parcnt=%d cldcnt=%d\n",$iter,$parcnt,$cldcnt);
        system("cat $logf");
        exit(1);
    }
}

Update:


Were you able to recreate OP's problem with clone?

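Absolutely. Before I created the pthreads version, in addition to testing OP's original version, I also created versions that:
(1) added setlinebuf to the start of main
(2) added fflush just before the clone and __fpurge as the first statement of func
(3) added an fflush in func before the return 0
Version (2) eliminated the duplicate parent messages, but the duplicate child messages remained.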
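Since flushing and purging did not remove every duplicate, one alternative is to keep the messages out of the shared stdio buffer altogether. The sketch below is not one of the versions tested above; it swaps in a different technique, formatting each message into a private stack buffer and emitting it with a single write(2) call. The `say` helper, the `run_unbuffered_clone_demo` entry point, the 64-byte buffer, the 1 MiB stack and the `CLONE_VM | SIGCHLD` flags are assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)   /* assumed child stack size */

/*
 * Hypothetical helper: format into a buffer private to the caller and
 * hand it to the kernel in one write(2), bypassing the stdout FILE buffer.
 * Returns 0 on success, -1 if the message does not fit or the write fails.
 */
static int say(const char *what, pid_t pid)
{
    char buf[64];
    int len = snprintf(buf, sizeof buf, "%s %d\n", what, (int)pid);

    if (len < 0 || (size_t)len >= sizeof buf)
        return -1;
    return write(STDOUT_FILENO, buf, (size_t)len) == (ssize_t)len ? 0 : -1;
}

static int func(void *arg)
{
    (void)arg;
    return say("I am func,pid", getpid()) == 0 ? 0 : 1;
}

/* Returns 0 if the child was created and exited cleanly, -1 otherwise. */
int run_unbuffered_clone_demo(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    say("I am main,pid", getpid());

    /* The stack grows down on common platforms, so pass its top. */
    pid_t pid = clone(func, stack + STACK_SIZE, CLONE_VM | SIGCHLD, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    say("I created child with pid:", pid);

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);

    say("Main done,pid", getpid());
    return (waited == pid && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The order of the parent and child lines can still vary from run to run, because nothing here synchronizes the two sides; the point is only that no message passes through a user-space buffer both sides can modify.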
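If you'd like to see this for yourself, download OP's version from the question, my version, and the test script. Then run the test script on OP's version.
I posted enough information and files so that anyone can recreate the problem.
Note that, due to differences between my system and OP's, I couldn't reproduce the problem in just 3-4 tries. So that's why I created the script.
The script does 100,000 test runs, and the problem usually shows up within 5000-15000.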
