CentOS 8: real-time FIFO process does not receive the SIGXCPU signal

Problem description

Background

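I have a very timing-sensitive program in mind, so I need a real-time scheduling policy. What the program is supposed to do is beside the point. Before writing the actual program, I wanted to test a few things first: some of the differences between the real-time and "normal" scheduling policies (for example, I noticed that nanosleep is about 10 times more precise under real-time FIFO scheduling with priority 99), and of course the SIGXCPU signal, which should be sent to a real-time process once it exceeds the soft limit on the CPU time it may consume in one go. I am not receiving it, even though my process sits in an infinite loop consuming CPU time.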

Environment

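I am using CentOS 8 hosted on Vultr: 1 core, 512 MB RAM, with the latest kernel and every package up to date. Without my real-time process running, top shows 2-3 running processes (systemd being the main one) and 80+ sleeping. All active processes appear to use the normal scheduling policy with priority 20, which is the lowest.
The code of my real-time process is as follows: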

#define _GNU_SOURCE

// most of these are useless yes,were used before for testing
// and I just did not care to remove them,but that shouldn't change anything,right

#include <sched.h>
#include <unistd.h>
#include <sys/types.h>

#include <stdio.h>
#include <stdatomic.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <errno.h>
#include <time.h> // clock_gettime, struct timespec

static uint64_t GetTimeoutTime(const uint64_t nanoseconds) {
  struct timespec tp = { .tv_sec = 0,.tv_nsec = 0 };
  (void) clock_gettime(CLOCK_MONOTONIC,&tp);
  // current monotonic time in nanoseconds, plus the requested offset
  return (uint64_t)(tp.tv_sec) * 1000000000 + (uint64_t)(tp.tv_nsec) + nanoseconds;
}

static int siga(int signum,void (*handler)(int)) {
  struct sigaction sa;
  memset(&sa,0,sizeof sa);
  sa.sa_handler = handler;
  (void) sigemptyset(&sa.sa_mask); // sigset_t is not a scalar, so do not assign 0 to it
  return sigaction(signum,&sa,NULL);
}

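static void xcpu(int sig) {
  static const char msg[] = "xcpu called\n";
  (void) sig;
  (void) write(STDOUT_FILENO,msg,sizeof msg - 1); // write is async-signal-safe,puts is not
  (void) sched_yield(); // to not get killed (replaces the deprecated pthread_yield)
}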

int main() {
  uint64_t g = 0;
  uint64_t t1,t2;
  int err = siga(SIGXCPU,xcpu);
  if(err != 0) {
    puts("e");
    printf("%d\n",err);
  }
  if(sched_setscheduler(getpid(),SCHED_FIFO,&((struct sched_param){ .sched_priority = 99 })) != 0) {
    puts("err");
  }
  while(1) {
    t1 = GetTimeoutTime(0); // just some stuff to make the process busy
    t2 = GetTimeoutTime(0) - t1; // I was using this code before,thus left it there
    g += t2;
  }
  printf("avg %lf\n",(double)(g) / 10000.0); // just to make it seem as g is not useless
  return 0;
}

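The result: the process constantly uses over 90% of the CPU, stays running, and never receives any signal at all. I actually left the program running for about 15 minutes and nothing happened; the process was not killed. FIFO scheduling is not supposed to preempt threads while they are running, right? That is Round Robin's job, so I do not really see what could cause this. Is my thread being put to sleep without my knowledge?
Would DEADLINE scheduling with the deadline set to 2^63 - (1, 2, 3) be any better than the current FIFO solution? I just want most of the CPU for myself most of the time, since nothing besides my own process will be using it. The only difference is that a real-time scheduling policy brings some benefits, one of which I noticed and described at the beginning: the improved precision of nanosleep. Are there any other benefits?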
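The nanosleep precision difference mentioned above can be checked with something like the following. This is only a minimal sketch, not the code the measurement in the question was made with: the function name average_nanosleep_overshoot_ns and its parameters are made up for illustration, and it simply averages how much longer each sleep took than requested.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>

/* Hypothetical helper (not from the original post): sleeps `iterations`
 * times for `request_ns` nanoseconds and returns the average overshoot in
 * nanoseconds, or -1.0 if a call fails or a sleep is interrupted.
 * Assumes 0 <= request_ns < 1000000000 and iterations > 0. */
double average_nanosleep_overshoot_ns(long request_ns, int iterations) {
  const struct timespec req = { .tv_sec = 0, .tv_nsec = request_ns };
  int64_t total = 0;
  for (int i = 0; i < iterations; i++) {
    struct timespec start, end;
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0) return -1.0;
    if (nanosleep(&req, NULL) != 0) return -1.0;
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0) return -1.0;
    /* elapsed wall time of this sleep, in nanoseconds */
    int64_t elapsed = (int64_t)(end.tv_sec - start.tv_sec) * 1000000000
                    + (int64_t)(end.tv_nsec - start.tv_nsec);
    total += elapsed - (int64_t)request_ns;
  }
  return (double)total / (double)iterations;
}
```

Calling it once under the default policy and once after sched_setscheduler with SCHED_FIFO should give two averages to compare; the actual numbers depend on the machine and kernel.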

Solution

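OK, I found the answer.
The problem was the soft and hard limits of RLIMIT_RTTIME. I had assumed they were low by default, but I checked again to be sure: both the soft and the hard limit were 2^63. After lowering the soft limit to 1e6 and the hard limit to 1e7, my process started receiving the SIGXCPU signal every second (the limit is expressed in microseconds, so 1e6 is one second). I checked and lowered the limits with the getrlimit and setrlimit functions (see their man pages for more information).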
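For reference, checking and lowering the limit could look roughly like this. It is a minimal sketch rather than the exact code I used: the helper names print_rttime_limit and set_rttime_limit are made up for illustration, while getrlimit, setrlimit and RLIMIT_RTTIME are the real interfaces.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: prints the current RLIMIT_RTTIME soft and hard
 * limits (in microseconds). Returns 0 on success, -1 on failure. */
int print_rttime_limit(void) {
  struct rlimit rl;
  if (getrlimit(RLIMIT_RTTIME, &rl) != 0) {
    perror("getrlimit");
    return -1;
  }
  printf("soft: %llu us, hard: %llu us\n",
         (unsigned long long)rl.rlim_cur,
         (unsigned long long)rl.rlim_max);
  return 0;
}

/* Hypothetical helper: sets the RLIMIT_RTTIME soft and hard limits, both
 * given in microseconds. Returns 0 on success, -1 on failure. */
int set_rttime_limit(rlim_t soft_us, rlim_t hard_us) {
  const struct rlimit rl = { .rlim_cur = soft_us, .rlim_max = hard_us };
  if (setrlimit(RLIMIT_RTTIME, &rl) != 0) {
    perror("setrlimit");
    return -1;
  }
  return 0;
}
```

With these, calling set_rttime_limit(1000000, 10000000) at the start of main, before the busy loop, corresponds to the 1e6 / 1e7 values above. Lowering a limit does not need extra privileges, but raising the hard limit again afterwards does.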

c centos8 real-time