Replacing errors with a default value when coercing a NumPy array

Problem description

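I am trying to coerce the values in a NumPy array to float. However, some values in the array may fail to coerce, and I want to replace those with a default value. I do want NumPy's speed, though, so I would rather not write a Python loop. What is the best way to get this behavior?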

For example:

import numpy as np
my_array = np.array(["1", "2", "3", "NA"])
new_array = magic_coerce(my_array, float, -1.0)  # magic_coerce is what I want to implement
print(new_array)  # should print [ 1.  2.  3. -1.]
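For reference, the intended behavior can be pinned down with a slow plain-Python sketch. It does loop at the Python level, so it is not the fast path asked for here, but it can serve as a baseline to check a faster implementation against. The name `coerce_with_default` is hypothetical, and the sketch assumes that "fails to coerce" means Python's `float()` raises an error.

```python
import numpy as np

def coerce_with_default(values, default):
    """Hypothetical helper: float(item) for each item, or `default` on failure."""
    def convert(item):
        try:
            return float(item)
        except (TypeError, ValueError):
            # Assumption: anything float() rejects gets the default value.
            return default
    # np.fromiter fills a float64 array directly from the generator.
    return np.fromiter((convert(v) for v in values),
                       dtype=np.float64, count=len(values))

my_array = np.array(["1", "2", "3", "NA"])
new_array = coerce_with_default(my_array, -1.0)
print(new_array)
```

Note that `float()` is more permissive than a strict digits-only check: it also accepts strings such as "1e5", "nan" and "inf", so a stricter validator would give different results on those inputs.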

I am trying to write my own ufunc in C, and this is what I have so far:


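/* Needs <regex.h> (POSIX) and, for atof below, <stdlib.h>. */
int is_float(const char* c)
{
    regex_t regex;
    int matched;
    if (regcomp(&regex,"^[+-]?([0-9]*[.])?[0-9]+$",REG_EXTENDED) != 0)
        return 0; /* pattern failed to compile: treat as "not a float" */
    matched = regexec(&regex,c,NULL,0) == 0;
    regfree(&regex); /* release the compiled pattern to avoid a leak per call */
    return matched;
}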

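/* Parse c as a number, or return default_value if it does not look like one. */
double to_float(const char *c,double default_value)
{
    double result = default_value;
    if (is_float(c))
    {
        result = atof(c); /* atof already returns a double */
    }
    return result;
}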


static PyMethodDef LogitMethods[] = {
        {NULL,NULL}
};

/* The loop definition must precede the PyMODINIT_FUNC. */

static void double_logitprod(char **args,npy_intp *dimensions,npy_intp *steps,void *data)
{
    npy_intp i;
    npy_intp n = dimensions[0];
    char *in1 = args[0],*in2 = args[1];
    char *out = args[2];
    npy_intp in1_step = steps[0];
    npy_intp in2_step = steps[1]; /* 0 when the default is a broadcast scalar */
    npy_intp out_step = steps[2];

    double tmp;

    for (i = 0; i < n; i++) {
        /*BEGIN main ufunc computation*/
        /* Treats the element as a C string; with NPY_OBJECT each element
           is actually a PyObject *, so this cast does not yield text. */
        char *tmp1 = (char *) in1;
        tmp = *((double *)in2);
        *((double *) out) = to_float(tmp1,tmp);
        /*END main ufunc computation*/

        in1 += in1_step;
        in2 += in2_step;
        out += out_step;
    }
}


/* This is a pointer to the above function. */
PyUFuncGenericFunction funcs[1] = {&double_logitprod};

/* These are the input and return dtypes of the ufunc. */

static char types[3] = {NPY_OBJECT,NPY_DOUBLE,NPY_DOUBLE};

But it does not seem to work properly. What is the type for Unicode in NumPy? NPY_UNICODE gave an error, so I switched it to NPY_OBJECT, but that does not seem to work either.
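Part of the difficulty may lie in how NumPy stores strings. The sketch below, which uses only standard NumPy attributes, shows that the example array has a fixed-width Unicode dtype (kind "U", NPY_UNICODE at the C level) holding 4 bytes per character, so its elements are not the 1-byte-per-character C strings that `regexec` and `atof` expect. Converting to a byte-string dtype (kind "S", NPY_STRING) gives 1 byte per character, although elements that fill the full width are not NUL-terminated. An object array instead holds references to Python `str` objects. Whether a ufunc inner loop can be registered for these flexible types is not something this sketch settles.

```python
import numpy as np

my_array = np.array(["1", "2", "3", "NA"])

# Fixed-width Unicode: kind "U", 4 bytes per character, 2 characters wide.
print(my_array.dtype.kind, my_array.dtype.itemsize)  # U 8

# Raw storage of the last element: 8 bytes for the two characters of "NA".
raw = my_array[3:4].tobytes()
print(len(raw))  # 8

# Byte-string dtype: kind "S", 1 byte per character (ASCII-only input assumed).
as_bytes = my_array.astype("S")
print(as_bytes.dtype.kind, as_bytes.dtype.itemsize)  # S 2

# Object dtype: each element is a reference to a Python str object.
as_objects = my_array.astype(object)
print(type(as_objects[0]))
```

If the C loop is meant to read `char *` data, working from the "S" view of the data is probably closer to what the code assumes than either the "U" or the object representation, provided the loop copies each element into a NUL-terminated buffer of known width first.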

Solution

No working solution has been posted for this question yet.


c numpy numpy-ufunc pandas python