Using the summarise, across, and quantile functions together

Problem

I am trying to compute summary statistics for the mtcars dataset. Here is my code:

df <- as_tibble(mtcars)

df.sum2 <- df %>%
  select(mpg,cyl,vs,am,gear,carb) %>%
  mutate(across(where(is.factor),as.numeric)) %>%
  summarise(across(
    .cols = everything(),.fns = list(
      Min = min,Q25 = quantile (.,0.25),Median = median,Q75 = quantile (.,0.75),Max = max,Mean = mean,StdDev = sd,N = n()
    ),na.rm = T,.names = "{col}_{fn}"
  )
  )

But I get the following error:

Error: Problem with summarise() input ..1.
x Can't subset columns that don't exist.
x Locations 65, 66, 69, 71, 76, etc. don't exist.
i There are only 6 columns.
i Input ..1 is across(...).

If I take Q25 = quantile (.,0.25) and Q75 = quantile (.,0.75) out of the code above, it works fine. In fact, I can get the expected result with the following code:

df.sum <- df %>%
  select(mpg,carb) %>% # select variables to summarise
  summarise_each(funs(Min = min,N = n()))

But I want to use the across function together with the summarise function. I don't want to use the summarise_each function.

Solution

When a function needs additional arguments, you have to pass it as an anonymous function or with the formula syntax. Try:

library(dplyr)

# df is the tibble from the question: df <- as_tibble(mtcars)
df.sum2 <- df %>%
  select(mpg, cyl, vs, am, gear, carb) %>%
  mutate(across(where(is.factor), as.numeric)) %>%
  summarise(across(
    .cols = everything(),
    .fns = list(
      Min = min,
      Q25 = ~quantile(., 0.25),
      Median = median,
      Q75 = ~quantile(., 0.75),
      Max = max,
      Mean = mean,
      StdDev = sd,
      N = ~n()
    ),
    .names = "{col}_{fn}"
  ))
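
The same idea can be sketched with explicit anonymous functions, which are interchangeable with the formula syntax here. This is only an illustration: the tibble toy and its values are made up and are not part of the original question. The sketch also places na.rm = TRUE inside each function instead of passing it through across, since recent dplyr versions (1.1.0 and later) deprecate forwarding extra arguments that way.

```r
library(dplyr)

# Hypothetical toy data (not from the original post); column a has one NA
toy <- tibble(
  a = c(1, 2, 3, 4, NA),
  b = c(10, 20, 30, 40, 50)
)

toy.sum <- toy %>%
  summarise(across(
    .cols = everything(),
    .fns = list(
      # Anonymous function: x is the current column
      Q25 = function(x) quantile(x, 0.25, na.rm = TRUE),
      Median = function(x) median(x, na.rm = TRUE),
      # Formula syntax: .x is the current column
      Q75 = ~ quantile(.x, 0.75, na.rm = TRUE),
      # Count of non-missing values in the column
      N = ~ sum(!is.na(.x))
    ),
    .names = "{col}_{fn}"
  ))

print(toy.sum)
```

Each entry in the list is now a function that across calls once per column. A bare call such as quantile(., 0.25) is instead evaluated immediately, before across sees any column, which is most likely why the original code failed.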

