Adding the missing values of a specific column to a data frame in R

Question

I am working in R. I have obtained the missing values for a specific column of my dataset, and I need to add them to my main data.

My data looks like this:

A           B    C    D    G
Joseph      5    2.1  6.0  7.8
Juan        NA   3.0  3.5  3.8
Miguel      2    4.0  2.0  2.5
Steven      NA   6.0  5.0  0.2
Jennifer    NA   0.1  5.0  7.0
emma        8.0  8.1  8.3  8.5

Now I have the data for the missing values in column B:

A          B
Juan       3.0
Steven     2.5
Jennifer   4.4

I need to add them to my main data. I tried the coalesce function from the tidyverse, but I could not get the correct result.

Solutions

One option could be:

library(dplyr)

df %>%
  mutate(B = if_else(is.na(B), df2$B[match(A, df2$A)], B))

         A   B   C   D   G
1   Joseph 5.0 2.1 6.0 7.8
2     Juan 3.0 3.0 3.5 3.8
3   Miguel 2.0 4.0 2.0 2.5
4   Steven 2.5 6.0 5.0 0.2
5 Jennifer 4.4 0.1 5.0 7.0
6     Emma 8.0 8.1 8.3 8.5
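
The lookup inside if_else can be hard to read at first. As a minimal sketch, assuming df and df2 are the two tables from the question (rebuilt here with only columns A and B for brevity), match(df$A, df2$A) gives, for each row of df, the position of that name in df2, or NA when the name is absent. Indexing df2$B with those positions lines the replacement values up with the rows of df:

```r
# Rebuild the two tables from the question (only columns A and B, for brevity).
df <- data.frame(
  A = c("Joseph", "Juan", "Miguel", "Steven", "Jennifer", "Emma"),
  B = c(5, NA, 2, NA, NA, 8),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  A = c("Juan", "Steven", "Jennifer"),
  B = c(3.0, 2.5, 4.4),
  stringsAsFactors = FALSE
)

# Position of each df name inside df2 (NA where the name is absent).
pos <- match(df$A, df2$A)
pos
# [1] NA  1 NA  2  3 NA

# Replacement values aligned to the rows of df.
aligned <- df2$B[pos]
aligned
# [1]  NA 3.0  NA 2.5 4.4  NA

# Keep the existing B where present, otherwise take the aligned value.
filled <- ifelse(is.na(df$B), aligned, df$B)
filled
# [1] 5.0 3.0 2.0 2.5 4.4 8.0
```

Rows whose name does not appear in df2 simply keep their existing value, since the aligned value is only used where B is NA.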

Does this work:

df
# A tibble: 6 x 5
  A            B     C     D     G
  <chr>    <dbl> <dbl> <dbl> <dbl>
1 Joseph       5   2.1   6     7.8
2 Juan        NA   3     3.5   3.8
3 Miguel       2   4     2     2.5
4 Steven      NA   6     5     0.2
5 Jennifer    NA   0.1   5     7  
6 Emma         8   8.1   8.3   8.5
dd
# A tibble: 3 x 2
  A            B
  <chr>    <dbl>
1 Juan       3  
2 Steven     2.5
3 Jennifer   4.4
df$B[match(dd$A,df$A)] <- dd$B
df
# A tibble: 6 x 5
  A            B     C     D     G
  <chr>    <dbl> <dbl> <dbl> <dbl>
1 Joseph     5     2.1   6     7.8
2 Juan       3     3     3.5   3.8
3 Miguel     2     4     2     2.5
4 Steven     2.5   6     5     0.2
5 Jennifer   4.4   0.1   5     7  
6 Emma       8     8.1   8.3   8.5
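
Two caveats with the direct assignment may be worth guarding against. It overwrites B for every matched name, whether or not the existing value is NA, and it stops with an error if dd contains a name that does not occur in df, because match then returns NA and R does not allow NA subscripts in an assignment with more than one replacement value. A minimal sketch of a more defensive variant follows; the row for "Carlos" is hypothetical, added only to show the unmatched case, and plain data frames are used instead of tibbles:

```r
# Main data (only columns A and B, for brevity).
df <- data.frame(
  A = c("Joseph", "Juan", "Miguel", "Steven", "Jennifer", "Emma"),
  B = c(5, NA, 2, NA, NA, 8),
  stringsAsFactors = FALSE
)

# Replacement values; "Carlos" is a hypothetical name that is not in df.
dd <- data.frame(
  A = c("Juan", "Steven", "Jennifer", "Carlos"),
  B = c(3.0, 2.5, 4.4, 9.9),
  stringsAsFactors = FALSE
)

# Row of df that each dd name points to (NA for "Carlos").
idx <- match(dd$A, df$A)

# Use a replacement only if the name was found AND the current value is NA.
keep <- !is.na(idx) & is.na(df$B[idx])

df$B[idx[keep]] <- dd$B[keep]
df$B
# [1] 5.0 3.0 2.0 2.5 4.4 8.0
```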

You can join the two data frames and use coalesce to fill in the B values:

library(dplyr)

df1 %>%
  left_join(df2,by = 'A') %>%
  mutate(B = coalesce(B.x,B.y)) %>%
  select(names(df1))

#         A   B   C   D   G
#1   Joseph 5.0 2.1 6.0 7.8
#2     Juan 3.0 3.0 3.5 3.8
#3   Miguel 2.0 4.0 2.0 2.5
#4   Steven 2.5 6.0 5.0 0.2
#5 Jennifer 4.4 0.1 5.0 7.0
#6     Emma 8.0 8.1 8.3 8.5
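
The same join-then-fill idea can also be written without dplyr. The following is a sketch in base R, not necessarily what the answer's author had in mind; it assumes df1 and df2 hold the data from the question and relies on the default .x/.y suffixes that merge gives to clashing column names:

```r
# Data from the question.
df1 <- data.frame(
  A = c("Joseph", "Juan", "Miguel", "Steven", "Jennifer", "Emma"),
  B = c(5, NA, 2, NA, NA, 8),
  C = c(2.1, 3.0, 4.0, 6.0, 0.1, 8.1),
  D = c(6.0, 3.5, 2.0, 5.0, 5.0, 8.3),
  G = c(7.8, 3.8, 2.5, 0.2, 7.0, 8.5),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  A = c("Juan", "Steven", "Jennifer"),
  B = c(3.0, 2.5, 4.4),
  stringsAsFactors = FALSE
)

# Left join: keep every row of df1; the two B columns become B.x and B.y.
merged <- merge(df1, df2, by = "A", all.x = TRUE)

# Take B.x where it is present, otherwise fall back to B.y.
merged$B <- ifelse(is.na(merged$B.x), merged$B.y, merged$B.x)

# merge() sorts by the key, so restore the original row and column order.
result <- merged[match(df1$A, merged$A), names(df1)]
rownames(result) <- NULL
result$B
# [1] 5.0 3.0 2.0 2.5 4.4 8.0
```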

You can join the data and then fill the NA values in column B from the joined column.


# your original data, with missing values in column B
data

# the data that holds the values to fill into column B
additional_data

library(dplyr)
merged_data <- left_join(data,additional_data,by = "A",suffix = c("","_additional"))

merged_data %>%
  mutate(B = if_else(is.na(B), B_additional, B)) %>%
  select(-B_additional)
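
If a reasonably recent dplyr is available (rows_patch was introduced in version 1.0.0), the join and the if_else can probably be collapsed into a single call. This is a sketch under that assumption, with data and additional_data rebuilt from the question using only columns A and B. rows_patch only overwrites values that are NA in the first table and, by default, raises an error if additional_data contains a key that is missing from data:

```r
library(dplyr)

# Main data with missing values in column B (other columns omitted for brevity).
data <- data.frame(
  A = c("Joseph", "Juan", "Miguel", "Steven", "Jennifer", "Emma"),
  B = c(5, NA, 2, NA, NA, 8),
  stringsAsFactors = FALSE
)

# Values to fill into column B.
additional_data <- data.frame(
  A = c("Juan", "Steven", "Jennifer"),
  B = c(3.0, 2.5, 4.4),
  stringsAsFactors = FALSE
)

# Patch only the NA cells of `data`, matching rows on column A.
patched <- rows_patch(data, additional_data, by = "A")
patched$B
# [1] 5.0 3.0 2.0 2.5 4.4 8.0
```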
