From a cross-correlation matrix to column (long) format in R

Question

I have the following cross-correlation matrix:

df =

   A B C
A  1 7 1
B  7 1 9
C  1 9 1

I would like to convert it to the following format:

A B 7
A C 1
B C 9

Is there any simple R code that can do this?

Solution

Not the simplest approach, but here is one option in base R:

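Data (assuming the matrix from the question is stored as mat)

mat <- matrix(c(1, 7, 1, 7, 1, 9, 1, 9, 1), nrow = 3,
              dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

With that in place:

mat[upper.tri(mat, diag = TRUE)] <- NA     # blank the diagonal and upper triangle
tmp <- which(!is.na(mat), arr.ind = TRUE)  # (row, col) positions of the remaining cells
data.frame(col = colnames(mat)[tmp[, 2]], row = rownames(tmp), val = mat[tmp])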

#  col row val
#1   A   B   7
#2   A   C   1
#3   B   C   9
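
To make the masking step easier to follow, here is a minimal sketch that prints the intermediate index matrix. The name `m` and the way the matrix is rebuilt are assumptions for illustration only; the values are copied from the question.

```r
# Assumption: the question's matrix, rebuilt here under the illustrative name `m`
m <- matrix(c(1, 7, 1, 7, 1, 9, 1, 9, 1), nrow = 3,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Logical mask that is TRUE on the diagonal and above it
mask <- upper.tri(m, diag = TRUE)

# Blank out the masked cells; only the strictly lower triangle keeps its values
m[mask] <- NA

# With arr.ind = TRUE, which() returns one (row, col) index pair per remaining cell
idx <- which(!is.na(m), arr.ind = TRUE)
print(idx)
```

Because the matrix is symmetric, keeping only the strictly lower triangle lists each unordered pair exactly once and drops the self-correlations on the diagonal.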

Another base R option

inds <- which(col(mat) < row(mat),arr.ind = TRUE)
data.frame(
  col = colnames(mat)[inds[,"col"]],row = rownames(mat)[inds[,"row"]],val = mat[inds]
)

which gives

  col row val
1   A   B   7
2   A   C   1
3   B   C   9
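
If the conversion is needed more than once, the same idea can be wrapped in a small function. The sketch below is not from the original answers: `pairs_long` is a hypothetical helper name, and it assumes a square matrix with row and column names. It uses `lower.tri()` rather than comparing `col()` and `row()` directly, which selects the same cells.

```r
# Hypothetical helper (illustrative only): list each unordered pair once
pairs_long <- function(m) {
  stopifnot(is.matrix(m), nrow(m) == ncol(m))
  keep <- lower.tri(m)  # TRUE strictly below the diagonal
  data.frame(
    col = colnames(m)[col(m)[keep]],  # column label of each kept cell
    row = rownames(m)[row(m)[keep]],  # row label of each kept cell
    val = m[keep],
    stringsAsFactors = FALSE
  )
}

# Assumption: the question's matrix, rebuilt for a self-contained example
mat <- matrix(c(1, 7, 1, 7, 1, 9, 1, 9, 1), nrow = 3,
              dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
res <- pairs_long(mat)
print(res)
```

Unlike the first answer, this version leaves the input matrix unmodified, since the mask is used only for indexing.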