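R - How do I run a regression based on a correlation matrix rather than raw data?

Question

I want to run a regression based on a correlation matrix rather than raw data. I have looked at this post, but I could not make sense of it. How can I do this in R?

Here is some code: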

#Correlation matrix.
MyMatrix <- matrix(
            c(1.0, 0.1, 0.5, 0.4,
              0.1, 1.0, 0.9, 0.3,
              0.5, 0.9, 1.0, 0.3,
              0.4, 0.3, 0.3, 1.0), nrow=4, ncol=4)

df <- as.data.frame(MyMatrix)

colnames(df)[colnames(df)=="V1"] <- "a"
colnames(df)[colnames(df)=="V2"] <- "b"
colnames(df)[colnames(df)=="V3"] <- "c"
colnames(df)[colnames(df)=="V4"] <- "d"

#Assume means and standard deviations as follows:
MEAN.a <- 4.00
MEAN.b <- 3.90
MEAN.c <- 4.10
MEAN.d <- 5.00
SD.a <- 1.01
SD.b <- 0.95
SD.c <- 0.99
SD.d <- 2.20

#Run model [UNSURE ABOUT THIS PART]
library(lavaan)
m1 <- 'd ~ a + b + c'
fit <- sem(m1,????)
summary(fit,standardize=TRUE)

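Answer

This should work. First, convert the correlation matrix to a covariance matrix by scaling it on both sides with the standard deviations: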

MyMatrix <- matrix(
  c(1.0, 0.1, 0.5, 0.4,
    0.1, 1.0, 0.9, 0.3,
    0.5, 0.9, 1.0, 0.3,
    0.4, 0.3, 0.3, 1.0), nrow=4, ncol=4)
rownames(MyMatrix) <- colnames(MyMatrix) <- c("a","b","c","d")

#Assume means and standard deviations as follows:
MEAN.a <- 4.00
MEAN.b <- 3.90
MEAN.c <- 4.10
MEAN.d <- 5.00
SD.a <- 1.01
SD.b <- 0.95
SD.c <- 0.99
SD.d <- 2.20
s <- c(SD.a,SD.b,SD.c,SD.d)
m <- c(MEAN.a,MEAN.b,MEAN.c,MEAN.d)
cov.mat <- diag(s) %*% MyMatrix %*% diag(s)
rownames(cov.mat) <- colnames(cov.mat) <- rownames(MyMatrix)
names(m) <- rownames(MyMatrix)
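
As a quick sanity check on the conversion, here is a minimal sketch that rebuilds the covariance matrix and confirms that it maps back to the original correlations. The names cor.mat and cov.alt are just illustrative; cov2cor() is from base R's stats package.

```r
# Correlation matrix and standard deviations from the example
vars <- c("a", "b", "c", "d")
cor.mat <- matrix(c(1.0, 0.1, 0.5, 0.4,
                    0.1, 1.0, 0.9, 0.3,
                    0.5, 0.9, 1.0, 0.3,
                    0.4, 0.3, 0.3, 1.0),
                  nrow = 4, byrow = TRUE, dimnames = list(vars, vars))
s <- c(1.01, 0.95, 0.99, 2.20)

# A correlation matrix must be symmetric and positive definite
stopifnot(isSymmetric(cor.mat), all(eigen(cor.mat)$values > 0))

# Matrix form: D %*% R %*% D, with D the diagonal matrix of SDs
cov.mat <- diag(s) %*% cor.mat %*% diag(s)
dimnames(cov.mat) <- dimnames(cor.mat)

# Equivalent elementwise form: cov[i, j] = r[i, j] * s[i] * s[j]
cov.alt <- cor.mat * outer(s, s)

# Both forms agree, and converting back recovers the correlations
all.equal(cov.mat, cov.alt, check.attributes = FALSE)
all.equal(cov2cor(cov.mat), cor.mat, check.attributes = FALSE)
all.equal(unname(sqrt(diag(cov.mat))), s)
```

If the symmetry or eigenvalue check fails, the matrix is not a valid correlation matrix and any model fitted from it will not be meaningful.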

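Then you can use lavaan to estimate the model along the lines of the post you mentioned in your question. Note that you need to supply the number of observations, because the standard errors and test statistics depend on the sample size. I used 100 in the example, but you should change that if it does not match your data.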

library(lavaan)
m1 <- 'd ~ a + b + c'
fit <- sem(m1, sample.cov = cov.mat, sample.nobs = 100, sample.mean = m,
           meanstructure = TRUE)
summary(fit, standardized = TRUE)
# lavaan 0.6-6 ended normally after 44 iterations
# 
# Estimator                                         ML
# Optimization method                           NLMINB
# Number of free parameters                          5
# 
# Number of observations                           100
# 
# Model Test User Model:
#   
# Test statistic                                 0.000
# Degrees of freedom                                 0
# 
# Parameter Estimates:
#   
# Standard errors                             Standard
# Information                                 Expected
# Information saturated (h1) model          Structured
# 
# Regressions:
#                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
# d ~                                                                   
#   a                 6.317    0.095   66.531    0.000    6.317    2.900
#   b                12.737    0.201   63.509    0.000   12.737    5.500
#   c               -13.556    0.221  -61.307    0.000  -13.556   -6.100
# 
# Intercepts:
#                 Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
# .d               -14.363    0.282  -50.850    0.000  -14.363   -6.562
# 
# Variances:
#                 Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
# .d                 0.096    0.014    7.071    0.000    0.096    0.020
# 
#
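
If you only need the point estimates, they can be cross-checked without lavaan. The sketch below solves the normal equations directly from the covariance matrix and the means. The names x, b, b0, and res.var are illustrative, and the (n - 1) / n factor on the residual variance is my assumption about how lavaan rescales a user-supplied covariance matrix under ML, so treat that line as a check rather than a definition.

```r
# Same inputs as in the answer
vars <- c("a", "b", "c", "d")
cor.mat <- matrix(c(1.0, 0.1, 0.5, 0.4,
                    0.1, 1.0, 0.9, 0.3,
                    0.5, 0.9, 1.0, 0.3,
                    0.4, 0.3, 0.3, 1.0),
                  nrow = 4, byrow = TRUE, dimnames = list(vars, vars))
s <- c(1.01, 0.95, 0.99, 2.20)
m <- c(a = 4.00, b = 3.90, c = 4.10, d = 5.00)
cov.mat <- cor.mat * outer(s, s)

x <- c("a", "b", "c")   # predictors; d is the outcome

# Slopes: solve S_xx %*% b = s_xy
b <- solve(cov.mat[x, x], cov.mat[x, "d"])

# Intercept: mean of d minus the fitted value at the predictor means
b0 <- m[["d"]] - sum(b * m[x])

# Residual variance, rescaled by (n - 1) / n (assumed ML convention)
n <- 100
res.var <- (cov.mat["d", "d"] - sum(b * cov.mat[x, "d"])) * (n - 1) / n

round(b, 3)        # 6.317, 12.737, -13.556 for a, b, c
round(b0, 3)       # -14.363
round(res.var, 3)  # 0.096
```

These agree with the Regressions, Intercepts, and Variances rows in the lavaan output above. The standard errors are not reproduced here; they depend on the sample size you pass to sem().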