问题描述
Hello数据表社区,
我正在尝试在S4类中公开data.table API,以便在PharmacoGx R软件包中使用。我已经阅读了data.table的源代码,似乎[
运算符是使用[.data.table
S3方法实现的。
这里是源代码为https://github.com/bhklab/PharmacoGx/blob/drug_combos/R/class-LongTable.R
的GitHub存储库的链接。我创建了自己的S3方法[.long.table
,并将S3类添加为“ LongTable” S4类的组成部分。
当前实现为:
'[.long.table' <- function(x,i,j) { eval(substitute(subset(x,j)) }
类似于[.data.table
中的操作,我使用subset.long.table
内部的替代项捕获i和j,以便稍后可以在正确的上下文中对它们进行评估,在这种情况下,它们位于数据内部。 rowData和colData插槽中的表格对象。
当我直接调用该方法时,一切正常:
`[.long.table`(longTable,cellLine1 == 'VCAP')
但是当我尝试使用[
作为运算符时,出现错误消息,找不到cellLine1
对象:
longTable[cellLine1 == 'VCAP']
经过反复试验,我确定此错误是在调用[.long.table
之前引发的。我认为这是在对[
通用方法的调用中,该方法试图在选择方法之前评估i
和j
参数。
data.table
的API有效,因此显然可以解决此问题。我希望有人能启发我,以便我可以为新的S4类使用此功能。
其他信息:
我的班级定义如下:
#' Define an S3 Class
#'
#' Allows use of S3 methods with new S4 class. This is required to overcome
#' limitations of the `[` S4 method.
#'
setOldClass('long.table')
#' LongTable class definition
#'
#' Define a private constructor method to be used to build a `LongTable` object.
#'
#' @param drugs [`data.table`]
#' @param cells [`data.table`]
#' @param assays [`list`]
#' @param metadata [`list`]
#'
#'
#' @return [`LongTable`] object containing the assay data from a
#'
#' @import data.table
#' @keywords internal
.LongTable <- setClass("LongTable",slots=list(rowData='data.table',colData='data.table',assays='list',metadata='list',.intern='environment'),contains='long.table')
#' LongTable constructor method
#'
#' @param rowData [`data.table`,`data.frame`,`matrix`] A table like object
#' coercible to a `data.table` containing the a unique `rowID` column which
#' is used to key assays,as well as additional row metadata to subset on.
#' @param rowIDs [`character`,`integer`] A vector specifying
#' the names or integer indexes of the row data identifier columns. These
#' columns will be pasted together to make up the row.names of the
#' `LongTable` object.
#' @param colData [`data.table`,`matrix`] A table like object
#' coercible to a `data.table` containing the a unique `colID` column which
#' is used to key assays,as well as additional column metadata to subset on.
#' @param colIDs [`character`,`integer`] A vector specifying
#' the names or integer indexes of the col data identifier columns. These
#' columns will be pasted together to make up the col.names of the
#' `LongTable` object.
#' @param assays A [`list`] containing one or more objects coercible to a
#' `data.table`,and keyed by rowID and colID corresponding to the rowID and
#' colID columns in colData and rowData.
#' @param metadata A [`list`] of metadata associated with the `LongTable`
#' object being constructed
#' @param keep.rownames [`logical` or `character`] Logical: whether rownames
#' should be added as a column if coercing to a `data.table`,default is FALSE.
#' If TRUE,rownames are added to the column 'rn'. Character: specify a custom
#' column name to store the rownames in.
#'
#' @return [`LongTable`] object
#'
#' @import data.table
#' @export
LongTable <- function(rowData,rowIDs,colData,colIDs,assays,metadata=list(),keep.rownames=FALSE) {
## TODO:: Handle missing parameters
if (!is(colData,'data.table')) {
colData <- data.table(colData,keep.rownames=keep.rownames)
}
if (!is(rowData,'data.table')) {
rowData <- data.table(rowData,keep.rownames=keep.rownames)
}
if (!all(vapply(assays,FUN=is.data.table,FUN.VALUE=logical(1)))) {
tryCatch({
assays <- lapply(assays,FUN=data.table,keep.rownames=keep.rownames)
},warning = function(w) {
warning(w)
},error = function(e,assays) {
message(e)
types <- lapply(assays,typeof)
stop(paste0('List items are types: ',paste0(types,collapse=','),'\nPlease ensure all items in the assays list are
coerced to data.tables!'))
})
}
# Initialize the .internals object to store private metadata for a LongTable
internals <- new.env()
## TODO:: Implement error handling
internals$rowIDs <-
if (is.numeric(rowIDs) && max(rowIDs) < ncol(rowData))
rowIDs
else
which(colnames(rowData) %in% rowIDs)
lockBinding('rowIDs',internals)
internals$colIDs <-
if (is.numeric(colIDs) && max(colIDs) < ncol(colData))
colIDs
else
which(colnames(colData) %in% colIDs)
lockBinding('colIDs',internals)
# Assemble the pseudo row and column names for the LongTable
.pasteColons <- function(...) paste(...,collapse=':')
rowData[,`:=`(.rownames=mapply(.pasteColons,transpose(.SD))),.SDcols=internals$rowIDs]
colData[,`:=`(.colnames=mapply(.pasteColons,.SDcols=internals$colIDs]
return(.LongTable(rowData=rowData,colData=colData,assays=assays,metadata=metadata,.intern=internals))
}
我有一个有效的subset
方法,使用S3和S4实现:
#' Subset method for a LongTable object.
#'
#' Allows use of the colData and rowData `data.table` objects to query based on
#' rowID and colID,which is then used to subset all value data.tables stored
#' in the dataList slot.
#'
#' This function is endomorphic,it always returns a LongTable object.
#'
#' @param x [`LongTable`] The object to subset.
#' @param rowQuery [`character`,`numeric`,`logical` or `expression`]
#' Character: pass in a character vector of drug names,which will subset the
#' object on all row id columns matching the vector.
#'
#' Numeric or Logical: these select based on the rowKey from the `rowData`
#' method for the `LongTable`.
#'
#' Expression: Accepts valid query statements to the `data.table` i parameter,#' this can be used to make complex queries using the `data.table` API
#' for the `rowData` data.table.
#'
#' @param columnQuery [`character`,which will subset the
#' object on all drug id columns matching the vector.
#'
#' Numeric or Logical: these select base don the rowID from the `rowData`
#' method for the `LongTable`.
#'
#' Expression: Accepts valid query statements to the `data.table` i parameter,#' this can be used to make complex queries using the `data.table` API
#' for the `rowData` data.table.
#'
#' @param values [`character`,`numeric` or `logical`] Optional list of value
#' names to subset. Can be used to subset the dataList column further,#' returning only the selected items in the new LongTable.
#'
#' @return [`LongTable`] A new `LongTable` object subset based on the specified
#' parameters.
#'
#' @importMethodsFrom BiocGenerics subset
#' @import data.table
#' @export
subset.long.table <- function(x,rowQuery,columnQuery,assays) {
longTable <- x
rm(x)
if (!missing(rowQuery)) {
if (tryCatch(is.character(rowQuery),error=function(e) FALSE)) {
select <- grep('^cellLine[:digit:]*',colnames(rowData(longTable)),value=TRUE)
rowQueryString <- paste0(paste0(select,' %in% ',.variableToCodeString(rowQuery)),collapse=' | ')
rowQuery <- str2lang(rowQueryString)
} else {
rowQuery <- substitute(rowQuery)
}
rowDataSubset <- rowData(longTable)[eval(rowQuery),]
} else {
rowDataSubset <- rowData(longTable)
}
if (!missing(columnQuery)) {
if (tryCatch(is.character(columnQuery),error=function(e) FALSE)) {
select <- grep('^drug[:digit:]*',colnames(colData(longTable)),value=TRUE)
columnQueryString <- paste0(paste0(select,.variableToCodeString(columnQuery)),collapse=' | ')
columnQuery <- str2lang(columnQueryString)
} else {
columnQuery <- substitute(columnQuery)
}
colDataSubset <- colData(longTable)[eval(columnQuery),]
} else {
colDataSubset <- colData(longTable)
}
rowKeys <- rowDataSubset$rowKey
colKeys <- colDataSubset$colKey
if (missing(assays)) { assays <- assayNames(longTable) }
keepAssays <- assayNames(longTable) %in% assays
assayData <- lapply(assays(longTable)[keepAssays],FUN=.filterLongDataTable,indexList=list(rowKeys,colKeys))
return(LongTable(colData=colDataSubset,[email protected]$colIDs,rowData=rowDataSubset,[email protected]$rowIDs,assays=assayData,metadata=metadata(longTable)))
}
#' S4 Method for subset.long.table
#'
#' @export
setMethod('subset','LongTable',subset.long.table)
我的sessionInfo:
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.1 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C LC_TIME=en_CA.UTF-8 LC_COLLATE=en_CA.UTF-8
[5] LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_CA.UTF-8 LC_PAPER=en_CA.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] crayon_1.3.4 SummarizedExperiment_1.19.6 DelayedArray_0.15.7 matrixStats_0.56.0
[5] Matrix_1.2-18 Biobase_2.49.0 GenomicRanges_1.41.6 GenomeInfoDb_1.25.10
[9] IRanges_2.23.10 S4Vectors_0.27.12 BiocGenerics_0.35.4 data.table_1.13.0
loaded via a namespace (and not attached):
[1] lattice_0.20-41 bitops_1.0-6 grid_4.0.2 zlibbioc_1.35.0 XVector_0.29.3
[6] tools_4.0.2 RCurl_1.98-1.2 compiler_4.0.2 GenomeInfoDbData_1.2.3
解决方法
暂无找到可以解决该程序问题的有效方法,小编努力寻找整理中!
如果你已经找到好的解决方法,欢迎将解决方案带上本链接一起发送给小编。
小编邮箱:dio#foxmail.com (将#修改为@)