Problem
I am having trouble with the predORF function.
sqa <- c("CAGGGCACCTGGCCTTGGGATGCGCCTCCTGCCCGCTGAGCCCAGGGGCCGCTATGGCCCTTCTGGCCATGCTGGCGCTGCAGACAGCTCTCTACCTAGTAGGCTTCTTCTACCCGCCGGGAGGCATATGGCGCTGGATCACCCGGGAC")
print(sqa)
First, I convert the sequence into the input formats:
dnastring = DNAString(sqa)
print(dnastring)
dna <- DNAStringSet(sqa)
print(dna)
When I use the DNAString format:
predORF(dnastring, n = 1, type = "grl", mode = "orf", strand = "sense", longest_disjoint = FALSE, startcodon = c("ATG"), stopcodon = c("TAA"))
I get this error:
Error in predORF(dnastring,:
Sequence name slot of x need be populated with unique names.
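When I subset it to an explicit length, I get:
predORF(dnastring[1:149], n = 1, type = "grl", mode = "orf", strand = "sense", longest_disjoint = FALSE, startcodon = c("ATG"), stopcodon = c("TAA"))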
Error in predORF(dnastring[1:149],:
Sequence name slot of x need be populated with unique names.
When I use the DNAStringSet format:
predORF(dna,stopcodon = c("TAA"))
I get this error:
Error in predORF(dna, n = 1,stopcodon = c("TAA"))
Error: subscript contains out-of-bounds indices
How can I fix this?
Thanks!
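Solution
The problem is that predORF requires named sequences as input, which is what the "Sequence name slot of x need be populated with unique names" message refers to. See below: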
library(Biostrings)
library(systemPipeR)
sqa <- c("ATGTAA")
sqb <- c("ATGGCCTAA")
dna <- DNAStringSet(c(seq_a = sqa, seq_b = sqb), use.names = TRUE)
predORF(dna, n = 1, type = "grl", mode = "orf", strand = "sense", longest_disjoint = FALSE, startcodon = c("ATG"), stopcodon = c("TAA"))
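The same fix can be applied to the sequence from the question. The following is a minimal sketch, not a verified run: the name "seq_a" is just an illustrative label, and the block only shows two ways to populate the names slot with Biostrings. Whether predORF then finds an ORF ending in "TAA" in this particular sequence is not checked here.

```r
library(Biostrings)

# Sequence copied from the question
sqa <- "CAGGGCACCTGGCCTTGGGATGCGCCTCCTGCCCGCTGAGCCCAGGGGCCGCTATGGCCCTTCTGGCCATGCTGGCGCTGCAGACAGCTCTCTACCTAGTAGGCTTCTTCTACCCGCCGGGAGGCATATGGCGCTGGATCACCCGGGAC"

# Option 1: name the element of the character vector before building the set
dna <- DNAStringSet(c(seq_a = sqa))

# Option 2: build the set first, then assign the name ("seq_a" is a made-up label)
dna2 <- DNAStringSet(sqa)
names(dna2) <- "seq_a"

# Confirm the names are present and unique before handing the set to predORF
stopifnot(!is.null(names(dna)), !anyDuplicated(names(dna)))
print(names(dna))
```

Either object can then be passed to predORF in place of the unnamed dna from the question, using the same arguments as in the call above.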