Problem description
After uploading a .docx file, I want to read it line by line.
My file.docx is divided into chapters and paragraphs.
Structure of file.docx:
Chapter 1 - Events
alert or disservices
significant activities
Chapter 2 – Safety
near miss
security checks
Chapter 3 – Training
environment
upkeep
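I tried using Microsoft.Office.Interop.Word to read the document, and so far I can read the whole document.
Now, for each chapter, I have to insert its sections and the content of each paragraph into the corresponding database table.
For example: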
Chapter 1 - Events
- alert or disservices
Lorem ipsum dolor sit amet, consectetur adipiscing elit ….
…. ….
…. ….
- significant activities
Phasellus dui nunc, rutrum vitae dictum eleifend, ullamcorper hendrerit sem ….
…. ….
…. ….
must be inserted into the table events:
-- ----------------------------
-- Table structure for events
-- ----------------------------
DROP TABLE IF EXISTS `events`;
CREATE TABLE `events` (
  `sID` int(11) NOT NULL AUTO_INCREMENT,
  `alert_or_disservices` longtext,
  `significant_activities` longtext,
  PRIMARY KEY (`sID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
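Because the columns of `events` are named after the subheadings, each subheading read from the document has to be turned into a column name before it can be used. The following is only a minimal sketch of that step, not a definitive implementation. The class `ColumnNames`, its two methods, and the whitelist are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ColumnNames
{
    // Hypothetical whitelist: the two text columns of the `events` table shown above.
    private static readonly HashSet<string> EventsColumns =
        new HashSet<string> { "alert_or_disservices", "significant_activities" };

    // Turns a subheading such as "- alert or disservices" into "alert_or_disservices".
    public static string ToColumnName(string subheading)
    {
        // Drop surrounding whitespace and the leading hyphen, then lower-case.
        string cleaned = subheading.Trim().TrimStart('-').Trim().ToLowerInvariant();
        // Collapse every run of non-alphanumeric characters into one underscore.
        return Regex.Replace(cleaned, "[^a-z0-9]+", "_").Trim('_');
    }

    // True only for names that exist in the whitelist above.
    public static bool IsEventsColumn(string columnName)
    {
        return EventsColumns.Contains(columnName);
    }
}
```

Column names cannot be passed as SQL parameters, so checking the generated name against a fixed list before building the statement keeps arbitrary document text out of the SQL.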
Can you please help me?
Thanks in advance for any help or suggestion.
My code is below:
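// Requires at the top of the file:
// using System.Reflection;
// using Word = Microsoft.Office.Interop.Word; (the alias keeps Word types distinct from Page.Application)
protected void Page_Load(object sender, EventArgs e)
{
    if (!IsPostBack)
    {
        Word.Application word = new Word.Application();
        object miss = Missing.Value;
        object path = @"C:\file.docx"; // verbatim string: a single backslash is enough
        object readOnly = true;
        Word.Document docs = word.Documents.Open(ref path, ref miss, ref readOnly, ref miss);
        string totaltext = ""; // the whole document
        // Word collections are 1-based
        for (int i = 1; i <= docs.Paragraphs.Count; i++)
        {
            totaltext += docs.Paragraphs[i].Range.Text + "<br />";
        }
        Response.Write(totaltext);
        docs.Close();
        word.Quit();
    }
}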
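Update #1
- The chapters are recognizable by their headings
- "alert or disservices" is preceded only by a text hyphen
- Each new paragraph starts with a text hyphen
- There are no hard returns/paragraph marks inside the alert block
- I created one table per chapter, with columns named after the paragraph titles, but a better solution is welcome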
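The rules listed above can be sketched as a small parser that works on the paragraph texts once they have been read from the document. This is only an illustration under stated assumptions: chapter headings are detected by the text prefix "Chapter " rather than by their heading style, and any paragraph starting with a hyphen is treated as a subheading, so body text that begins with a hyphen would be misread. `Section` and `ChapterParser` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record: one subheading and the text found below it.
public sealed class Section
{
    public string Chapter;     // full heading text, e.g. "Chapter 1 - Events"
    public string Subheading;  // e.g. "alert or disservices"
    public string Contents;    // paragraphs below the subheading, joined by '\n'
}

public static class ChapterParser
{
    public static List<Section> Parse(IEnumerable<string> paragraphs)
    {
        var sections = new List<Section>();
        string chapter = null;
        Section current = null;

        foreach (string raw in paragraphs)
        {
            // Word ends each paragraph with '\r'; Trim removes it and other padding.
            string text = raw.Trim();
            if (text.Length == 0) continue;

            if (text.StartsWith("Chapter ", StringComparison.OrdinalIgnoreCase))
            {
                // Assumption: a chapter heading always starts with "Chapter ".
                chapter = text;
                current = null;
            }
            else if (chapter != null && text.StartsWith("-", StringComparison.Ordinal))
            {
                // A hyphen opens a new subheading inside the current chapter.
                current = new Section
                {
                    Chapter = chapter,
                    Subheading = text.TrimStart('-').Trim(),
                    Contents = ""
                };
                sections.Add(current);
            }
            else if (current != null)
            {
                // Anything else is body text of the current subheading.
                current.Contents = current.Contents.Length == 0
                    ? text
                    : current.Contents + "\n" + text;
            }
        }
        return sections;
    }
}
```

Each returned `Section` then corresponds to one value to store: the chapter selects the table (or a `chapter` column), and the subheading selects the column (or a `subheading` column).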
I would like to share the .docx file for download, but I don't know how.
I tried WeTransfer, but it was not approved because it is an untrusted source.
Update #2
// Requires at the top of the file:
// using System.Configuration;
// using MySql.Data.MySqlClient;
protected void Page_Load(object sender, EventArgs e)
{
    if (!IsPostBack)
    {
        var wdApp = new Microsoft.Office.Interop.Word.Application();
        var doc = wdApp.Documents.Open(@"C:\file.docx"); // verbatim string: a single backslash is enough
        var ran = doc.Content;
        var fin = ran.Find;
        fin.ClearFormatting();
        fin.MatchWildcards = false;
        fin.Text = "";
        // set_Style expects the name of a paragraph style, not the heading text.
        // "Heading 1" is an assumption; use your own heading style here.
        fin.set_Style("Heading 1");
        fin.Execute();
        while (fin.Found)
        {
            var chap = ran.Text;
            // cut off "Chapter[space]" from the start, clean trailing carriage returns and whitespace
            chap = chap.Substring(8).TrimEnd('\r', '\n', '\t', ' ');
            // heading is ended by a hard return/para mark; get the following paragraph '- alert or disservices'
            ran = doc.Range(ran.End, ran.End).Paragraphs[1].Range;
            var subhead = ran.Text;
            // clean the subheading of its leading hyphen and space, and of trailing characters
            subhead = subhead.TrimStart(' ', '-').TrimEnd('\r', ' ');
            // get the text under the subheading (the contents) and clean it up
            ran = doc.Range(ran.End, ran.End).Paragraphs[1].Range;
            var contents = ran.Text;
            contents = contents.TrimEnd('\r', ' ');
            // write to db
            string constr = ConfigurationManager.ConnectionStrings["cn"].ConnectionString;
            string strSql = @"INSERT INTO Chapters (chapter, subheading, contents) VALUES (@chapter, @subheading, @contents)";
            using (MySqlConnection con = new MySqlConnection(constr))
            using (MySqlCommand cmd = new MySqlCommand(strSql, con)) // the command must be given its connection
            {
                con.Open();
                cmd.Parameters.AddWithValue("@chapter", chap);
                cmd.Parameters.AddWithValue("@subheading", subhead);
                cmd.Parameters.AddWithValue("@contents", contents);
                cmd.ExecuteNonQuery();
            } // disposing the connection also closes it
            // continue searching in the rest of the document
            ran = doc.Range(ran.End, doc.Content.End);
            fin = ran.Find;
            fin.ClearFormatting();
            fin.MatchWildcards = false;
            fin.Text = "";
            fin.set_Style("Heading 1"); // same heading style as above
            fin.Execute();
        }
        doc.Close(false);
        wdApp.Quit();
    }
}