Copying document content, including formatting and page layout, into another document with 100% fidelity using Word Interop in C#

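Problem description

I want to copy the content of a user-created document into an existing document. The content of the existing document must end up exactly the same as that of the user-created document.

I cannot simply copy the file with System.IO or save a copy of the user-created document with the SaveAs method in Word Interop, because the existing document is generated by a web server and contains a VBA module that is used to upload it back to the server.

The document generated by the web server (the existing document) is a Word 2003 document, while the user-created document is either a Word 2003 document or a Word 2007+ document.

With these restrictions in mind, I first wrote the following method: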

string tempsave = "..."; // location of user-created document
string savelocation = "..."; // location of existing document
Word.Application objWordOpen = new Word.Application();
Document doclocal = objWordOpen.Documents.Open(tempsave);
Document d1 = objWordOpen.Documents.Open(savelocation);
Word.Range oRange = doclocal.Content;
oRange.Copy();
d1.Activate();
d1.UpdateStyles();
d1.ActiveWindow.Selection.WholeStory();
d1.ActiveWindow.Selection.PasteAndFormat(Word.WdRecoveryType.wdFormatOriginalFormatting);

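This mostly works. However, tables get messed up.

Also, if there is a page break, the output is different.

User-created document: [screenshot]

Output (existing document): [screenshot]

In addition, a paragraph mark is added at the end of the document:

User-created document: [screenshot]

Output (existing document): [screenshot]

The page formatting is messed up as well: the output has mirror margins set.

User-created document: [screenshot]

Output (existing document): [screenshot]

I have also tried the Range.Insert() method and setting the range without copying, as described at https://stackoverflow.com/a/54500605/10468231, but I still run into these problems.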

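I have also tried adding the VBA module to the document instead, but there are document variables and other custom properties as well, and I do not want to mess up the file that gets uploaded to the server. How can I deal with these problems? Both documents are based on the Normal template.

I am open to other suggestions on this topic, but I know that .doc files are not as easy to work with as the .docx format, which is why I think I am stuck with COM Interop.

Thanks.

Update: Based on the Macropod code posted by Charles Kenyon, I managed to copy more of the formatting from the source to the target. There is still a difference around the page break, though: the paragraph mark ends up on a new page instead of on the same page. Also, the text is slightly larger even though the font size is the same.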

Word.Range oRange = Source.Content;
Target.Content.FormattedText = oRange.FormattedText;
LayoutTransfer(Source, Target);

The LayoutTransfer method:

private void LayoutTransfer(Document source, Document target)
{
    // Copy the page setup of every source section to the target section
    // with the same index (Word section indexes are 1-based).
    foreach (Word.Section section in source.Sections)
    {
        Word.PageSetup src = section.PageSetup;
        Word.PageSetup tgt = target.Sections[section.Index].PageSetup;

        tgt.PaperSize = src.PaperSize;
        tgt.GutterStyle = src.GutterStyle;
        tgt.Orientation = src.Orientation;
        tgt.MirrorMargins = src.MirrorMargins;
        tgt.SectionStart = src.SectionStart;
        tgt.SectionDirection = src.SectionDirection;
        tgt.OddAndEvenPagesHeaderFooter = src.OddAndEvenPagesHeaderFooter;
        tgt.DifferentFirstPageHeaderFooter = src.DifferentFirstPageHeaderFooter;
        tgt.VerticalAlignment = src.VerticalAlignment;
        tgt.PageHeight = src.PageHeight;
        tgt.PageWidth = src.PageWidth;
        tgt.TopMargin = src.TopMargin;
        tgt.BottomMargin = src.BottomMargin;
        tgt.LeftMargin = src.LeftMargin;
        tgt.RightMargin = src.RightMargin;
        tgt.Gutter = src.Gutter;
        tgt.GutterPos = src.GutterPos;
        tgt.HeaderDistance = src.HeaderDistance;
        tgt.FooterDistance = src.FooterDistance;
        tgt.TwoPagesOnOne = src.TwoPagesOnOne;
        tgt.BookFoldPrinting = src.BookFoldPrinting;
        tgt.BookFoldPrintingSheets = src.BookFoldPrintingSheets;
        tgt.BookFoldRevPrinting = src.BookFoldRevPrinting;
    }
}

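Update 2

Actually, the page break not staying consistent with the paragraph formatting is not a copy-fidelity problem but a .doc to .docx conversion issue (https://support.microsoft.com/en-us/help/923183/the-layout-of-a-document-that-contains-a-page-break-may-be-different-i). Maybe someone has come up with a solution for this.

Solutions

The following code by Paul Edstein (macropod) may help you. At the very least, it gives you an idea of the complexity you are facing.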

' ============================================================================================================
' KEEP NEXT THREE TOGETHER 
' ============================================================================================================
'
Sub CombineDocuments()
' Paul Edstein
' https://www.msofficeforums.com/word-vba/43339-combine-multiple-word-documents.html
'
' Users occasionally need to combine multiple documents that may or may not have the same page layouts,
'   Style definitions, and so on. Consequently, combining multiple documents is often rather more complex than
'   simply copying & pasting content from one document to another. Problems arise when the documents have
'   different page layouts, headers, footers, page numbering, bookmarks & cross-references,
'   Tables of Contents, Indexes, etc., and especially when those documents have used the same Style
'   names with different definitions.
'
' The following Word macro (for Windows PCs only) handles the more common issues that arise when combining
'   documents; it does not attempt to resolve conflicts with paragraph auto-numbering,
'   document -vs- section page numbering in 'page x of y' numbering schemes, Tables of Contents or Indexing issues.
'   Neither does it attempt to deal with the effects on footnote or endnote numbering & positioning or with the
'   consequences of duplicated bookmarks (only one of which can exist in the merged document) and any corresponding
'   cross-references.
'
' The macro includes a folder browser. Simply select the folder to process and all documents in that folder
'   will be combined into the currently-active document. Word's .doc, .docx, and .docm formats will all be processed,
'   even if different formats exist in the selected folder.
'
    Application.ScreenUpdating = False
    Dim strFolder As String, strFile As String, strTgt As String
    Dim wdDocTgt As Document, wdDocSrc As Document, HdFt As HeaderFooter
    strFolder = GetFolder: If strFolder = "" Then Exit Sub
    Set wdDocTgt = ActiveDocument: strTgt = ActiveDocument.FullName
    strFile = Dir(strFolder & "\*.doc", vbNormal)
    While strFile <> ""
      ' Skip the target document itself if it lives in the selected folder
      If strFolder & "\" & strFile <> strTgt Then
        Set wdDocSrc = Documents.Open(FileName:=strFolder & "\" & strFile, AddToRecentFiles:=False, Visible:=False)
        With wdDocTgt
          .Characters.Last.InsertBefore vbCr
          .Characters.Last.InsertBreak (wdSectionBreakNextPage)
          With .Sections.Last
            For Each HdFt In .Headers
              With HdFt
                .LinkToPrevious = False
                .range.Text = vbNullString
                .PageNumbers.RestartNumberingAtSection = True
                .PageNumbers.StartingNumber = wdDocSrc.Sections.First.Headers(HdFt.Index).PageNumbers.StartingNumber
              End With
            Next
            For Each HdFt In .Footers
              With HdFt
                .LinkToPrevious = False
                .range.Text = vbNullString
                .PageNumbers.RestartNumberingAtSection = True
                .PageNumbers.StartingNumber = wdDocSrc.Sections.First.Headers(HdFt.Index).PageNumbers.StartingNumber
              End With
            Next
          End With
          Call LayoutTransfer(wdDocTgt,wdDocSrc)
          .range.Characters.Last.FormattedText = wdDocSrc.range.FormattedText
          With .Sections.Last
            For Each HdFt In .Headers
              With HdFt
                .range.FormattedText = wdDocSrc.Sections.Last.Headers(.Index).range.FormattedText
                .range.Characters.Last.Delete
              End With
            Next
            For Each HdFt In .Footers
              With HdFt
                .range.FormattedText = wdDocSrc.Sections.Last.Footers(.Index).range.FormattedText
                .range.Characters.Last.Delete
              End With
            Next
          End With
        End With
        wdDocSrc.Close SaveChanges:=False
      End If
      strFile = Dir()
    Wend
    With wdDocTgt
      ' Save & close the combined document
      .SaveAs FileName:=strFolder & "\Forms.docx", FileFormat:=wdFormatXMLDocument, AddToRecentFiles:=False
      ' and/or:
      .SaveAs FileName:=strFolder & "\Forms.pdf", FileFormat:=wdFormatPDF, AddToRecentFiles:=False
      .Close SaveChanges:=False
    End With
    Set wdDocSrc = Nothing: Set wdDocTgt = Nothing
    Application.ScreenUpdating = True
End Sub
' ============================================================================================================
Private Function GetFolder() As String
' used by CombineDocument macro by Paul Edstein,keep together in same module
' https://www.msofficeforums.com/word-vba/43339-combine-multiple-word-documents.html

    Dim oFolder As Object
    GetFolder = ""
    Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0,"Choose a folder",0)
    If (Not oFolder Is Nothing) Then GetFolder = oFolder.Items.Item.Path
    Set oFolder = Nothing
End Function

Sub LayoutTransfer(wdDocTgt As Document,wdDocSrc As Document)
' works with previous Combine Documents macro from Paul Edstein,keep together
' https://www.msofficeforums.com/word-vba/43339-combine-multiple-word-documents.html
'
    Dim sPageHght As Single,sPageWdth As Single
    Dim sHeaderDist As Single,sFooterDist As Single
    Dim sTMargin As Single,sBMargin As Single
    Dim sLMargin As Single,sRMargin As Single
    Dim sGutter As Single,sGutterPos As Single
    Dim lPaperSize As Long,lGutterStyle As Long
    Dim lMirrorMargins As Long,lVerticalAlignment As Long
    Dim lScnStart As Long,lScnDir As Long
    Dim lOddEvenHdFt As Long,lDiffFirstHdFt As Long
    Dim bTwoPagesOnOne As Boolean,bBkFldPrnt As Boolean
    Dim bBkFldPrnShts As Long, bBkFldRevPrnt As Boolean ' BookFoldPrintingSheets is a sheet count, not a flag
    Dim lOrientation As Long
    With wdDocSrc.Sections.Last.PageSetup
      lPaperSize = .PaperSize
      lGutterStyle = .GutterStyle
      lOrientation = .Orientation
      lMirrorMargins = .MirrorMargins
      lScnStart = .SectionStart
      lScnDir = .SectionDirection
      lOddEvenHdFt = .OddAndEvenPagesHeaderFooter
      lDiffFirstHdFt = .DifferentFirstPageHeaderFooter
      lVerticalAlignment = .VerticalAlignment
      sPageHght = .PageHeight
      sPageWdth = .PageWidth
      sTMargin = .TopMargin
      sBMargin = .BottomMargin
      sLMargin = .LeftMargin
      sRMargin = .RightMargin
      sGutter = .Gutter
      sGutterPos = .GutterPos
      sHeaderDist = .HeaderDistance
      sFooterDist = .FooterDistance
      bTwoPagesOnOne = .TwoPagesOnOne
      bBkFldPrnt = .BookFoldPrinting
      bBkFldPrnShts = .BookFoldPrintingSheets
      bBkFldRevPrnt = .BookFoldRevPrinting
    End With
    With wdDocTgt.Sections.Last.PageSetup
      .GutterStyle = lGutterStyle
      .MirrorMargins = lMirrorMargins
      .SectionStart = lScnStart
      .SectionDirection = lScnDir
      .OddAndEvenPagesHeaderFooter = lOddEvenHdFt
      .DifferentFirstPageHeaderFooter = lDiffFirstHdFt
      .VerticalAlignment = lVerticalAlignment
      .PageHeight = sPageHght
      .PageWidth = sPageWdth
      .TopMargin = sTMargin
      .BottomMargin = sBMargin
      .LeftMargin = sLMargin
      .RightMargin = sRMargin
      .Gutter = sGutter
      .GutterPos = sGutterPos
      .HeaderDistance = sHeaderDist
      .FooterDistance = sFooterDist
      .TwoPagesOnOne = bTwoPagesOnOne
      .BookFoldPrinting = bBkFldPrnt
      .BookFoldPrintingSheets = bBkFldPrnShts
      .BookFoldRevPrinting = bBkFldRevPrnt
      .PaperSize = lPaperSize
      .Orientation = lOrientation
    End With
End Sub
 
' ============================================================================================================

I used a template and, after editing it, copied it into a new Word document several times, like this:

Word.Range rng = wordDocTarget.Content;
rng.Collapse(Word.WdCollapseDirection.wdCollapseEnd);
rng.FormattedText = wordDocSource.Content.FormattedText;
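
For the case in the question, where the target's content should be replaced rather than appended to, the same idea might look roughly like the sketch below. It is only an illustration under assumptions: the class name, method name and the two path parameters are made up for the example, the project is assumed to reference Microsoft.Office.Interop.Word, and error handling is reduced to cleanup. Page setup is probably still best transferred separately, for example with a LayoutTransfer-style routine.

```csharp
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

static class FormattedTextCopy
{
    // sourcePath and targetPath are hypothetical parameters for this sketch.
    public static void ReplaceContent(string sourcePath, string targetPath)
    {
        var app = new Word.Application { Visible = false };
        Word.Document source = null, target = null;
        try
        {
            source = app.Documents.Open(sourcePath, ReadOnly: true, AddToRecentFiles: false);
            target = app.Documents.Open(targetPath, AddToRecentFiles: false);

            // Assigning FormattedText replaces the text and formatting of the
            // target range without going through the clipboard.
            target.Content.FormattedText = source.Content.FormattedText;

            target.Save();
        }
        finally
        {
            // Close without prompting; the target was already saved above.
            target?.Close(SaveChanges: false);
            source?.Close(SaveChanges: false);
            app.Quit();
            Marshal.ReleaseComObject(app);
        }
    }
}
```

Because the target file is opened and saved in place, its VBA project, document variables and custom properties are left alone; only the main text story is overwritten.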

Another option is to insert the whole file into a range or document:

rng = wordDoc.Range();
rng.Collapse(Word.WdCollapseDirection.wdCollapseEnd);
rng.InsertFile(filepath);
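
A slightly fuller sketch of the InsertFile route, again for replacing the target's content instead of appending to it, could look like this. The class name, method name and path parameters are assumptions made for the example, and whether the result is closer to the source than the FormattedText approach would need to be checked against real documents.

```csharp
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

static class InsertFileCopy
{
    // sourcePath and targetPath are hypothetical parameters for this sketch.
    public static void ReplaceContent(string sourcePath, string targetPath)
    {
        var app = new Word.Application { Visible = false };
        Word.Document target = null;
        try
        {
            target = app.Documents.Open(targetPath, AddToRecentFiles: false);

            Word.Range rng = target.Content;
            rng.Delete(); // clear the body; Word always keeps one final paragraph mark

            // Insert the source file at the (now collapsed) range.
            rng.InsertFile(sourcePath, ConfirmConversions: false, Link: false, Attachment: false);

            target.Save();
        }
        finally
        {
            target?.Close(SaveChanges: false);
            app.Quit();
            Marshal.ReleaseComObject(app);
        }
    }
}
```

With this variant the source document does not have to be opened as a separate Document object, since Word reads it directly from disk.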