Thursday, December 12, 2019

Editing Elements In An OpenXML File Using VBA

Business Download

Editing elements in an OpenXML file using VBA

Wouldn’t it be useful to be able to edit the contents of an Office 2007 OpenXML file from within VBA? Well, now you can.

Download

I have made the file used in this article available for download:

EditOpenXML.zip

Doug Glancy, a colleague of mine, has developed

a tool to edit the RibbonX code in your file
without ever leaving
Excel.

Content of the VBA project

The VBA project contains two modules and one class module:

Content of the VBA project

Content of the VBA project as shown in the project explorer in the VBE

modDemo contains the code that demonstrates the use of the class module clsEditOpenXML. modUNC contains code to work with UNC paths.

The project also uses the Microsoft XML library to ease working with the XML, as shown in the following screenshot of the references:

References of the VBAProject

References of the VBAProject

Also, a couple of other libraries are put to use: the File Scripting Object and the Windows Shell application object, both through late binding as shown later on.

Class module to work with OpenXML files

I have derived a class module that is able to perform the following tasks:

  • Unzip an .xlsx or .xlsm file
  • Extract any XML file from the folderstructure
  • Write back any XML file back into the folder structure
  • Zip the file back.

Unzip an .xlsx or .xlsm file

In order to be able to work with the files contained within the OpenXML zipped file structure, the first step that is needed is to unzip the content of the file.

For safety reasons a backup copy of the file is made first. Then the file is renamed by appending .zip after the name. Next, the .zip file is unzipped to a folder.

The code below (taken from class module clsEditOpenXML) shows how this is done:

Public Sub UnzipFile()
‘————————————————————————-
‘ Procedure : UnzipFile
‘ Company   : JKP Application Development Services (c)
‘ Author    : Jan Karel Pieterse
‘ Created   : 6-5-2009
‘ Purpose   : Unzips all files in a zip file to a designated folder
‘————————————————————————-
    ‘Courtesy www.rondebruin.nl
    Dim FSO As Object
    Dim oShellApp As Object
    Set FSO = CreateObject(“scripting.filesystemobject”)
       
    ‘Derive the folder to unzip to from the location of the sourcefile
    UnzipFolder = FolderName
   
    ‘A dedicated unzip folder will be created in the same folder as the sourcefile,
    ‘called ..Unzipped Filename
    If Right(UnzipFolder, 1) “” Then

        UnzipFolder = UnzipFolder & “UnZipped ” & FileName & “”
    Else

        UnzipFolder = UnzipFolder & “UnZipped ” & FileName & “”
    End If
    On Error Resume Next
    ‘Remove all previous existing folders

    FSO.deletefolder UnzipFolder & “*”, True

    Kill UnzipFolder & “*.*”
    On Error GoTo 0
    ‘Create normal folder
    If FolderExists(UnzipFolder) = False Then
        MkDir UnzipFolder
    End If

    Set oShellApp = CreateObject(“Shell.Application”)

    ‘Copy the files in the newly created folder

    oShellApp.Namespace(UnzipFolder).CopyHere oShellApp.Namespace(SourceFile).items

    On Error Resume Next

    ‘Clean up temp folder

    FSO.deletefolder Environ(“Temp”) & “Temporary Directory*”, True

   

    ‘Inside the now unzipped folder structure all relevant files are

    ‘located here:

    XLFolder = UnzipFolder & “xl”

    Set oShellApp = Nothing

    Set FSO = Nothing

    Exit Sub

End Sub

Extract any XML file from the folderstructure

This little routine extracts an xml file from the unzipped folders and returns the XML contained in that file:


Public Function GetXMLFromFile(sFileName As String) As String

‘————————————————————————-

‘ Procedure : GetXMLFromFile

‘ Company   : JKP Application Development Services (c)

‘ Author    : Jan Karel Pieterse

‘ Created   : 6-5-2009

‘ Purpose   : Gets the XML code from the foldernamefilename

‘————————————————————————-

    Dim oXMLDoc As MSXML2.DOMDocument

    If Len(XLFolder) = 0 Then

        GetXMLFromFile = “”

    Else

        Set oXMLDoc = New MSXML2.DOMDocument

        oXMLDoc.Load XLFolder & sFileName

        GetXMLFromFile = oXMLDoc.XML

        Set oXMLDoc = Nothing

    End If

End Function

Write back any XML file back into the folder structure

The opposite direction is equally straightforward:


Public Sub WriteXML2File(sXML As String, sFileName As String)

‘————————————————————————-

‘ Procedure : WriteXML2File

‘ Company   : JKP Application Development Services (c)

‘ Author    : Jan Karel Pieterse

‘ Created   : 6-5-2009

‘ Purpose   : Writes sXML to sFileName

‘————————————————————————-

    Dim oXMLDoc As MSXML2.DOMDocument

    Set oXMLDoc = New MSXML2.DOMDocument

    oXMLDoc.loadXML sXML

    oXMLDoc.Save XLFolder & sFileName

End Sub

Zip the file back.

After we’re done editing the xml contents of the unzipped OpenXML package we need to rezip the folders again. The code below does exactly that:

Public Sub ZipAllFilesInFolder()
‘————————————————————————-
‘ Procedure : ZipAllFilesInFolder
‘ Company   : JKP Application Development Services (c)
‘ Author    : Jan Karel Pieterse
‘ Created   : 6-5-2009
‘ Purpose   : Zips all files in a folder (including subfolders) whilst retaining the folder structure
‘————————————————————————-
    ‘Courtesy www.rondebruin.nl
    Dim oShellApp As Object
    Dim sDate As String
    Dim sDefPath As String
    Dim vFileNameZip As Variant
    Dim FSO As Object
    Dim lFileCt As Long
    Set FSO = CreateObject(“scripting.filesystemobject”)
   
    ‘To ensure a unique filename,
    ‘append date and time to the name of the current file
   
    sDate = Format(Now, ” dd-mmm-yy h-mm-ss”)

    vFileNameZip = SourceFile & sDate & “.zip”
   
    ‘Create empty Zip File
    NewZip vFileNameZip

    Set oShellApp = CreateObject(“Shell.Application”)

   

    ‘Count how many items are in the “old” folder

    lFileCt = oShellApp.Namespace(FolderName & “Unzipped ” & FileName & “”).items.Count

   

    ‘Copy the files to the compressed folder

    oShellApp.Namespace(vFileNameZip).CopyHere oShellApp.Namespace(FolderName & “Unzipped ” & FileName & “”).items

    ‘Keep script waiting until we have same # of files in the new folder

    On Error Resume Next

    Do Until oShellApp.Namespace(vFileNameZip).items.Count = lFileCt

        Application.Wait (Now + TimeValue(“0:00:01”))

    Loop

    DoEvents

   

    ‘Remove original file

    Kill SourceFile

   

    ‘Rename new zipped file to same name as original file (with .zip appended)

    Name vFileNameZip As SourceFile

    On Error Resume Next

   

    ‘Now remove old folder, just in case something went haywire

    FSO.deletefolder FolderName & “Unzipped ” & FileName, True

    On Error GoTo 0

    Set oShellApp = Nothing

End Sub

Getting the xml file belonging to a specific sheet

One of the most basic operations when working with the OpenXML package would be changing the XML of a worksheet in the file. Of course we’d want to extract the proper XML from the package based on the information mere mortals know: the sheet’s name. The code shown below extracts the proper rId (relationship identifier) attribute value as it is listed within the file workbook.xml within the sheets node of that file.


Private Function GetSheetIdFromSheetName(sSheetName) As String

‘————————————————————————-

‘ Procedure : GetSheetIdFromSheetName

‘ Company   : JKP Application Development Services (c)

‘ Author    : Jan Karel Pieterse

‘ Created   : 6-5-2009

‘ Purpose   : Finds out what the SheetId of sSheetname is

‘             by reading Workbook.xml

‘————————————————————————-

    Dim oXMLDoc As MSXML2.DOMDocument

    Dim oXMLNode As MSXML2.IXMLDOMNode

    Dim oXMLNodeList As MSXML2.IXMLDOMNodeList

    If mvXLFolder <> “” And Sheet2Change <> “” Then

        Set oXMLDoc = New MSXML2.DOMDocument

        oXMLDoc.Load XLFolder & “workbook.xml”

        Set oXMLNodeList = oXMLDoc.SelectNodes(“/workbook/sheets/sheet”)

        For Each oXMLNode In oXMLNodeList

            If oXMLNode.Attributes.getNamedItem(“name”).nodeValue = sSheetName Then

                GetSheetIdFromSheetName = oXMLNode.Attributes.getNamedItem(“r:id”).nodeValue

                Exit Function

            End If

        Next

    End If

End Function

The next routine then finds out which xml file belongs to that Id:


Public Function GetSheetFileNameFromId(sSheetId As String) As String

‘————————————————————————-

‘ Procedure : GetSheetFileNameFromId

‘ Company   : JKP Application Development Services (c)

‘ Author    : Jan Karel Pieterse

‘ Created   : 6-5-2009

‘ Purpose   : Fetches the name of the xml file belonging to the sheet with id SheetId.

‘————————————————————————-

    Dim oXMLDoc As MSXML2.DOMDocument

    Dim oXMLNode As MSXML2.IXMLDOMNode

    Dim oXMLNodeList As MSXML2.IXMLDOMNodeList

    If mvXLFolder <> “” And Sheet2Change <> “” Then

        Set oXMLDoc = New MSXML2.DOMDocument

        oXMLDoc.Load XLFolder & “_relsworkbook.xml.rels”

        Set oXMLNodeList = oXMLDoc.SelectNodes(“/Relationships/Relationship”)

        For Each oXMLNode In oXMLNodeList

            If oXMLNode.Attributes.getNamedItem(“Id”).nodeValue = sSheetId Then

                GetSheetFileNameFromId = oXMLNode.Attributes.getNamedItem(“Target”).nodeValue

                Exit Function

            End If

        Next

    End If

End Function

Of course it might be useful to be able to go the other way: Get a sheetname belonging to a specific Id:


Public Function GetSheetNameFromId(sId As String) As String

‘————————————————————————-

‘ Procedure : GetSheetNameFromId

‘ Company   : JKP Application Development Services (c)

‘ Author    : Jan Karel Pieterse

‘ Created   : 6-5-2009

‘ Purpose   : Returns the sheetname belonging to a sheetId

‘————————————————————————-

    Dim oXMLDoc As MSXML2.DOMDocument

    Dim oXMLNode As MSXML2.IXMLDOMNode

    Dim oXMLNodeList As MSXML2.IXMLDOMNodeList

    If mvXLFolder <> “” Then

        Set oXMLDoc = New MSXML2.DOMDocument

        oXMLDoc.Load XLFolder & “workbook.xml”

        Set oXMLNodeList = oXMLDoc.SelectNodes(“/workbook/sheets/sheet”)

        For Each oXMLNode In oXMLNodeList

            If oXMLNode.Attributes.getNamedItem(“r:id”).nodeValue = “rId” & Val(sId) + 1 Then

                GetSheetNameFromId = oXMLNode.Attributes.getNamedItem(“name”).nodeValue

                ‘Got it, get out

                Exit Function

            End If

        Next

    End If

End Function

How to use the class module

In a normal module (called modDemo) I have demonstrated how the class module may be put to use:


Public Sub Demo()

‘————————————————————————-

‘ Procedure : Demo

‘ Company   : JKP Application Development Services (c)

‘ Author    : Jan Karel Pieterse (jkp-ads.com)

‘ Created   : 06-05-2009

‘ Purpose   : Demonstrates getting something from an OpemXML file

‘————————————————————————-

    Dim cEditOpenXML As clsEditOpenXML

    Dim sXML As String

   

    Set cEditOpenXML = New clsEditOpenXML

   

    With cEditOpenXML

        ‘Tell it which OpenXML file to process

        .SourceFile = ThisWorkbook.Path & “formcontrols.xlsm”

       

        ‘Before you can access info in the file, it must be unzipped

        .UnzipFile

       

        ‘Tell it which sheet you want to change

        .Sheet2Change = “MySheet”

       

        ‘Get XML from the sheet’s xml file

        sXML = .GetXMLFromFile(.SheetFileName)

       

        ‘Change the xml of the sheet here

       

        ‘Now write the xml back to the sheet:

        ‘.WriteXML2File sXML, .SheetFileName

       

        ‘Now rezip the unzipped package

        .ZipAllFilesInFolder

    End With

   

    ‘Only when you let the class go out of scope the zip file’s .zip extension is removed

    ‘in the terminate event of the class.

    ‘Then the OpenXML file has its original filename back.

    Set cEditOpenXML = Nothing

End Sub

Conclusion

The code shown in this article and in the associated download file shows you a way to extract content from the Office 2007 OpenXML file format and even enables you to edit parts of the content.

Of course it is up to you to keep everything nice and tidy and adhere to the rules of the OpenXML file format specification. See this article for more information about the format.

 


No comments:

Post a Comment