Lets Learn

Opinion Matters

How to Insert a Hyperlink in Excel using Open XML SDK 2.0

Posted by Ankush on March 4, 2011


Consider a scenario, you want to insert a hyperlink in Excel 2010 (or Excel 2007) file programmatically using a web application. You should not use Office Object Model in a web application, Please refer http://support.microsoft.com/kb/257757. So what we can do??

As we know that Office 2007 and 2010 file format is XML based (just rename any Office 2007/2010 document to .zip and you can see a lots of XML files which makes a document and they are simply zipped in a one file). To manipulate these files, MS has introduced a library, Open XML SDK 2.0. The SDK is built on the System.IO.Packaging API and provides strongly-typed classes to manipulate documents that adhere to the Office Open XML File Formats Specification. The Office Open XML File Formats specification is an open, international, ECMA-376, Second Edition and ISO/IEC 29500 standard. The Open XML file formats are useful for developers because they are an open standard and are based on well-known technologies: ZIP and XML.

The Open XML SDK 2.0 simplifies the task of manipulating Open XML packages and the underlying Open XML schema elements within a package. The Open XML SDK 2.0 encapsulates many common tasks that developers perform on Open XML packages, so that you can perform complex operations with just a few lines of code. Some benefits if you use Open XML SDK:

Open Format: These formats use XML and ZIP, and they are fully documented. Anyone can get the full specs on the formats and it is a royalty free license for anyone that wants to work with the files.

Compressed: Files saved in these new XML formats are less than 50% the size of the equivalent file saved in the binary formats. This is because we take all of the XML parts that make up any given file, and then we ZIP them. We chose ZIP because it’s already widely in use today and we wanted these files to be easy to work with.

Robust: Between the usage of XML, ZIP, and good documentation the files get a lot more robust. By compartmentalizing our files into multiple parts within the ZIP, it becomes a lot less likely that an entire file will be corrupted (instead of just individual parts). The files are also a lot easier to work with, so it’s less likely that people working on the files outside of Office will cause corruptions.

Backward compatible: You don’t have to use the new version of Office to take advantage of these formats. You can use Office Compatibility pack to open the Office 2007 document in older version.

Binary Format support: You can still use the current binary formats with the new version of Office. In fact, people can easily change to use the binary formats as the default if that’s what they’d rather do.

New Extensions: The new formats has new extensions (.docx, .pptx, .xlsx) so you can tell what format the files you are dealing with are, but to the average end user they’ll still just behave like any other Office file. Double click & it opens in the right application.

Is also important to know that what SDK CAN’T do:

The Open XML SDK 2.0:

  • Does not replace the Microsoft Office Object Model and provides no abstraction on top of the file formats. You must still understand the structure of the file formats to use the Open XML SDK 2.0.
  • Does not provide functionality to convert Open XML formats to and from other formats, such as HTML or XPS.
  • Does not guarantee document validity of Open XML Formats when you use the Open XML SDK 2.0 or if you decide to manipulate the underlying XML directly.
  • Does not provide application behavior such as layout functionality in Word or recalculation, data refresh, or adjustment functionalities in Excel.

So lets have a look at the solution. I will try to build the solution from the scratch so that you can learn how you can use Open XML tools and leverage this information in other exercise as well.

So lets start!!

Step 1: Download the Open XML SDK and the tool from http://www.microsoft.com/downloads/en/details.aspx?FamilyId=C6E744E5-36E9-45F5-8D8C-331DF206E0D0&displaylang=en. Please note that there are 2 links, one is for the tool and the second one is the actual library.

Step 2: So the objective is to insert a link into a cell. For the simplicity, we will assume that we want to insert the link into cell A1 of Sheet1. Lets create an Excel 2010 workbook and place a hyperlink in cell A1 manually. Once done, Excel file will look like this

clip_image001

Save this file and close it.

Step 3: If you haven’t changed the default installation location of the tool, they will appear here: C:\Program Files (x86)\Open XML SDK\V2.0\tool. Go ahead and run OpenXmlSdkTool.exe from here.

Step 4: You can simply open Office file in this tool and look at the different XML parts. (To understand different XML parts, have a look:https://ankushbhatia.wordpress.com/2010/03/04/basics-of-spreadsheet/). So go ahead and open the file in this tool. Once done, it will look like this:

clip_image002

Step 5: We know that we inserted the hyperlink in cell A1 of Sheet1 so go ahead and browse the Sheet1.xml file. Once you select the Sheet1, it gives you different child elements of it. Please select hyperlinks and click on “reflect Code” to view the XML. Make sure to follow the correct tree structure to get to Sheet1.xml and hyperlink object. After this, it will appear as:

clip_image003

Pay attention to the right side. There are 2 windows. Upper window shows XML of the section you have selected in the “Document Explorer” and the bottom window shows the C# code to generate the XML..Kind of cool….isn’t?? So now what we have..? We have XML which is required to generate a hyperlink and the equivalent C# code. Lets have a closure look at the XML

<x:hyperlinks xmlns:x=”http://schemas.openxmlformats.org/spreadsheetml/2006/main”&gt;

<x:hyperlink ref=”A1″ r:id=”rId1″ xmlns:r=”http://schemas.openxmlformats.org/officeDocument/2006/relationships&#8221; />

</x:hyperlinks>

There is no information about the hyperlink text. If you look at the relationship ID “A1″, it appears as

<?xml version=”1.0″ encoding=”UTF-8″ standalone=”true”?>

<Relationships xmlns=”http://schemas.openxmlformats.org/package/2006/relationships“><Relationship TargetMode=”External” Target=”http://www.microsoft.com/” Type=”http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink” Id=”rId1″/></Relationships>

So where is the actual text (remember I entered Microsoft)?

As you know that Excel keeps all the text strings in a separate XML file, SharedSTring.xml and refer it back based on the index. So lets see how the data appear in cell A1

Have a look at the SheetData [where it keeps all the data . It is specific to a sheet] in the tool

clip_image004

If you look at the XML, it contains the information about the row which contains the data.

<x:sheetData xmlns:x=”http://schemas.openxmlformats.org/spreadsheetml/2006/main”&gt;

<x:row r=”1″ spans=”1:1″ x14ac:dyDescent=”0.25″ xmlns:x14ac=”http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac”&gt;

<x:c r=”A1″ s=”1″ t=”s”>

<x:v>0</x:v>

</x:c>

</x:row>

</x:sheetData>

Lets try to decode it.

<x:row r=”1″ spans=”1:1″ x14ac:dyDescent=”0.25″ xmlns:x14ac=”http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac”&gt; :

r=1 means there is one row.

There is one child element <x:c r=”A1″ s=”1″ t=”s”>.

X:c -> Column
r=A1 -> Referenced cell is “A1″

S=1 and t=”S” it’s a string value and value is referenced from Shared String table. And S=1 means that a style is applied and the defination of the style is taken from index 1 which can be found from Style.xml

Now analyze the SharedSTring.XML using the same tool

clip_image005

Here you will see the Microsoft string. It is referred back in SheetData using x:v node. The starting index is 0.

So to summarize, we need to do:

  1. Create a hyperlink object and insert it before the PageMargin node. Please note that Excel is enforcing a strict ordering of the elements. So make sure to add it before PageMargines node.
  2. Create an entry in the SharedString Table
  3. Create an entry in the SheetData section

So why we did this exercise. Every now and then we face situation where we want to these kind of things in Office using Open XML. So easy thing is, do the stuff in UI and analyze the XML and C# code and then write your own.

Lets move on and utilize this information.

Step 6: Create a new Excel file and save it. We will insert a hyperlink in this file. Create a new VS 2010 project and open the new blank file.

Step 7: Add a reference to DocumentFormat.OpenXML and WindowsBase.dll from .NET tab

Step 8: Add a namespace using XL = DocumentFormat.OpenXml.Spreadsheet;

Step 9: So basically you need to

a. Open the workbook
b. Get to the sheet where you want to add the hyperlink
c. Add the hyperlink
d. Add an entry in SheetData
e. Add an entry in SharedString.xml

The below code does exactly that.

using (SpreadsheetDocument document = SpreadsheetDocument.Open(@"C:\Users\abhatia\Desktop\Sample1.xlsx", true))

           {

               IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>().Where(s => s.Name == "Sheet1");

               if (sheets.Count() == 0)

               {

                   // The specified worksheet does not exist.

                   return;

               }

               string relationshipId = sheets.First().Id.Value;

               WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(relationshipId);


               Hyperlinks hyperlinks1 = new Hyperlinks();

               Hyperlink hyperlink1 = new Hyperlink() { Reference = "A1", Id = "UNIQUE" };

               hyperlinks1.Append(hyperlink1);


               PageMargins pageMargins = worksheetPart.Worksheet.Descendants<PageMargins>().First();

               worksheetPart.Worksheet.InsertBefore<Hyperlinks>(hyperlinks1, pageMargins);

               worksheetPart.Worksheet.Save();

               worksheetPart.AddHyperlinkRelationship(new System.Uri("http://www.google.com/", System.UriKind.Absolute), true, "UNIQUE");


               // Create an element in SheetData


               // Get the SharedStringTablePart. If it does not exist, create a new one.

               SharedStringTablePart shareStringPart;

               if (document.WorkbookPart.GetPartsOfType<SharedStringTablePart>().Count() > 0)

               {

                   shareStringPart = document.WorkbookPart.GetPartsOfType<SharedStringTablePart>().First();

               }

               else

               {

                   shareStringPart = document.WorkbookPart.AddNewPart<SharedStringTablePart>();

               }



               // Insert the text into the SharedStringTablePart.

               int index = InsertSharedStringItem("www.google.com", shareStringPart);


               // Insert cell A1 into the new worksheet.

               Cell cell = InsertCellInWorksheet("A", 1, worksheetPart);


               // Set the value of cell A1.

               cell.CellValue = new CellValue(index.ToString());

               cell.DataType = new EnumValue<CellValues>(CellValues.SharedString);


               // Save the new worksheet.

               worksheetPart.Worksheet.Save();



           }

 

Here are the helper functions (taken from MS site):

// Given text and a SharedStringTablePart, creates a SharedStringItem with the specified text 

// and inserts it into the SharedStringTablePart. If the item already exists, returns its index.

private static int InsertSharedStringItem(string text, SharedStringTablePart shareStringPart)

{

    // If the part does not contain a SharedStringTable, create one.

    if (shareStringPart.SharedStringTable == null)

    {

        shareStringPart.SharedStringTable = new SharedStringTable();

    }


    int i = 0;


    // Iterate through all the items in the SharedStringTable. If the text already exists, return its index.

    foreach (SharedStringItem item in shareStringPart.SharedStringTable.Elements<SharedStringItem>())

    {

        if (item.InnerText == text)

        {

            return i;

        }


        i++;

    }


    // The text does not exist in the part. Create the SharedStringItem and return its index.

    shareStringPart.SharedStringTable.AppendChild(new SharedStringItem(new DocumentFormat.OpenXml.Spreadsheet.Text(text)));

    shareStringPart.SharedStringTable.Save();


    return i;

}

 

// Given a column name, a row index, and a WorksheetPart, inserts a cell into the worksheet. 

       // If the cell already exists, returns it. 

       private static Cell InsertCellInWorksheet(string columnName, uint rowIndex, WorksheetPart worksheetPart)

       {

           Worksheet worksheet = worksheetPart.Worksheet;

           SheetData sheetData = worksheet.GetFirstChild<SheetData>();

           string cellReference = columnName + rowIndex;


           // If the worksheet does not contain a row with the specified row index, insert one.

           Row row;

           if (sheetData.Elements<Row>().Where(r => r.RowIndex == rowIndex).Count() != 0)

           {

               row = sheetData.Elements<Row>().Where(r => r.RowIndex == rowIndex).First();

           }

           else

           {

               row = new Row() { RowIndex = rowIndex };

               sheetData.Append(row);

           }


           // If there is not a cell with the specified column name, insert one.  

           if (row.Elements<Cell>().Where(c => c.CellReference.Value == columnName + rowIndex).Count() > 0)

           {

               return row.Elements<Cell>().Where(c => c.CellReference.Value == cellReference).First();

           }

           else

           {

               // Cells must be in sequential order according to CellReference. Determine where to insert the new cell.

               Cell refCell = null;

               foreach (Cell cell in row.Elements<Cell>())

               {

                   if (string.Compare(cell.CellReference.Value, cellReference, true) > 0)

                   {

                       refCell = cell;

                       break;

                   }

               }


               Cell newCell = new Cell() { CellReference = cellReference };

               row.InsertBefore(newCell, refCell);


               worksheet.Save();

               return newCell;

           }

       }

 

Assumptions:

I have made several assumptions while writing code for this.

  1. I have assumed that Excel file is a new blank file i.e. there is no hyperlink present.
  2. Sheet1 name is hardcoded.
  3. File Name is Hardcoded.
  4. Can you think of anything by looking at the code? 🙂

Run this code and everything will work. When you will open the Excel sheet and click on the link, it will be treated as hyperlink.

But still one thing is missing. It is not underlined and appear in the blue color. Because you need to apply a style to do that. I leave this exercise to my readers to check how the styles are applied and how they can use this information. In case of anything, let me know 🙂

Please try this approch in every situation where you use Open XML and let me know in case of any questions.

Advertisements

9 Responses to “How to Insert a Hyperlink in Excel using Open XML SDK 2.0”

  1. […] How to Insert a Hyperlink in Excel using Open XML SDK 2.0 … Tags: Documents, Excel, Password, Protected, Recover Posted in Office 2010 […]

  2. vibha mehta said

    Thanks Ankush.. This really helped me.
    I have also applied style(Blue col0r n underlined) to that cell to show like a hyperlink.
    But problem is I want to upload the same file to server and am getting Oledb exception i.e “External table is not in the expected format.” while opening connection to this file.
    At line –> connection.open(). So if i just press F2 on this cell and save then it get uploaded.But this is not feasible in my requirement.
    Appreciate if you can help in this regard.I need to resolve this issue urgently.

    Thanks in advance !!

    • Ankush said

      Hi Vibha,

      If EXcel opens the file with out any error , File is good.

      How you are uploading the file. Let me know the code and the objective and I will help you with it.

      Thanks,
      Ankush

  3. vibha mehta said

    Thanks for your reply.
    Objective is to get data from a table and save it into Excel file and table has one column which need to be hyperlink in file. Am able to do this.Now I need to upload the same file without changing its content. So tried to get the Excel data into dataset.
    Here is code snippet , I tried following connection string:
    connectionString = “Provider=Microsoft.Ace.OLEDB.12.0;Data Source={2};Extended Properties=\”Excel 12.0;HDR=YES\””;

    After that, to get data into dataset , write:

    if (this.Connection.State != ConnectionState.Open)
    {
    this.Connection.Open();
    onReadProgress(10);

    }
    string cmdText = “Select * from [{0}]”;
    And I got Oledb Exception at Connection.Open();
    Error is : External table is not in the expected format.

    Please let me if this much code is not enough to find cause of problem.

    Thanks.

  4. Ankush said

    Hi,

    Can you please pass me the complete code? Also let me know if it is a 64 bit machine.

    Ankush

  5. Ankush said

    you can email me the code @ ankush.bhatia@live.com

  6. John said

    Thanks Ankush, this has been a real big help.

    But I found that if run a second time on the same speadsheet I would get 2 hyperlink objects in the worksheet, each with a hyperlink, instead of the 2 hyperlinks in the same hyperlink object (I hope that made sense), and this would cause the file to fail to open in Excel 2010.

    I have modified the hyperlink code in your example to:

    WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(relationshipId);

    Hyperlink hyperlink1 = new Hyperlink() { Reference = cellRef, Id = strID };

    if (worksheetPart.Worksheet.Descendants().Count() == 0)
    {
    Hyperlinks hyperlinks1 = new Hyperlinks();
    hyperlinks1.Append(hyperlink1);
    PageMargins pageMargins = worksheetPart.Worksheet.Descendants().First();
    worksheetPart.Worksheet.InsertBefore(hyperlinks1, pageMargins);
    }
    else
    {
    Hyperlinks hyperlinks1 = worksheetPart.Worksheet.Descendants().First();
    hyperlinks1.Append(hyperlink1);
    }

    worksheetPart.Worksheet.Save();
    worksheetPart.AddHyperlinkRelationship(new System.Uri(uri, System.UriKind.Relative), true, strID);

    It’s probly not the prettiest solution but it seems to work, so long as there is a PageMargin object.

    Please let me know if I have got any of this wrong.

    Thanks Again

    John

  7. Sandeep said

    Your article helped a lot. How can I make it work for adding the hyperlink to entire column?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: