Lets Learn

Opinion Matters

How to Delete a Column in Excel using Open XML SDK

Posted by Ankush on June 13, 2011


Recently there was a question posted on my blog as how we can delete a column in spreadsheet (Excel) using Open XML SDK. Actually, it’s a very tricky requirement because you need to update so much of information i.e. changing the cell index. So it’s really depends how you want to achieve it. You may want to write a COM Addin/VSTO addin to do the job but then you need to deploy the add-in at client side. So, if you are looking for a server side solution, here is the approach. You may want to optimize it according to your need but I will explain in detail what/why I did.

So, let’s start with the concept. Let’s say have created a sample Excel workbook like this:

clip_image001

If you look the XML representation of the data, it will be like this

<x:sheetData xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main">

<x:row r="1" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A1">
<x:v>1</x:v>
</x:c>
<x:c r="B1">
<x:v>2</x:v>
</x:c>
<x:c r="C1">
<x:v>3</x:v>
</x:c>
</x:row>

<x:row r="2" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A2">
<x:v>1</x:v>
</x:c>
<x:c r="B2">
<x:v>2</x:v>
</x:c>
<x:c r="C2">
<x:v>3</x:v>
</x:c>
</x:row>
</x:sheetData>

So pay attention to how rows and cells are declared. So the declaration is:

Row1
Cell1
Cell2
Cell3

Row2
Cell1
Cell2
Cell3

So for ex: if you delete cell2 from all the rows, you need to update the rowIndex (x:r) in the remaining cells. Let’s say you delete column B , the updated XML should look like this:

<x:row r="1" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A1">
<x:v>1</x:v>
</x:c>
<x:c r="B1">
<x:v>3</x:v>
</x:c>
</x:row>

<x:row r="2" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A2">
<x:v>1</x:v>
</x:c>
<x:c r="B2">
<x:v>3</x:v>
</x:c>
</x:row>

And here is the C# code:

using System;

using System.Collections.Generic;

using System.Linq;

using System.Text;

using DocumentFormat.OpenXml;

using DocumentFormat.OpenXml.Packaging;

using DocumentFormat.OpenXml.Spreadsheet;

using System.Xml.Linq;

 

namespace DeleteSpreadSheetColumn

{

    class Program

    {

        static XDocument columnsXL;

        static void Main(string[] args)

        {

 

            LoadColumnName("");

            string result = DeleteAColumn(@"C:\Users\abhatia\Desktop\Test1.xlsx", "Sheet1", "I");

        }

 

        public static void LoadColumnName(string XMLFile)

        {

            columnsXL = XDocument.Load(@"C:\Users\abhatia\Desktop\ExcelColumn.xml");

 

        }

        // Given a document, a worksheet name, a column name

        // deletes a column from a specified worksheet.

        public static string DeleteAColumn(string docName, string sheetName, string colName)

        {

            // Open the document for editing.

            using (SpreadsheetDocument document = SpreadsheetDocument.Open(docName, true))

            {

                IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>().Where(s => s.Name == sheetName);

                if (sheets.Count() == 0)

                {

                    // The specified worksheet does not exist.

                    return sheetName + "doesn't exist";

                }

                string relationshipId = sheets.First().Id.Value;

                WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(relationshipId);

 

 

                // Get the Total Rows

                IEnumerable<Row> rows = worksheetPart.Worksheet.GetFirstChild<SheetData>().Elements<Row>();

                if (rows.Count() == 0)

                {

                    return "Rows doesn't exist";

 

                }

 

                // Loop through the rows and adjust Cell Index

                foreach (Row row in rows)

                {

                    int index = (int)row.RowIndex.Value;

                    IEnumerable<Cell> cells = row.Elements<Cell>();

 

                    IEnumerable<Cell> cellToDelete = cells.Where(c => string.Compare(c.CellReference.Value, colName + index, true) == 0);

                    if (cellToDelete.Count() > 0)

                        cellToDelete.First().Remove();

 

                    int intColumnToBeDel = (int)(from c in columnsXL.Descendants("col") where c.Attribute("name").Value == colName select c.NodesBeforeSelf().Count()).FirstOrDefault() + 1;

 

 

                    foreach (Cell cell in cells)

                    {

 

                        string col = GetColunName(cell.CellReference.Value);

                        int intColumnPres = (int)(from c in columnsXL.Descendants("col") where c.Attribute("name").Value == col select c.NodesBeforeSelf().Count()).FirstOrDefault() + 1;

 

                        if (intColumnPres > intColumnToBeDel)

                        {

 

                            string ColNamefromXML = ((System.Xml.Linq.XElement)(from c in columnsXL.Descendants("col")

                                                                                where c.Attribute("name").Value + index == cell.CellReference.Value

                                                                                select c.PreviousNode).FirstOrDefault()).FirstAttribute.Value;

                            cell.CellReference.Value = ColNamefromXML + index;

 

                        }

                    }

                }

 

                worksheetPart.Worksheet.Save();

                return "Deleted";

 

            }

        }

 

 

 

        public static bool isNumeric(string val, System.Globalization.NumberStyles NumberStyle)

        {

            Double result;

            return Double.TryParse(val, NumberStyle,

                System.Globalization.CultureInfo.CurrentCulture, out result);

        }

 

        private static string GetColunName(string p)

        {

            Boolean breakLoop = true;

            while (breakLoop)

            {

                if (isNumeric(p.Last().ToString(), System.Globalization.NumberStyles.Integer))

                    p = p.Substring(0, p.Length - 1);

                else

                    breakLoop = false;

            }

            return p;

        }

    }

}

 

Please note that ExcelColumn.xml is just an XML file which contains all the column name. I find it an easy way to get the column name. You can download the complete project and XML file from http://cid-956eb159f7975780.skydrive.live.com/redir.aspx?resid=956EB159F7975780!197

Feel free to optimize it according to your need and let me know if you found any problem with this one!!!

3 Responses to “How to Delete a Column in Excel using Open XML SDK”

  1. Kishor said

    Hi,

    Your article is of great help. Thanks for posting. I need a small thing here.
    When i try to get the list of all columns in the excel file using the statement Column column = workbook.Descendants().First(); i am getting a null value.

    Can we resolve this ?
    Actually i need to set custom column width and custom row heights.

    Thanks,
    YKK Reddy

    • Ankush said

      Please download the project I attached to the blog. I created an XML file to get the columns.Otherwise sheetData section will only contains the row which will have some data not all.

      Hope this helps..

      Ankush

  2. Hi,
    I am getting exception here at
    WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(relationshipId);

    Please help

Leave a comment