How to Delete a Column in Excel using Open XML SDK
Posted by Ankush on June 13, 2011
Recently there was a question posted on my blog as how we can delete a column in spreadsheet (Excel) using Open XML SDK. Actually, it’s a very tricky requirement because you need to update so much of information i.e. changing the cell index. So it’s really depends how you want to achieve it. You may want to write a COM Addin/VSTO addin to do the job but then you need to deploy the add-in at client side. So, if you are looking for a server side solution, here is the approach. You may want to optimize it according to your need but I will explain in detail what/why I did.
So, let’s start with the concept. Let’s say have created a sample Excel workbook like this:
If you look the XML representation of the data, it will be like this
<x:sheetData xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<x:row r="1" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A1">
<x:v>1</x:v>
</x:c>
<x:c r="B1">
<x:v>2</x:v>
</x:c>
<x:c r="C1">
<x:v>3</x:v>
</x:c>
</x:row>
<x:row r="2" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A2">
<x:v>1</x:v>
</x:c>
<x:c r="B2">
<x:v>2</x:v>
</x:c>
<x:c r="C2">
<x:v>3</x:v>
</x:c>
</x:row>
</x:sheetData>
So pay attention to how rows and cells are declared. So the declaration is:
Row1
Cell1
Cell2
Cell3
Row2
Cell1
Cell2
Cell3
So for ex: if you delete cell2 from all the rows, you need to update the rowIndex (x:r) in the remaining cells. Let’s say you delete column B , the updated XML should look like this:
<x:row r="1" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A1">
<x:v>1</x:v>
</x:c>
<x:c r="B1">
<x:v>3</x:v>
</x:c>
</x:row>
<x:row r="2" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A2">
<x:v>1</x:v>
</x:c>
<x:c r="B2">
<x:v>3</x:v>
</x:c>
</x:row>
And here is the C# code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
using System.Xml.Linq;
namespace DeleteSpreadSheetColumn
{
class Program
{
static XDocument columnsXL;
static void Main(string[] args)
{
LoadColumnName("");
string result = DeleteAColumn(@"C:\Users\abhatia\Desktop\Test1.xlsx", "Sheet1", "I");
}
public static void LoadColumnName(string XMLFile)
{
columnsXL = XDocument.Load(@"C:\Users\abhatia\Desktop\ExcelColumn.xml");
}
// Given a document, a worksheet name, a column name
// deletes a column from a specified worksheet.
public static string DeleteAColumn(string docName, string sheetName, string colName)
{
// Open the document for editing.
using (SpreadsheetDocument document = SpreadsheetDocument.Open(docName, true))
{
IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>().Where(s => s.Name == sheetName);
if (sheets.Count() == 0)
{
// The specified worksheet does not exist.
return sheetName + "doesn't exist";
}
string relationshipId = sheets.First().Id.Value;
WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(relationshipId);
// Get the Total Rows
IEnumerable<Row> rows = worksheetPart.Worksheet.GetFirstChild<SheetData>().Elements<Row>();
if (rows.Count() == 0)
{
return "Rows doesn't exist";
}
// Loop through the rows and adjust Cell Index
foreach (Row row in rows)
{
int index = (int)row.RowIndex.Value;
IEnumerable<Cell> cells = row.Elements<Cell>();
IEnumerable<Cell> cellToDelete = cells.Where(c => string.Compare(c.CellReference.Value, colName + index, true) == 0);
if (cellToDelete.Count() > 0)
cellToDelete.First().Remove();
int intColumnToBeDel = (int)(from c in columnsXL.Descendants("col") where c.Attribute("name").Value == colName select c.NodesBeforeSelf().Count()).FirstOrDefault() + 1;
foreach (Cell cell in cells)
{
string col = GetColunName(cell.CellReference.Value);
int intColumnPres = (int)(from c in columnsXL.Descendants("col") where c.Attribute("name").Value == col select c.NodesBeforeSelf().Count()).FirstOrDefault() + 1;
if (intColumnPres > intColumnToBeDel)
{
string ColNamefromXML = ((System.Xml.Linq.XElement)(from c in columnsXL.Descendants("col")
where c.Attribute("name").Value + index == cell.CellReference.Value
select c.PreviousNode).FirstOrDefault()).FirstAttribute.Value;
cell.CellReference.Value = ColNamefromXML + index;
}
}
}
worksheetPart.Worksheet.Save();
return "Deleted";
}
}
public static bool isNumeric(string val, System.Globalization.NumberStyles NumberStyle)
{
Double result;
return Double.TryParse(val, NumberStyle,
System.Globalization.CultureInfo.CurrentCulture, out result);
}
private static string GetColunName(string p)
{
Boolean breakLoop = true;
while (breakLoop)
{
if (isNumeric(p.Last().ToString(), System.Globalization.NumberStyles.Integer))
p = p.Substring(0, p.Length - 1);
else
breakLoop = false;
}
return p;
}
}
}
Please note that ExcelColumn.xml is just an XML file which contains all the column name. I find it an easy way to get the column name. You can download the complete project and XML file from http://cid-956eb159f7975780.skydrive.live.com/redir.aspx?resid=956EB159F7975780!197
Feel free to optimize it according to your need and let me know if you found any problem with this one!!!
Kishor said
Hi,
Your article is of great help. Thanks for posting. I need a small thing here.
When i try to get the list of all columns in the excel file using the statement Column column = workbook.Descendants().First(); i am getting a null value.
Can we resolve this ?
Actually i need to set custom column width and custom row heights.
Thanks,
YKK Reddy
Ankush said
Please download the project I attached to the blog. I created an XML file to get the columns.Otherwise sheetData section will only contains the row which will have some data not all.
Hope this helps..
Ankush
Rashmi Salokhe said
Hi,
I am getting exception here at
WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(relationshipId);
Please help