Lets Learn

Opinion Matters

Posts Tagged ‘Open XML SDK’

How to Delete a Column in Excel using Open XML SDK

Posted by Ankush on June 13, 2011

Recently there was a question posted on my blog as how we can delete a column in spreadsheet (Excel) using Open XML SDK. Actually, it’s a very tricky requirement because you need to update so much of information i.e. changing the cell index. So it’s really depends how you want to achieve it. You may want to write a COM Addin/VSTO addin to do the job but then you need to deploy the add-in at client side. So, if you are looking for a server side solution, here is the approach. You may want to optimize it according to your need but I will explain in detail what/why I did.

So, let’s start with the concept. Let’s say have created a sample Excel workbook like this:

clip_image001

If you look the XML representation of the data, it will be like this

<x:sheetData xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main">

<x:row r="1" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A1">
<x:v>1</x:v>
</x:c>
<x:c r="B1">
<x:v>2</x:v>
</x:c>
<x:c r="C1">
<x:v>3</x:v>
</x:c>
</x:row>

<x:row r="2" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A2">
<x:v>1</x:v>
</x:c>
<x:c r="B2">
<x:v>2</x:v>
</x:c>
<x:c r="C2">
<x:v>3</x:v>
</x:c>
</x:row>
</x:sheetData>

So pay attention to how rows and cells are declared. So the declaration is:

Row1
Cell1
Cell2
Cell3

Row2
Cell1
Cell2
Cell3

So for ex: if you delete cell2 from all the rows, you need to update the rowIndex (x:r) in the remaining cells. Let’s say you delete column B , the updated XML should look like this:

<x:row r="1" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A1">
<x:v>1</x:v>
</x:c>
<x:c r="B1">
<x:v>3</x:v>
</x:c>
</x:row>

<x:row r="2" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A2">
<x:v>1</x:v>
</x:c>
<x:c r="B2">
<x:v>3</x:v>
</x:c>
</x:row>

And here is the C# code:

using System;

using System.Collections.Generic;

using System.Linq;

using System.Text;

using DocumentFormat.OpenXml;

using DocumentFormat.OpenXml.Packaging;

using DocumentFormat.OpenXml.Spreadsheet;

using System.Xml.Linq;

 

namespace DeleteSpreadSheetColumn

{

    class Program

    {

        static XDocument columnsXL;

        static void Main(string[] args)

        {

 

            LoadColumnName("");

            string result = DeleteAColumn(@"C:\Users\abhatia\Desktop\Test1.xlsx", "Sheet1", "I");

        }

 

        public static void LoadColumnName(string XMLFile)

        {

            columnsXL = XDocument.Load(@"C:\Users\abhatia\Desktop\ExcelColumn.xml");

 

        }

        // Given a document, a worksheet name, a column name

        // deletes a column from a specified worksheet.

        public static string DeleteAColumn(string docName, string sheetName, string colName)

        {

            // Open the document for editing.

            using (SpreadsheetDocument document = SpreadsheetDocument.Open(docName, true))

            {

                IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>().Where(s => s.Name == sheetName);

                if (sheets.Count() == 0)

                {

                    // The specified worksheet does not exist.

                    return sheetName + "doesn't exist";

                }

                string relationshipId = sheets.First().Id.Value;

                WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(relationshipId);

 

 

                // Get the Total Rows

                IEnumerable<Row> rows = worksheetPart.Worksheet.GetFirstChild<SheetData>().Elements<Row>();

                if (rows.Count() == 0)

                {

                    return "Rows doesn't exist";

 

                }

 

                // Loop through the rows and adjust Cell Index

                foreach (Row row in rows)

                {

                    int index = (int)row.RowIndex.Value;

                    IEnumerable<Cell> cells = row.Elements<Cell>();

 

                    IEnumerable<Cell> cellToDelete = cells.Where(c => string.Compare(c.CellReference.Value, colName + index, true) == 0);

                    if (cellToDelete.Count() > 0)

                        cellToDelete.First().Remove();

 

                    int intColumnToBeDel = (int)(from c in columnsXL.Descendants("col") where c.Attribute("name").Value == colName select c.NodesBeforeSelf().Count()).FirstOrDefault() + 1;

 

 

                    foreach (Cell cell in cells)

                    {

 

                        string col = GetColunName(cell.CellReference.Value);

                        int intColumnPres = (int)(from c in columnsXL.Descendants("col") where c.Attribute("name").Value == col select c.NodesBeforeSelf().Count()).FirstOrDefault() + 1;

 

                        if (intColumnPres > intColumnToBeDel)

                        {

 

                            string ColNamefromXML = ((System.Xml.Linq.XElement)(from c in columnsXL.Descendants("col")

                                                                                where c.Attribute("name").Value + index == cell.CellReference.Value

                                                                                select c.PreviousNode).FirstOrDefault()).FirstAttribute.Value;

                            cell.CellReference.Value = ColNamefromXML + index;

 

                        }

                    }

                }

 

                worksheetPart.Worksheet.Save();

                return "Deleted";

 

            }

        }

 

 

 

        public static bool isNumeric(string val, System.Globalization.NumberStyles NumberStyle)

        {

            Double result;

            return Double.TryParse(val, NumberStyle,

                System.Globalization.CultureInfo.CurrentCulture, out result);

        }

 

        private static string GetColunName(string p)

        {

            Boolean breakLoop = true;

            while (breakLoop)

            {

                if (isNumeric(p.Last().ToString(), System.Globalization.NumberStyles.Integer))

                    p = p.Substring(0, p.Length - 1);

                else

                    breakLoop = false;

            }

            return p;

        }

    }

}

 

Please note that ExcelColumn.xml is just an XML file which contains all the column name. I find it an easy way to get the column name. You can download the complete project and XML file from http://cid-956eb159f7975780.skydrive.live.com/redir.aspx?resid=956EB159F7975780!197

Feel free to optimize it according to your need and let me know if you found any problem with this one!!!

Advertisements

Posted in Open XML SDK, Spreadsheet | Tagged: , , , | 3 Comments »

Open XML SDK Learning Resources

Posted by Ankush on February 24, 2011

Open XML SDK 2.0 FAQ
http://social.msdn.microsoft.com/Forums/en-US/oxmlsdk/thread/19032efe-80af-4a16-b954-1f5a6d2f2236

Building Office Open XML Files
http://msdn.microsoft.com/en-us/magazine/cc163478.aspx

Introducing the Office (2007) Open XML File Formats
http://msdn2.microsoft.com/en-us/library/ms406049.aspx

How to: Manipulate Office Open XML Formats Documents
http://msdn2.microsoft.com/en-us/library/aa982683.aspx#Office2007ManipulatingXMLDocs_creating

Building Server-Side Document Generation Solutions Using the Open XML Object Model
http://msdn.microsoft.com/en-us/library/bb735940.aspx (Part 1 of 2)
http://msdn.microsoft.com/en-us/library/bb735939.aspx (Part 2 of 2)

How to use the Office XML file format and the packaging components from the .NET Framework 3.0 to create a simple Excel 2007 workbook or a simple Word 2007 document
http://support.microsoft.com/?id=931866

How Do I…Open XML SDK v2.0
http://msdn.microsoft.com/en-us/library/bb491088(office.14).aspx

Nanipulating Excel 2007 and PowerPoint 2007 Files with the Open XML Object Model (Part 1 of 2)
http://msdn.microsoft.com/en-us/library/bb739834.aspx

Manipulating Excel 2007 and PowerPoint 2007 Files with the Open XML Object Model (Part 2 of 2)
http://msdn.microsoft.com/en-us/library/bb727373.aspx

Reading Data from SpreadsheetML
http://blogs.msdn.com/brian_jones/archive/2008/11/10/reading-data-from-spreadsheetml.aspx

How to use components of the .NET Framework 3.0 to create and then to stream an Office Word 2007 document and an Office Excel 2007 workbook to a client computer
http://support.microsoft.com/kb/932921

Welcome to the Open XML Developer Workshop
http://msdn.microsoft.com/en-us/office/bb738430.aspx

Open XML Explained e-book
http://openxmldeveloper.org/articles/1970.aspx

Good Site to Search ECMA standard online :
http://www.documentinteropinitiative.org/default.aspx

Open XML SDK Documentation
http://msdn.microsoft.com/en-us/office/ee441239.aspx

Open XML Format SDK 2.0: Getting Started Best Practices
http://blogs.msdn.com/erikaehrli/archive/2009/05/14/open-xml-format-sdk-2-0-getting-started-best-practices.aspx

Office Open XML Formats: Retrieving Lists of Excel 2007 Worksheets
http://msdn.microsoft.com/en-us/library/bb332456.aspx

Manipulating Excel 2007 and PowerPoint 2007 Files with the Open XML Format API (Part 1 of 2)
http://msdn.microsoft.com/en-us/library/bb739834.aspx

Manipulating Excel 2007 and PowerPoint 2007 Files with the Open XML Format API (Part 2 of 2)
http://msdn.microsoft.com/en-us/library/bb727373.aspx

2007 Office System Sample: Open XML Format SDK 2.0 Code Snippets for Visual Studio 2008
http://www.microsoft.com/downloads/details.aspx?FamilyID=78bea298-a3f9-44cf-bde0-b4f30dc986df&displaylang=en

Writing Large Excel Files with the Open XML SDK
http://blogs.msdn.com/b/brian_jones/archive/2010/06/22/writing-large-excel-files-with-the-open-xml-sdk.aspx

Parsing and Reading Large Excel Files with the Open XML SDK
http://blogs.msdn.com/b/brian_jones/archive/2010/05/27/parsing-and-reading-large-excel-files-with-the-open-xml-sdk.aspx

How to efficiently generate OpenXML Workbook from Excel files containing very large amount of data
http://blogs.msdn.com/b/vsod/archive/2010/05/31/how-to-efficiently-generate-openxml-workbook-from-excel-files-containing-very-large-amount-of-data.aspx

Very very very good blogs to through..Its must if you would like to know more about formats:

Where is the documentation for Office’s docx/xlsx/pptx formats? Part 1: Office 2007
http://blogs.msdn.com/b/chrisrae/archive/2010/09/25/where-is-the-documentation-for-office-2010-s-docx-xlsx-pptx-formats.aspx

Where is the documentation for Office’s docx/xlsx/pptx formats? Part 2: Office 2010
http://blogs.msdn.com/b/chrisrae/archive/2010/10/06/where-is-the-documentation-for-office-s-docx-xlsx-pptx-formats-part-2-office-2010.aspx

I’ll keep on updating this list and will try to categorize it as it becomes bigger 🙂

Posted in Open XML SDK | Tagged: , , , , | 4 Comments »

Custom UI part In Word and Open XML SDK 2.0

Posted by Ankush on February 24, 2011

Consider a scienrio you have a Word 2010 document in which you aretrying to custmization by adding custom UI part and you want to open this document in Open XML SDK, it WILL NOT WORK.

Why….?

Lets try this:

Create a new Word 2010 document and add a custom UI part [ You can use custom UI editor tool to do this]. Use the following XML to do the ribbon custmization.

<?xml version=”1.0″ encoding=”UTF-8″?>
<customUI xmlns=”http://schemas.microsoft.com/office/2009/07/customui“>
<ribbon>
<tabs>
<tab idMso=”TabDeveloper” visible=”false” />
<tab idMso=”TabReferences” visible=”false” />
</tabs>
</ribbon>
</customUI>

Now if you open this document using Open XML SDK 2.0, you will get an error message.

WordprocessingDocument doc = WordprocessingDocument.Open(“<Path of the document>”, false);

Exception Message : “Specified argument was out of the range of valid values.Parameter name: relationshipType”

Call Stack:

DocumentFormat.OpenXml.Packaging.WordprocessingDocument.CreatePartCore(String relationshipType)
DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.CreateOpenXmlPart(String relationshipType)DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.LoadReferencedPartsAndRelationships(OpenXmlPackage openXmlPackage, OpenXmlPart sourcePart, ackageRelationshipCollection
relationshipCollection, Dictionary`2 loadedParts)
DocumentFormat.OpenXml.Packaging.OpenXmlPackage.Load()
DocumentFormat.OpenXml.Packaging.OpenXmlPackage.OpenCore(Stream stream, Boolean readWriteMode)
DocumentFormat.OpenXml.Packaging.WordprocessingDocument.Open(Stream stream, Boolean isEditable, OpenSettings openSettings)DocumentFormat.OpenXml.Packaging.WordprocessingDocument.Open(Stream stream, Boolean isEditable)

Lets get into more detail.

If you look into the callstack, it is failing in CreatePartCore function. So I used .net reflector tool and anlyzed the Open XML SDK assembly i.e. defination of CreatePartCore. This is what I see:

The relationship namespace for Office 2010 is “http://schemas.microsoft.com/office/2007/relationships/ui/extensibility” and if you look at the image, it is not included. In result, you get an exception.

This is all about the problem.

Solution: The only solution which will work her to create an addin and then do the ribbon customization.

Posted in Open XML SDK, Word | Tagged: , , , | 1 Comment »

 
%d bloggers like this: