Lets Learn

Opinion Matters

Archive for the ‘Open XML SDK’ Category

How to Delete a Column in Excel using Open XML SDK

Posted by Ankush on June 13, 2011

Recently there was a question posted on my blog as how we can delete a column in spreadsheet (Excel) using Open XML SDK. Actually, it’s a very tricky requirement because you need to update so much of information i.e. changing the cell index. So it’s really depends how you want to achieve it. You may want to write a COM Addin/VSTO addin to do the job but then you need to deploy the add-in at client side. So, if you are looking for a server side solution, here is the approach. You may want to optimize it according to your need but I will explain in detail what/why I did.

So, let’s start with the concept. Let’s say have created a sample Excel workbook like this:

clip_image001

If you look the XML representation of the data, it will be like this

<x:sheetData xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main">

<x:row r="1" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A1">
<x:v>1</x:v>
</x:c>
<x:c r="B1">
<x:v>2</x:v>
</x:c>
<x:c r="C1">
<x:v>3</x:v>
</x:c>
</x:row>

<x:row r="2" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A2">
<x:v>1</x:v>
</x:c>
<x:c r="B2">
<x:v>2</x:v>
</x:c>
<x:c r="C2">
<x:v>3</x:v>
</x:c>
</x:row>
</x:sheetData>

So pay attention to how rows and cells are declared. So the declaration is:

Row1
Cell1
Cell2
Cell3

Row2
Cell1
Cell2
Cell3

So for ex: if you delete cell2 from all the rows, you need to update the rowIndex (x:r) in the remaining cells. Let’s say you delete column B , the updated XML should look like this:

<x:row r="1" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A1">
<x:v>1</x:v>
</x:c>
<x:c r="B1">
<x:v>3</x:v>
</x:c>
</x:row>

<x:row r="2" spans="1:3" x14ac:dyDescent="0.25" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<x:c r="A2">
<x:v>1</x:v>
</x:c>
<x:c r="B2">
<x:v>3</x:v>
</x:c>
</x:row>

And here is the C# code:

using System;

using System.Collections.Generic;

using System.Linq;

using System.Text;

using DocumentFormat.OpenXml;

using DocumentFormat.OpenXml.Packaging;

using DocumentFormat.OpenXml.Spreadsheet;

using System.Xml.Linq;

 

namespace DeleteSpreadSheetColumn

{

    class Program

    {

        static XDocument columnsXL;

        static void Main(string[] args)

        {

 

            LoadColumnName("");

            string result = DeleteAColumn(@"C:\Users\abhatia\Desktop\Test1.xlsx", "Sheet1", "I");

        }

 

        public static void LoadColumnName(string XMLFile)

        {

            columnsXL = XDocument.Load(@"C:\Users\abhatia\Desktop\ExcelColumn.xml");

 

        }

        // Given a document, a worksheet name, a column name

        // deletes a column from a specified worksheet.

        public static string DeleteAColumn(string docName, string sheetName, string colName)

        {

            // Open the document for editing.

            using (SpreadsheetDocument document = SpreadsheetDocument.Open(docName, true))

            {

                IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>().Where(s => s.Name == sheetName);

                if (sheets.Count() == 0)

                {

                    // The specified worksheet does not exist.

                    return sheetName + "doesn't exist";

                }

                string relationshipId = sheets.First().Id.Value;

                WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(relationshipId);

 

 

                // Get the Total Rows

                IEnumerable<Row> rows = worksheetPart.Worksheet.GetFirstChild<SheetData>().Elements<Row>();

                if (rows.Count() == 0)

                {

                    return "Rows doesn't exist";

 

                }

 

                // Loop through the rows and adjust Cell Index

                foreach (Row row in rows)

                {

                    int index = (int)row.RowIndex.Value;

                    IEnumerable<Cell> cells = row.Elements<Cell>();

 

                    IEnumerable<Cell> cellToDelete = cells.Where(c => string.Compare(c.CellReference.Value, colName + index, true) == 0);

                    if (cellToDelete.Count() > 0)

                        cellToDelete.First().Remove();

 

                    int intColumnToBeDel = (int)(from c in columnsXL.Descendants("col") where c.Attribute("name").Value == colName select c.NodesBeforeSelf().Count()).FirstOrDefault() + 1;

 

 

                    foreach (Cell cell in cells)

                    {

 

                        string col = GetColunName(cell.CellReference.Value);

                        int intColumnPres = (int)(from c in columnsXL.Descendants("col") where c.Attribute("name").Value == col select c.NodesBeforeSelf().Count()).FirstOrDefault() + 1;

 

                        if (intColumnPres > intColumnToBeDel)

                        {

 

                            string ColNamefromXML = ((System.Xml.Linq.XElement)(from c in columnsXL.Descendants("col")

                                                                                where c.Attribute("name").Value + index == cell.CellReference.Value

                                                                                select c.PreviousNode).FirstOrDefault()).FirstAttribute.Value;

                            cell.CellReference.Value = ColNamefromXML + index;

 

                        }

                    }

                }

 

                worksheetPart.Worksheet.Save();

                return "Deleted";

 

            }

        }

 

 

 

        public static bool isNumeric(string val, System.Globalization.NumberStyles NumberStyle)

        {

            Double result;

            return Double.TryParse(val, NumberStyle,

                System.Globalization.CultureInfo.CurrentCulture, out result);

        }

 

        private static string GetColunName(string p)

        {

            Boolean breakLoop = true;

            while (breakLoop)

            {

                if (isNumeric(p.Last().ToString(), System.Globalization.NumberStyles.Integer))

                    p = p.Substring(0, p.Length - 1);

                else

                    breakLoop = false;

            }

            return p;

        }

    }

}

 

Please note that ExcelColumn.xml is just an XML file which contains all the column name. I find it an easy way to get the column name. You can download the complete project and XML file from http://cid-956eb159f7975780.skydrive.live.com/redir.aspx?resid=956EB159F7975780!197

Feel free to optimize it according to your need and let me know if you found any problem with this one!!!

Advertisements

Posted in Open XML SDK, Spreadsheet | Tagged: , , , | 3 Comments »

libOPC version 0.0.1 is released

Posted by Ankush on April 25, 2011

Source: http://blogs.msdn.com/b/dmahugh/archive/2011/04/19/libopc-version-0-0-1-released.aspx

Text taken from above article:

The first release of libOPC, a new API for Open XML development, was published on Codeplex last week. This API is the first open-source cross-platform API for developers working with Open Packaging Convention (OPC) packages as used by Open XML, XPS, and other formats. Full source code is available, and it’s written in portable C99, so can be used on all popular variants of Linux/Unix, Mac OS, Windows, Android, and many other platforms. The API uses other common cross-platform open-source APIs for some of the low-level details, including ZLIB for opening ZIP-compressed packages and libXMLfor parsing the XML streams from the parts in the package.

Historically, there have been two popular .NET APIs for Open XML development: System.IO.Packaging (which first appeared in .NET 3.0) and the Open XML SDK, released in early 2007. There’s also a COM-based native packaging API available for non-.NET Windows developers.

The libOPC API is roughly analogous to System.IO.Packaging, in that it’s focused on the details of OPC and MCE (parts 2 and 3 of IS 29500), but doesn’t provide higher-level abstractions for WordprocessingML, SpreadsheetML or PresentationML (as covered in parts 1 and 4 of IS 29500).  I say “roughly” because libOPC doesn’t yet address some of the things that System.IO.Packaging handles (e.g., digital signatures) but does include some more advanced capabilities not available in System.IO.Packaging, such as the opc_generate functionality described below, which is essentially the same as the document reflector functionality of the Open XML SDK.

The key new feature in libOPC is its cross-platform capabilities. If you’re working on a non-Microsoft platform, or working with embedded systems that have limited OS support for XML and ZIP, you now have a very fast, simple API that you can use to implement Open XML read and write capabilities in your applications. And libOPC is designed from the ground up to be wrapper-friendly, for use from programming languages other than C.

For full read, click on the above link!!

Posted in Open XML SDK | Tagged: , , | Leave a Comment »

How to Insert a Hyperlink in Excel using Open XML SDK 2.0

Posted by Ankush on March 4, 2011

Consider a scenario, you want to insert a hyperlink in Excel 2010 (or Excel 2007) file programmatically using a web application. You should not use Office Object Model in a web application, Please refer http://support.microsoft.com/kb/257757. So what we can do??

As we know that Office 2007 and 2010 file format is XML based (just rename any Office 2007/2010 document to .zip and you can see a lots of XML files which makes a document and they are simply zipped in a one file). To manipulate these files, MS has introduced a library, Open XML SDK 2.0. The SDK is built on the System.IO.Packaging API and provides strongly-typed classes to manipulate documents that adhere to the Office Open XML File Formats Specification. The Office Open XML File Formats specification is an open, international, ECMA-376, Second Edition and ISO/IEC 29500 standard. The Open XML file formats are useful for developers because they are an open standard and are based on well-known technologies: ZIP and XML.

The Open XML SDK 2.0 simplifies the task of manipulating Open XML packages and the underlying Open XML schema elements within a package. The Open XML SDK 2.0 encapsulates many common tasks that developers perform on Open XML packages, so that you can perform complex operations with just a few lines of code. Some benefits if you use Open XML SDK:

Open Format: These formats use XML and ZIP, and they are fully documented. Anyone can get the full specs on the formats and it is a royalty free license for anyone that wants to work with the files.

Compressed: Files saved in these new XML formats are less than 50% the size of the equivalent file saved in the binary formats. This is because we take all of the XML parts that make up any given file, and then we ZIP them. We chose ZIP because it’s already widely in use today and we wanted these files to be easy to work with.

Robust: Between the usage of XML, ZIP, and good documentation the files get a lot more robust. By compartmentalizing our files into multiple parts within the ZIP, it becomes a lot less likely that an entire file will be corrupted (instead of just individual parts). The files are also a lot easier to work with, so it’s less likely that people working on the files outside of Office will cause corruptions.

Backward compatible: You don’t have to use the new version of Office to take advantage of these formats. You can use Office Compatibility pack to open the Office 2007 document in older version.

Binary Format support: You can still use the current binary formats with the new version of Office. In fact, people can easily change to use the binary formats as the default if that’s what they’d rather do.

New Extensions: The new formats has new extensions (.docx, .pptx, .xlsx) so you can tell what format the files you are dealing with are, but to the average end user they’ll still just behave like any other Office file. Double click & it opens in the right application.

Is also important to know that what SDK CAN’T do:

The Open XML SDK 2.0:

  • Does not replace the Microsoft Office Object Model and provides no abstraction on top of the file formats. You must still understand the structure of the file formats to use the Open XML SDK 2.0.
  • Does not provide functionality to convert Open XML formats to and from other formats, such as HTML or XPS.
  • Does not guarantee document validity of Open XML Formats when you use the Open XML SDK 2.0 or if you decide to manipulate the underlying XML directly.
  • Does not provide application behavior such as layout functionality in Word or recalculation, data refresh, or adjustment functionalities in Excel.

So lets have a look at the solution. I will try to build the solution from the scratch so that you can learn how you can use Open XML tools and leverage this information in other exercise as well.

So lets start!!

Step 1: Download the Open XML SDK and the tool from http://www.microsoft.com/downloads/en/details.aspx?FamilyId=C6E744E5-36E9-45F5-8D8C-331DF206E0D0&displaylang=en. Please note that there are 2 links, one is for the tool and the second one is the actual library.

Step 2: So the objective is to insert a link into a cell. For the simplicity, we will assume that we want to insert the link into cell A1 of Sheet1. Lets create an Excel 2010 workbook and place a hyperlink in cell A1 manually. Once done, Excel file will look like this

clip_image001

Save this file and close it.

Step 3: If you haven’t changed the default installation location of the tool, they will appear here: C:\Program Files (x86)\Open XML SDK\V2.0\tool. Go ahead and run OpenXmlSdkTool.exe from here.

Step 4: You can simply open Office file in this tool and look at the different XML parts. (To understand different XML parts, have a look:https://ankushbhatia.wordpress.com/2010/03/04/basics-of-spreadsheet/). So go ahead and open the file in this tool. Once done, it will look like this:

clip_image002

Step 5: We know that we inserted the hyperlink in cell A1 of Sheet1 so go ahead and browse the Sheet1.xml file. Once you select the Sheet1, it gives you different child elements of it. Please select hyperlinks and click on “reflect Code” to view the XML. Make sure to follow the correct tree structure to get to Sheet1.xml and hyperlink object. After this, it will appear as:

clip_image003

Pay attention to the right side. There are 2 windows. Upper window shows XML of the section you have selected in the “Document Explorer” and the bottom window shows the C# code to generate the XML..Kind of cool….isn’t?? So now what we have..? We have XML which is required to generate a hyperlink and the equivalent C# code. Lets have a closure look at the XML

<x:hyperlinks xmlns:x=”http://schemas.openxmlformats.org/spreadsheetml/2006/main”&gt;

<x:hyperlink ref=”A1″ r:id=”rId1″ xmlns:r=”http://schemas.openxmlformats.org/officeDocument/2006/relationships&#8221; />

</x:hyperlinks>

There is no information about the hyperlink text. If you look at the relationship ID “A1″, it appears as

<?xml version=”1.0″ encoding=”UTF-8″ standalone=”true”?>

<Relationships xmlns=”http://schemas.openxmlformats.org/package/2006/relationships“><Relationship TargetMode=”External” Target=”http://www.microsoft.com/” Type=”http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink” Id=”rId1″/></Relationships>

So where is the actual text (remember I entered Microsoft)?

As you know that Excel keeps all the text strings in a separate XML file, SharedSTring.xml and refer it back based on the index. So lets see how the data appear in cell A1

Have a look at the SheetData [where it keeps all the data . It is specific to a sheet] in the tool

clip_image004

If you look at the XML, it contains the information about the row which contains the data.

<x:sheetData xmlns:x=”http://schemas.openxmlformats.org/spreadsheetml/2006/main”&gt;

<x:row r=”1″ spans=”1:1″ x14ac:dyDescent=”0.25″ xmlns:x14ac=”http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac”&gt;

<x:c r=”A1″ s=”1″ t=”s”>

<x:v>0</x:v>

</x:c>

</x:row>

</x:sheetData>

Lets try to decode it.

<x:row r=”1″ spans=”1:1″ x14ac:dyDescent=”0.25″ xmlns:x14ac=”http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac”&gt; :

r=1 means there is one row.

There is one child element <x:c r=”A1″ s=”1″ t=”s”>.

X:c -> Column
r=A1 -> Referenced cell is “A1″

S=1 and t=”S” it’s a string value and value is referenced from Shared String table. And S=1 means that a style is applied and the defination of the style is taken from index 1 which can be found from Style.xml

Now analyze the SharedSTring.XML using the same tool

clip_image005

Here you will see the Microsoft string. It is referred back in SheetData using x:v node. The starting index is 0.

So to summarize, we need to do:

  1. Create a hyperlink object and insert it before the PageMargin node. Please note that Excel is enforcing a strict ordering of the elements. So make sure to add it before PageMargines node.
  2. Create an entry in the SharedString Table
  3. Create an entry in the SheetData section

So why we did this exercise. Every now and then we face situation where we want to these kind of things in Office using Open XML. So easy thing is, do the stuff in UI and analyze the XML and C# code and then write your own.

Lets move on and utilize this information.

Step 6: Create a new Excel file and save it. We will insert a hyperlink in this file. Create a new VS 2010 project and open the new blank file.

Step 7: Add a reference to DocumentFormat.OpenXML and WindowsBase.dll from .NET tab

Step 8: Add a namespace using XL = DocumentFormat.OpenXml.Spreadsheet;

Step 9: So basically you need to

a. Open the workbook
b. Get to the sheet where you want to add the hyperlink
c. Add the hyperlink
d. Add an entry in SheetData
e. Add an entry in SharedString.xml

The below code does exactly that.

using (SpreadsheetDocument document = SpreadsheetDocument.Open(@"C:\Users\abhatia\Desktop\Sample1.xlsx", true))

           {

               IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>().Where(s => s.Name == "Sheet1");

               if (sheets.Count() == 0)

               {

                   // The specified worksheet does not exist.

                   return;

               }

               string relationshipId = sheets.First().Id.Value;

               WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(relationshipId);


               Hyperlinks hyperlinks1 = new Hyperlinks();

               Hyperlink hyperlink1 = new Hyperlink() { Reference = "A1", Id = "UNIQUE" };

               hyperlinks1.Append(hyperlink1);


               PageMargins pageMargins = worksheetPart.Worksheet.Descendants<PageMargins>().First();

               worksheetPart.Worksheet.InsertBefore<Hyperlinks>(hyperlinks1, pageMargins);

               worksheetPart.Worksheet.Save();

               worksheetPart.AddHyperlinkRelationship(new System.Uri("http://www.google.com/", System.UriKind.Absolute), true, "UNIQUE");


               // Create an element in SheetData


               // Get the SharedStringTablePart. If it does not exist, create a new one.

               SharedStringTablePart shareStringPart;

               if (document.WorkbookPart.GetPartsOfType<SharedStringTablePart>().Count() > 0)

               {

                   shareStringPart = document.WorkbookPart.GetPartsOfType<SharedStringTablePart>().First();

               }

               else

               {

                   shareStringPart = document.WorkbookPart.AddNewPart<SharedStringTablePart>();

               }



               // Insert the text into the SharedStringTablePart.

               int index = InsertSharedStringItem("www.google.com", shareStringPart);


               // Insert cell A1 into the new worksheet.

               Cell cell = InsertCellInWorksheet("A", 1, worksheetPart);


               // Set the value of cell A1.

               cell.CellValue = new CellValue(index.ToString());

               cell.DataType = new EnumValue<CellValues>(CellValues.SharedString);


               // Save the new worksheet.

               worksheetPart.Worksheet.Save();



           }

 

Here are the helper functions (taken from MS site):

// Given text and a SharedStringTablePart, creates a SharedStringItem with the specified text 

// and inserts it into the SharedStringTablePart. If the item already exists, returns its index.

private static int InsertSharedStringItem(string text, SharedStringTablePart shareStringPart)

{

    // If the part does not contain a SharedStringTable, create one.

    if (shareStringPart.SharedStringTable == null)

    {

        shareStringPart.SharedStringTable = new SharedStringTable();

    }


    int i = 0;


    // Iterate through all the items in the SharedStringTable. If the text already exists, return its index.

    foreach (SharedStringItem item in shareStringPart.SharedStringTable.Elements<SharedStringItem>())

    {

        if (item.InnerText == text)

        {

            return i;

        }


        i++;

    }


    // The text does not exist in the part. Create the SharedStringItem and return its index.

    shareStringPart.SharedStringTable.AppendChild(new SharedStringItem(new DocumentFormat.OpenXml.Spreadsheet.Text(text)));

    shareStringPart.SharedStringTable.Save();


    return i;

}

 

// Given a column name, a row index, and a WorksheetPart, inserts a cell into the worksheet. 

       // If the cell already exists, returns it. 

       private static Cell InsertCellInWorksheet(string columnName, uint rowIndex, WorksheetPart worksheetPart)

       {

           Worksheet worksheet = worksheetPart.Worksheet;

           SheetData sheetData = worksheet.GetFirstChild<SheetData>();

           string cellReference = columnName + rowIndex;


           // If the worksheet does not contain a row with the specified row index, insert one.

           Row row;

           if (sheetData.Elements<Row>().Where(r => r.RowIndex == rowIndex).Count() != 0)

           {

               row = sheetData.Elements<Row>().Where(r => r.RowIndex == rowIndex).First();

           }

           else

           {

               row = new Row() { RowIndex = rowIndex };

               sheetData.Append(row);

           }


           // If there is not a cell with the specified column name, insert one.  

           if (row.Elements<Cell>().Where(c => c.CellReference.Value == columnName + rowIndex).Count() > 0)

           {

               return row.Elements<Cell>().Where(c => c.CellReference.Value == cellReference).First();

           }

           else

           {

               // Cells must be in sequential order according to CellReference. Determine where to insert the new cell.

               Cell refCell = null;

               foreach (Cell cell in row.Elements<Cell>())

               {

                   if (string.Compare(cell.CellReference.Value, cellReference, true) > 0)

                   {

                       refCell = cell;

                       break;

                   }

               }


               Cell newCell = new Cell() { CellReference = cellReference };

               row.InsertBefore(newCell, refCell);


               worksheet.Save();

               return newCell;

           }

       }

 

Assumptions:

I have made several assumptions while writing code for this.

  1. I have assumed that Excel file is a new blank file i.e. there is no hyperlink present.
  2. Sheet1 name is hardcoded.
  3. File Name is Hardcoded.
  4. Can you think of anything by looking at the code? 🙂

Run this code and everything will work. When you will open the Excel sheet and click on the link, it will be treated as hyperlink.

But still one thing is missing. It is not underlined and appear in the blue color. Because you need to apply a style to do that. I leave this exercise to my readers to check how the styles are applied and how they can use this information. In case of anything, let me know 🙂

Please try this approch in every situation where you use Open XML and let me know in case of any questions.

Posted in Open XML SDK, Spreadsheet | Tagged: , , , | 9 Comments »

Open XML SDK Learning Resources

Posted by Ankush on February 24, 2011

Open XML SDK 2.0 FAQ
http://social.msdn.microsoft.com/Forums/en-US/oxmlsdk/thread/19032efe-80af-4a16-b954-1f5a6d2f2236

Building Office Open XML Files
http://msdn.microsoft.com/en-us/magazine/cc163478.aspx

Introducing the Office (2007) Open XML File Formats
http://msdn2.microsoft.com/en-us/library/ms406049.aspx

How to: Manipulate Office Open XML Formats Documents
http://msdn2.microsoft.com/en-us/library/aa982683.aspx#Office2007ManipulatingXMLDocs_creating

Building Server-Side Document Generation Solutions Using the Open XML Object Model
http://msdn.microsoft.com/en-us/library/bb735940.aspx (Part 1 of 2)
http://msdn.microsoft.com/en-us/library/bb735939.aspx (Part 2 of 2)

How to use the Office XML file format and the packaging components from the .NET Framework 3.0 to create a simple Excel 2007 workbook or a simple Word 2007 document
http://support.microsoft.com/?id=931866

How Do I…Open XML SDK v2.0
http://msdn.microsoft.com/en-us/library/bb491088(office.14).aspx

Nanipulating Excel 2007 and PowerPoint 2007 Files with the Open XML Object Model (Part 1 of 2)
http://msdn.microsoft.com/en-us/library/bb739834.aspx

Manipulating Excel 2007 and PowerPoint 2007 Files with the Open XML Object Model (Part 2 of 2)
http://msdn.microsoft.com/en-us/library/bb727373.aspx

Reading Data from SpreadsheetML
http://blogs.msdn.com/brian_jones/archive/2008/11/10/reading-data-from-spreadsheetml.aspx

How to use components of the .NET Framework 3.0 to create and then to stream an Office Word 2007 document and an Office Excel 2007 workbook to a client computer
http://support.microsoft.com/kb/932921

Welcome to the Open XML Developer Workshop
http://msdn.microsoft.com/en-us/office/bb738430.aspx

Open XML Explained e-book
http://openxmldeveloper.org/articles/1970.aspx

Good Site to Search ECMA standard online :
http://www.documentinteropinitiative.org/default.aspx

Open XML SDK Documentation
http://msdn.microsoft.com/en-us/office/ee441239.aspx

Open XML Format SDK 2.0: Getting Started Best Practices
http://blogs.msdn.com/erikaehrli/archive/2009/05/14/open-xml-format-sdk-2-0-getting-started-best-practices.aspx

Office Open XML Formats: Retrieving Lists of Excel 2007 Worksheets
http://msdn.microsoft.com/en-us/library/bb332456.aspx

Manipulating Excel 2007 and PowerPoint 2007 Files with the Open XML Format API (Part 1 of 2)
http://msdn.microsoft.com/en-us/library/bb739834.aspx

Manipulating Excel 2007 and PowerPoint 2007 Files with the Open XML Format API (Part 2 of 2)
http://msdn.microsoft.com/en-us/library/bb727373.aspx

2007 Office System Sample: Open XML Format SDK 2.0 Code Snippets for Visual Studio 2008
http://www.microsoft.com/downloads/details.aspx?FamilyID=78bea298-a3f9-44cf-bde0-b4f30dc986df&displaylang=en

Writing Large Excel Files with the Open XML SDK
http://blogs.msdn.com/b/brian_jones/archive/2010/06/22/writing-large-excel-files-with-the-open-xml-sdk.aspx

Parsing and Reading Large Excel Files with the Open XML SDK
http://blogs.msdn.com/b/brian_jones/archive/2010/05/27/parsing-and-reading-large-excel-files-with-the-open-xml-sdk.aspx

How to efficiently generate OpenXML Workbook from Excel files containing very large amount of data
http://blogs.msdn.com/b/vsod/archive/2010/05/31/how-to-efficiently-generate-openxml-workbook-from-excel-files-containing-very-large-amount-of-data.aspx

Very very very good blogs to through..Its must if you would like to know more about formats:

Where is the documentation for Office’s docx/xlsx/pptx formats? Part 1: Office 2007
http://blogs.msdn.com/b/chrisrae/archive/2010/09/25/where-is-the-documentation-for-office-2010-s-docx-xlsx-pptx-formats.aspx

Where is the documentation for Office’s docx/xlsx/pptx formats? Part 2: Office 2010
http://blogs.msdn.com/b/chrisrae/archive/2010/10/06/where-is-the-documentation-for-office-s-docx-xlsx-pptx-formats-part-2-office-2010.aspx

I’ll keep on updating this list and will try to categorize it as it becomes bigger 🙂

Posted in Open XML SDK | Tagged: , , , , | 4 Comments »

Custom UI part In Word and Open XML SDK 2.0

Posted by Ankush on February 24, 2011

Consider a scienrio you have a Word 2010 document in which you aretrying to custmization by adding custom UI part and you want to open this document in Open XML SDK, it WILL NOT WORK.

Why….?

Lets try this:

Create a new Word 2010 document and add a custom UI part [ You can use custom UI editor tool to do this]. Use the following XML to do the ribbon custmization.

<?xml version=”1.0″ encoding=”UTF-8″?>
<customUI xmlns=”http://schemas.microsoft.com/office/2009/07/customui“>
<ribbon>
<tabs>
<tab idMso=”TabDeveloper” visible=”false” />
<tab idMso=”TabReferences” visible=”false” />
</tabs>
</ribbon>
</customUI>

Now if you open this document using Open XML SDK 2.0, you will get an error message.

WordprocessingDocument doc = WordprocessingDocument.Open(“<Path of the document>”, false);

Exception Message : “Specified argument was out of the range of valid values.Parameter name: relationshipType”

Call Stack:

DocumentFormat.OpenXml.Packaging.WordprocessingDocument.CreatePartCore(String relationshipType)
DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.CreateOpenXmlPart(String relationshipType)DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.LoadReferencedPartsAndRelationships(OpenXmlPackage openXmlPackage, OpenXmlPart sourcePart, ackageRelationshipCollection
relationshipCollection, Dictionary`2 loadedParts)
DocumentFormat.OpenXml.Packaging.OpenXmlPackage.Load()
DocumentFormat.OpenXml.Packaging.OpenXmlPackage.OpenCore(Stream stream, Boolean readWriteMode)
DocumentFormat.OpenXml.Packaging.WordprocessingDocument.Open(Stream stream, Boolean isEditable, OpenSettings openSettings)DocumentFormat.OpenXml.Packaging.WordprocessingDocument.Open(Stream stream, Boolean isEditable)

Lets get into more detail.

If you look into the callstack, it is failing in CreatePartCore function. So I used .net reflector tool and anlyzed the Open XML SDK assembly i.e. defination of CreatePartCore. This is what I see:

The relationship namespace for Office 2010 is “http://schemas.microsoft.com/office/2007/relationships/ui/extensibility” and if you look at the image, it is not included. In result, you get an exception.

This is all about the problem.

Solution: The only solution which will work her to create an addin and then do the ribbon customization.

Posted in Open XML SDK, Word | Tagged: , , , | 1 Comment »

Content Controls- A Complete Summary

Posted by Ankush on October 22, 2010

While generating a document from a database, it is very important that how you have designed the document. If you have a very good template then the insertion of the data becomes very easy. Word provides a very nice feature “Content Controls” to make this task easy. This blog post provides a collection of links which you can use .

So what is a Content Control. Here I what MSDN doc says:

Content controls are bounded and potentially labeled regions in a document that serve as containers for specific types of content. Individual content controls can contain content such as dates, lists, or paragraphs of formatted text. In some cases, content controls might remind you of forms. However, they are much more powerful, flexible, and useful because they enable you to create rich, structured blocks of content. Content controls also build on the custom XML support introduced in Microsoft Office Word 2003. Content controls enable you to author templates that insert well-defined blocks into your documents. Content controls enable you to:

• Specify structured regions in a template. Each structured region has its own unique ID so that you can read from and write to it. Examples of types of structured regions (or content controls) are combo boxes, pictures, text blocks, and calendars.

• Determine the behavior of content controls. Each content control takes up a portion of a document and, as the template author, you can specify what each region does. For example, if you want a region of your template to be a calendar, you insert a calendar content control in that area of the document, which automatically determines what that block of content does. Similarly, if you want a section of a template to display an image, create a picture content control in that area. In this way, you can build a template with predefined block types.

• Restrict the content of content controls. Each content control can be restricted, so that it cannot be deleted or edited. This is useful if, for example, you have copyright information in a template that the user should be able to read but not edit. You can also lock a template’s content so that a user does not accidentally delete portions of it. This makes templates more robust than in previous versions.

• Map the contents of a content control to data in a custom XML part that is stored with the document. For example, if you insert a document parts content control that contains a table of stock prices, you can map the table cells to nodes in an XML file that contain the current stock prices. When the prices change, an add-in can programmatically update the attached XML file, which is bound to each cell, and the new, updated prices automatically appear in the table.

The easiest way to create a content control is through the user interface (although you can also create them programmatically). To create a content control through the user interface (UI), select the text that you want to turn into a content control and then choose the content control type you want from the content controls section of the Developer ribbon. This creates a content control around the selected text.

Content Controls are a powerful features when it comes to feeding the data into document in an ASP.NET app. If you have created a document template and content control’s mapping is set then you can just replace one XML file and Word will pick the new data.

So here is the list.

Goal: Get the data from SQL Server and feed the Word document. The application is a Web application.

Solution: Create a Word template with content controls.  Bound them with an XML file. Use Open XML to manipulate the document and change the content of the XML file. The advantage of using content control is, they can be easily bound  to an XML file.

• Create a new Word document, drag few content controls onto the document surface. Create an XML file and create a node for every content control

Data-driven document generation with Word 2007 and the Office XML File Formats: Part 1
http://blogs.msdn.com/erikaehrli/archive/2006/08/11/word2007DataDocumentGenerationPart1.aspx

• There are 2 ways by which you do the mapping [ Nodes <-> Content Control]. Using Code, which is already  explained in the above article. If you would like to use an application, there is sample application available which can help you.

Word 2007 Content Control Toolkit
http://www.codeplex.com/wikipage?ProjectName=dbeOnce this is done, the sample XML file which you created, will be a part of the  document package, so you just need to change the content of the XML nodes and when you will open the document, it will reflect the dynamic data.

• Now you need to use Open XML, get the data from SQL and replace the XML contents it in the document. Following article  explains this[It has the sample code as well]

Data-driven document generation with Word 2007 and the Office XML File Formats: Part 2
http://blogs.msdn.com/erikaehrli/archive/2006/08/11/word2007DataDocumentGenerationPart2.aspxCreting documents by using Open XML Format SDK 2.0 (Part 3 of 3)
http://msdn.microsoft.com/en-us/library/dd469465.aspx#CreateDocswithOpenXMLSDK_TakingAdvantageofBoundContentControlsIf you would like to do this through Open XML SDK 2.0, then following article explains about this.

Creating Data-Bound Content Controls using the Open XML SDK and LINQ to XML
http://blogs.msdn.com/ericwhite/archive/2008/10/19/creating-data-bound-content-controls-using-the-open-xml-sdk-and-linq-to-xml.aspxTaking Advantage of Bound Content Controls
http://blogs.msdn.com/brian_jones/archive/2009/01/05/taking-advantage-of-bound-content-controls.aspx
So basically once the template is created, you can use Open XML SDK and do something like. Here I simply get the xml, delete the old XML file and add the new one.

========
string templatePath = Server.MapPath(“./Templates/ModelTemplate.docx”);
string custXMLFilePath = Server.MapPath(“./CustomXML/Model.xml”);
using (wdDoc = OpenXML.WordprocessingDocument.Open(templatePath, true))
{
OpenXML.MainDocumentPart mainPart = wdDoc.MainDocumentPart;
mainPart.DeleteParts<OpenXML.CustomXmlPart>(mainPart.CustomXmlParts);
XmlDataDocument custXML = new XmlDataDocument();
custXML.Load(custXMLFilePath);
OpenXML.CustomXmlPart custXMLPart = mainPart.AddNewPart<OpenXML.CustomXmlPart>();
using (Stream s = custXMLPart.GetStream(FileMode.Create))
{
custXML.Save(s);
}
}
=========
Inserting Repeating Data Items into a Word 2007 Table by Using the Open XML API
http://msdn.microsoft.com/en-us/library/cc197932.aspxCreating Valid Open XML Documents by Using the Validation Tools in the Open XML Format SDK
http://msdn.microsoft.com/en-us/library/dd633612.aspx

Let me know what you think about this::

Ankush

Posted in Open XML SDK, Word | Tagged: , , , | Leave a Comment »

AltChunk – An Interesting Scenario

Posted by Ankush on October 22, 2010

Merging multiple word documents into a single document is a very common requirement.  This  becomes a bit complex when you are trying to do this in a Web application which should not automates Office (http://support.microsoft.com/kb/257757). As an alternative, Microsoft has introduced Open XML SDK which you can use to read/write the data into Office application.  altChunk, which is a special feature of Open XML word processing markup that enables you to embed an entire Open XML document or an html page at a specific location in a document.

So basically there are 3 ways to merge the documents in a non-interactive environment:

1. Use AltChunk
2. Manually merge the document
3. Use Power Tools

By far the first option of using altChunks is the easiest method for merging multiple documents together. Not only can altChunks import other WordprocessingML documents, but it can also import html, xml, rtf, or plain text. Manually merging multiple documents together is feasible, but requires you to handle a number of issues. For example, you will need to manually merge and deal with conflicts related to styles, bullets and numbering, comments, headers and footers, etc.

Now lets consider a scenario and the solution:

Scenario:

Consider a scenario where I have merge multiple documents into one. These documents are coming from different sources and after merging, it should be sent to the management. The application should be a web application and no temp file should be generated.

Solution:

My solution contains

1. Test.docx the source document. They can be multiple which are supposed to be merged. In my example I am assuming its one
2. Test1.docx. The merged document which contains the content control [kind of bookmark] so that source documents can be inserted easily. I will create a separate blog as how you work with content control

Here is the complete code. Remarks are added to explain the code and the flow.

// Function which Opens the Document and Modify it

private static void OpenAndModifyDocument()
{

//  Open the document using Stream and bytes
byte[] byteArray = File.ReadAllBytes(“Test.docx”);

using (MemoryStream mem = new MemoryStream())
{
mem.Write(byteArray, 0, (int)byteArray.Length);
// Modify the document. For ex:: I am inserting a Paragrapgh
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(mem, true))
{
wordDoc.MainDocumentPart.Document.Body.InsertAt(
new Paragraph(
new Run(
new Text(“Newly inserted paragraph.”))), 0);
}
SaveDocument(mem);

}
}
// Function to do Merging with the Stream

Private static void SaveDocument(MemoryStream ms)
{
// Test1.docx is the templates which contains different placeholders (basically a content control) where I am going to the different document streams
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(“Test1.docx”, true))
{
MainDocumentPart mainPart = myDoc.MainDocumentPart;
// Make sure to have a unique AltChunk Id
string altChunkId = “AltChunkId” + 30;
// Create an AltChunk element
AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML, altChunkId);
// Here we can also use FeedData method to pass the document stream into AltChunk but that requires a physical file . Hence I am using Streams
Stream chunkStream = chunk.GetStream(FileMode.Create, FileAccess.Write);
StreamWriter stringStream = new StreamWriter(chunkStream);
ms.WriteTo(chunkStream);
AltChunk altChunk = new AltChunk();
altChunk.Id = altChunkId;
// Find the Content Controls. I am assuming that there is only one content control but if have multiples then you just modify this query
SdtBlock sdt = mainPart.Document.Descendants<SdtBlock>().First();
OpenXmlElement parent = sdt.Parent;
// Insert the AltChunk element and remove the content control
parent.InsertAfter(altChunk, sdt);
sdt.Remove();
mainPart.Document.Save();

}

}

Please note that until a document that contains altChunk elements is opened and saved in Office, it still contains the altChunk parts, and not normal WordprocessingML markup of paragraphs, runs, and text elements.  The solution with SharePoint 2010 is that you can use Word Automation Services to update the documents that contain altChunk elements.  After Word Automation Services processes it, the document will contain paragraphs, runs, and text elements.

Let me know about your thoughts!!!

Posted in Open XML SDK, Word | Tagged: , , | 14 Comments »

How to Copy a Chart and a Slide in an Existing Presentation

Posted by Ankush on September 13, 2010

Recently I worked on a case where Customer wanted to copy a slide and a chart in an existing presentation using Open XML SDK.  In my earlier blogs, I also received  requests to show how to import chart with data, so this blog will cover this aspect as well using Open XML SDK 2.0

 Here are the basic steps:

  • Get a Slide which will work as a template. This slide also contains the template chart  which we want to copy
  • Copy the slide and insert the chart and the associated data

 Seems easy..isn’t ..so lets get into the code and see the implementation…If you wish to download the application, you can get it from here

Step 1:  Declare the following namespaces at the top of your code file

using DocumentFormat.OpenXml.Presentation;
using DocumentFormat.OpenXml.Packaging;
using Drawing = DocumentFormat.OpenXml.Drawing;
using DocumentFormat.OpenXml.Drawing;
using System.IO;

Step 2: Open the Presentation

PresentationDocument presentationDocument = PresentationDocument.Open(@”Test.pptx”, true)

Step 3: Given the SlideIndex (The slide Index which we want to copy), get the SlidePart

  if (presentationDocument == null)
            {
                throw new ArgumentNullException(“presentationDocument”);
            }

            // Use the CountSlides sample to get the number of slides in the presentation.
            int slidesCount = CountSlides(presentationDocument);

            if (slideIndex < 0 || slideIndex >= slidesCount)
            {
                throw new ArgumentOutOfRangeException(“slideIndex”);
            }

            // Get the presentation part from the presentation document.
            PresentationPart presentationPart = presentationDocument.PresentationPart;

            // Get the presentation from the presentation part.
            Presentation presentation = presentationPart.Presentation;

            // Get the list of slide IDs in the presentation.
            SlideIdList slideIdList = presentation.SlideIdList;

            // Get the slide ID of the specified slide
            SlideId slideId = slideIdList.ChildElements[slideIndex] as SlideId;

            // Get the relationship Id of the slide.
            string slideRelId = slideId.RelationshipId;

            SlidePart slidePartSource = (SlidePart)presentationPart.GetPartById(slideRelId);

Step 4: Get the Chart from the SlidePartSource

 string slideChartId = slidePartSource.Slide.CommonSlideData.ShapeTree.Descendants<DocumentFormat.OpenXml.Drawing.Charts.ChartReference>().First().Id;

Step 5: Add a new blank Slide

 SlidePart slidePartDes = presentationPart.AddNewPart<SlidePart>();

Step 6: As this point, slidePartDes contains nothing. So fill this with the data from   slidePartSource and attach a slideLayout

            slidePartDes.FeedData(slidePartSource.GetStream());
            slidePartDes.AddPart(slidePartSource.SlideLayoutPart);

Step 7: Insert the Chart and the associated Data

            ChartPart chartPart1 = slidePartDes.AddNewPart<ChartPart>(slideChartId);
            string chartRelId= slidePartDes.GetIdOfPart(chartPart1);

chartPart1.FeedData(slidePartSource.ChartParts.First().GetStream());
EmbeddedPackagePart embeddedPackagePart1 = chartPart1.AddNewPart<EmbeddedPackagePart>(“application/vnd.openxmlformats-officedocument.spreadsheetml.sheet”, “rId1”);

embeddedPackagePart1.FeedData(slidePartSource.ChartParts.First().EmbeddedPackagePart.GetStream());

Step 8:  As we know that Presentation.xml contains the slide list so we need to create an entry for the new slide here

 SlideId slideIdLast = slideIdList.ChildElements[slidesCount-1] as SlideId;   

            // Insert the new slide into the slide list .

            SlideId newSlideId = slideIdList.InsertAfter(new SlideId(), slideIdLast);
            newSlideId.Id = slideIdLast.Id + 1;
            newSlideId.RelationshipId = presentationPart.GetIdOfPart(slidePartDes);

Step 9: Save and close everything

 presentationPart.Presentation.Save();
presentationDocument.Close();

Helper function:

  public static int CountSlides(PresentationDocument presentationDocument)
        {

            // Check for a null document object.
            if (presentationDocument == null)
            {
                throw new ArgumentNullException(“presentationDocument”);
            }

            int slidesCount = 0;
            // Get the presentation part of document.

            PresentationPart presentationPart = presentationDocument.PresentationPart;
            if (presentationPart != null &&  presentationPart.Presentation != null)

            {

                // Get the Presentation object from the presentation part.

                Presentation presentation = presentationPart.Presentation;

                // Verify that the presentation contains slides.   

                if (presentation.SlideIdList != null)
                {

                    // Get the slide count from the slide ID list. 
                    slidesCount = presentation.SlideIdList.Elements<SlideId>().Count();
                }
            }

            // Return the slide count to the previous method.
            return slidesCount;
        }

Assumptions:

1) You have only one chart in the source slide. If you more then one, you can extract the specific chart by changing

 string slideChartId = slidePartSource.Slide.CommonSlideData.ShapeTree.Descendants<DocumentFormat.OpenXml.Drawing.Charts.ChartReference>().First().Id;

2) The position of the slide in the SlideIDList is last. If you wish to change the position, you need to change the code presented in the step 8.

Posted in Open XML SDK | Tagged: , | 2 Comments »

Ankush Bhatia’s Blog Map

Posted by Ankush on May 12, 2010

Its been very busy and I am not able to devote any time here. I have got lots of things in my plate/lots of blogable content but not able to share the information. Soon I will start writing the blog posts. This blog post is to give you a quick list of blogs which I have created on our Official blog site.

Blog Post 1:How to set the editing restrictions in Word using Open XML SDK 2.0

Recently I worked on a scenario where customer wanted to set the Document Editing Restrictions in Word 2007 using Open XML SDK 2.0. Document Editing Restrictions is nothing but Review | Protect Documents | Restrict Formatting and Editing option in MS Word 2007. It allows you to set different types of protection like  Readonly, Track changes, Comments and Filling in forms. Read more How to set the editing restrictions in Word using Open XML SDK 2.0

Blog Post 2:How to Sign the SignatureLine using Office Open XML

We can use PackageDigitalSignatureManager class available in System.IO.Packaging namespace to sign any part of an Open XML document. However, when we try to sign a document that has a signature line using this class, we will notice that though the document appears as signed, the signature line still appears as “needs to be signed”. In order to have the signature appear in the signature line, we will need to do some additional tasks which are mentioned in the resolution section below.  Read more How to Sign the SignatureLine using Office Open XML

Blog Post 3:How to merge two or more Word 2003 or Word 2007 XML documents into one Word 2007 XML document.

This blog explains how you can merge Word 2003/Word 2007 XMl documents into Word 2007 XML document using AltChunk

How to merge two or more Word 2003 or Word 2007 XML documents into one Word 2007 XML document.

Blog Post 4:Resizing the embedded OLE objects using .NET

 few days back, I’ve got a case where my customer wanted to resize the embedded OLE Object programmatically? In this case Excel was embedded in Visio, that embedded file added to the shapes collection in Visio but in reality its nothing but an OLE document and we want to resize the OLE document instead of Shape. Note that if you activate the Excel workbook, resize the OLE container window which contains in-place activated Excel ,close the activation mode- It will resize the OLE object but if you don’t activate the  Excel Object and try to resize it, it changes the shape property instead of the OLE Object. For more read Resizing the embedded OLE objects using .NET

Hopefully you will find them interesting!!1

In case if you would like me to create a blog on any Office programming issue, let me know @ ankush.bhatia@live.com

Posted in Open XML SDK | Tagged: , , , , , | 2 Comments »

Basics of Spreadsheet – How to Clear a Column – Part 3

Posted by Ankush on March 11, 2010

Now we understand the basics of Spreadsheet, lets discuss about a simple scenario. Consider this, I have an Excel sheet and I would like to clear a column. To understand what’s required to accomplish this, lets follow:

 Create a sample worksheet.

Extract the content and view Sheet1.xml / SharedString.XML 

Sheetdata( Sheet1.xml) SharedString.XML
<sheetData>
<row r=”1″ spans=”1:2″>
  <c r=”A1″ t=”s”>
        <v>0</v>
  </c>
  <c r=”B1″>
       <v>9</v>
  </c>
</row><row r=”2″ spans=”1:2″>
 <c r=”A2″ t=”s”>
   <v>1</v>
 </c>
 <c r=”B2″>
  <v>10</v>
 </c>
</row>

<row r=”3″ spans=”1:2″>
  <c r=”A3″>
   <v>1</v>
  </c>
 <c r=”B3″>
   <v>11</v>
 </c>
</row>

<row r=”4″ spans=”1:2″>
  <c r=”A4″>
   <v>2</v>
  </c>
 <c r=”B4″>
   <v>12</v>
 </c>
</row>

<row r=”5″ spans=”1:2″>
  <c r=”A5″>
   <v>3</v>
  </c>
 <c r=”B5″>
   <v>13</v>
 </c>
</row>

<row r=”6″ spans=”1:2″>
  <c r=”A6″>
   <v>4</v>
  </c>
 <c r=”B6″>
   <v>14</v>
 </c>
</row>

<row r=”7″ spans=”1:2″>
  <c r=”A7″>
   <v>5</v>
  </c>
 <c r=”B7″>
   <v>15</v>
 </c>
</row>
</sheetData>

<?xml version=”1.0″ encoding=”UTF-8″ standalone=”yes”?>
<sst xmlns=”http://schemas.openxmlformats.org/spreadsheetml/2006/main&#8221; count=”2″ uniqueCount=”2″>
  <si>
    <t>John</t>
  </si>
  <si>
    <t>Tom</t>
  </si>
</sst>

 

As I have explained in my previous blog that strings are referenced based on index in Sheet1.xml and the actual values are present in the sharedString XML that’s why you don’t see any string stored in SheetData. Now if you look  at the SheetData, you will notice that it is defined row by row basis. So basically If I want to clear a column, I have to delete an XML node for each row element. Let’s just examine one row entry in detail:

 <row r=”1″ spans=”1:2″>
      <c r=”A1″ t=”s”>
        <v>0</v>
      </c>
      <c r=”B1″>
        <v>9</v>
      </c>
    </row> 

If you notice that to refer a column, Excel uses A1 and B1 and every row is defined in that way. Let’s say, you want to clear column A, then you need to go through the every row, find c node (basically cell) which refers to A column and delete that node. Basically

Previous XML After XML
    <row r=”1″ spans=”1:2″>
      <c r=”A1″ t=”s”>
        <v>0</v>
      </c>
      <c r=”B1″>
        <v>9</v>
      </c>
    </row>
    <row r=”2″ spans=”1:2″>
      <c r=”A2″ t=”s”>
        <v>1</v>
      </c>
      <c r=”B2″>
        <v>10</v>
      </c>
    </row>
    <row r=”1″ spans=”1:2″>
      <c r=”A1″>
        <v>9</v>
      </c>
    </row>
    <row r=”2″ spans=”1:2″>
      <c r=”A2″>
        <v>10</v>
      </c>
    </row>

 

So now you understand the concept, let’s have a look at the sample code which clears a column A 

// Clear a column from XL

    string docName=””,  worksheetName=”Sheet1″;
    SpreadsheetDocument document = SpreadsheetDocument.Open(@”Book1.xlsx”, true);
    IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.Descendants<Sheet>().Where(s => s.Name == worksheetName);
    if (sheets.Count() == 0)
    {
        // The specified worksheet does not exist.
        return;
    }
    WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(sheets.First().Id);
    Worksheet worksheet = worksheetPart.Worksheet;
    SheetData sheetData = worksheet.GetFirstChild<SheetData>();
    var cells = sheetData.Descendants<Cell>().Where(s => s.CellReference.Value.Contains(“A”));
    foreach (var item in cells)
    {
        item.Remove();
    }
    worksheet.Save();

But are we done yet????..Still there are some strings left in SharedString which were referenced by column A. For the completeness you can delete them as well but even if you don’t delete them, Excel will do it for you. When you will open the modified sheet in Excel and save it, it will readjust the data for you!!!!!!

Let me know wht you think about this Post!!!!!!

Posted in Open XML SDK, Spreadsheet | 6 Comments »

 
%d bloggers like this: