Archive for the ‘Office Open XML’ Category

Alternate OOXML Document Generation Approach

March 31, 2011

Eric White has put out a document generation example that uses XPath and Word Content Controls.  I applaud Eric for the amount of work he has done exploring different ways to perform template-based generation.  This is a challenging subject, and we need as many ideas as we can get.  There are a few areas where I see room for improvement in this XPath design that I would like to bring up.

The first is that Eric has chosen to put his document generation logic in the document itself.  I see this as a maintenance and reusability issue.  Architecturally, I would prefer to keep my code external to the document so that I can write and maintain it centrally in a generic fashion and tie it to a rules engine.

Another place where I see this approach falling short is that it is good for simple text replacement, but as presented it doesn’t handle formatting, replacing images or working with charts.  This doesn’t mean that it couldn’t be extended to handle them, but I think it would lose the simplicity which looks to be its appeal.

Lastly, Content Controls are currently a Word-only feature.  It would be great if we could come up with a markup technique that was universal to all Office document types.  Hey, we can dream, right?

Personally, I prefer a more metadata-driven approach, based on my experience with solutions whose output was closer to marketing-material quality.  That being said, this approach is an interesting idea to add to the design arsenal.  Thanks for the thoughts, Eric.


Update Since Microsoft/PSC Office Open XML Case Study

December 16, 2010

In 2009 Microsoft released a case study about a project that we had done for Research Directors Inc. using the OOXML SDK 1.0.  Since that time Microsoft has released version 2.0 of the SDK and PSC has done significant development with it.  Below are some of the milestones we have reached since the original case study.

At the time of the original case study two report types had been automated to output as PowerPoint presentations.  Now that all the main products have been delivered, we have added three reports with Word document outputs and five more reports with PowerPoint outputs.

One improvement we made over the original application was to create a PowerPoint Add-In which allows users to tag a slide.  These tags, along with the strongly typed SDK 2.0, allow the code to use LINQ to easily search for slides in the template files.  This allows for a more flexible architecture based on assembling a presentation from copies of slides extracted from the template.

The new library we created also enabled us to create two new Word-based reports in two weeks.  The library abstracts the generation of the documents from the business logic and the data retrieval.  The key to this is the markup.  Content Controls are a good method for identifying sections of a template to be modified or replaced.  Combine this with the concept of all data being generically either scalar or two-dimensional and the code becomes more generic.
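To make the markup idea concrete, here is a minimal sketch of filling scalar values into Content Controls by tag.  It is not the library described above; the class name, method name and tag names are hypothetical, and it assumes the DocumentFormat.OpenXml NuGet package is referenced.  Two-dimensional (table) data is left out to keep it short.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper: not part of the Open XML SDK or of the library in the post.
public static class ContentControlSketch
{
    // Replaces the text of every content control whose tag matches a key
    // in 'values'. Returns the number of content controls that were filled.
    public static int FillScalars(OpenXmlElement root, IDictionary<string, string> values)
    {
        int filled = 0;
        foreach (SdtElement sdt in root.Descendants<SdtElement>().ToList())
        {
            // The tag lives in the content control's properties (w:sdtPr/w:tag).
            string tag = sdt.SdtProperties?.GetFirstChild<Tag>()?.Val?.Value;
            if (tag == null || !values.TryGetValue(tag, out string value))
            {
                continue;
            }

            // Keep the first text element (and its run formatting), drop the rest.
            List<Text> texts = sdt.Descendants<Text>().ToList();
            if (texts.Count == 0)
            {
                continue;
            }
            texts[0].Text = value;
            foreach (Text extra in texts.Skip(1))
            {
                extra.Remove();
            }
            filled++;
        }
        return filled;
    }
}
```

Calling FillScalars(mainPart.Document.Body, values) on a template opened with WordprocessingDocument.Open would be the typical use.  Because the code only knows about tags and values, the business logic that produces the dictionary stays out of the generation code.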

In the end we found the OOXML SDK 2.0 to be a great tool for accelerating document generation development and creating happy clients. 

Creating New Presentations from Slide Masters Using OpenXML (Revisited)

May 4, 2010

Here’s an update to a prior post, Creating New Presentations from Slide Masters Using OpenXML, about creating new presentation slides from the slide master.  A reader commented about an error with layouts that had images in them and it came to my attention that my code completely ignored the images.  Here’s an update:

    private static void InsertSlide(PresentationPart pPart, string layoutName, UInt32 slideId)
    {
        Slide slide = new Slide(new CommonSlideData(new ShapeTree()));

        SlidePart sPart = pPart.AddNewPart<SlidePart>();
        slide.Save(sPart);

        SlideMasterPart smPart = pPart.SlideMasterParts.First();
        SlideLayoutPart slPart = smPart.SlideLayoutParts.Single(kaark => kaark.SlideLayout.CommonSlideData.Name == layoutName);

        //Add the layout part to the new slide from the slide master
        sPart.AddPart<SlideLayoutPart>(slPart);
        sPart.Slide.CommonSlideData = (CommonSlideData)slPart.SlideLayout.CommonSlideData.Clone();

        using (Stream stream = slPart.GetStream())
        {
            sPart.SlideLayoutPart.FeedData(stream);
        }

        //UPDATED: Copy the images from the slide master layout to the new slide
        foreach (ImagePart iPart in slPart.ImageParts)
        {
            ImagePart newImagePart = sPart.AddImagePart(iPart.ContentType, slPart.GetIdOfPart(iPart));
            using (Stream imageStream = iPart.GetStream())
            {
                newImagePart.FeedData(imageStream);
            }
        }

        SlideId newSlideId = pPart.Presentation.SlideIdList.AppendChild<SlideId>(new SlideId());
        newSlideId.Id = slideId;
        newSlideId.RelationshipId = pPart.GetIdOfPart(sPart);
    }


After feeding the slide layout part data into the new slide’s layout part, the code then adds the new image parts to the slide from the slide master layout.

Experience OOXML In Person

April 21, 2010

The Chicago Code Camp is coming up on May 1st.  I will be presenting on the essentials of document generation with OOXML.  The code I will be showing will leverage the 2.0 SDK.  Join us and come with your questions about Open XML.

Dealing With Table Borders In OOXML

April 5, 2010

Formatting tables in a document programmatically can be a very complex task.  This is the major reason we start our document generation projects with templates instead of building components in a document by hand.

Borders are one aspect of a table that you may want to format.  Borders are used to make certain content in a table stand out.  If you need to conditionally set and remove borders there is something that you need to be aware of.  Even in OOXML you have the concepts of styles, inheriting styles and overriding styles.

When Word defines a table it will reference a global style such as “TableGrid”.  This style will include the borders for the table.  Specifically the InsideHorizontalBorder and InsideVerticalBorder define the borders for the cells.  These can be overridden by the TableCellBorders collection of a particular cell.  Adding a double right border on a cell is as easy as the couple of lines of code below.

    wordprocessing.TableCellBorders borders = new wordprocessing.TableCellBorders();
    borders.RightBorder = new RightBorder()
    {
        Val = BorderValues.Double,
        Color = "000000",
        ThemeColor = ThemeColorValues.Text1,
        Size = (UInt32Value)4U,
        Space = (UInt32Value)0U
    };


If I want to revert back to the table’s style for cell borders I simply need to remove all children from the TableCellBorders collection.  It is like removing a class identifier from a TD tag in HTML.  The style in the parent object takes back over.
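The override and the revert can be sketched as a pair of helpers.  This is only an illustration under a few assumptions: the class and method names are made up, the DocumentFormat.OpenXml NuGet package is referenced, and the caller already has the TableCell it wants to change.

```csharp
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper class; the names are illustrative only.
public static class CellBorderSketch
{
    // Overrides the table style for one cell by giving it a double right border.
    public static void SetDoubleRightBorder(TableCell cell)
    {
        TableCellProperties props = cell.TableCellProperties;
        if (props == null)
        {
            // The cell properties element has to be the first child of the cell.
            props = new TableCellProperties();
            cell.PrependChild(props);
        }

        TableCellBorders borders = props.TableCellBorders;
        if (borders == null)
        {
            borders = new TableCellBorders();
            props.TableCellBorders = borders;
        }

        borders.RightBorder = new RightBorder
        {
            Val = BorderValues.Double,
            Color = "000000",
            Size = 4U,
            Space = 0U
        };
    }

    // Removes every cell-level border override so the table style applies again.
    public static void RevertToTableStyle(TableCell cell)
    {
        TableCellBorders borders = cell.TableCellProperties?.TableCellBorders;
        if (borders != null)
        {
            borders.RemoveAllChildren();
        }
    }
}
```

Keeping the two operations side by side makes conditional formatting straightforward: call the first when a row needs emphasis and the second when it does not.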

With the knowledge of how the borders work you can take the concept and apply it to other effects of styles.

Creating New Presentations from Slide Masters Using OpenXML

March 24, 2010

The slide master in PowerPoint provides a powerful way for end-users to easily control the appearance and layouts of a presentation.  A slide master contains a set of layouts that are subsequently used by the slides in the presentation.

A common approach to constructing a new presentation is to have a template with slides that are then copied/merged into the new presentation.  The approach I will be demonstrating creates slides in the new presentation based on the slide master layouts in the template.  This approach still requires a template, but does not require slides to already exist.

The Template

The template I used contained a few layouts in the slide master, each arranged with some placeholder objects.  A great benefit of layouts in the slide master is that they can be renamed through the UI.  The layout name is what will be used in the code to construct the slide deck in the new presentation.

The Code

Now the fun part.  The InsertSlide method takes a PresentationPart, layout name from the slide master, and ID for the new slide.  It creates the new Slide and adds the associated parts to the PresentationPart, copying all the required layout and common slide data from the slide master layout.

    private static void InsertSlide(PresentationPart pPart, string layoutName, UInt32 slideId)
    {
        Slide slide = new Slide(new CommonSlideData(new ShapeTree()));

        SlidePart sPart = pPart.AddNewPart<SlidePart>();
        slide.Save(sPart);

        SlideMasterPart smPart = pPart.SlideMasterParts.First();
        SlideLayoutPart slPart = smPart.SlideLayoutParts.Single(kaark => kaark.SlideLayout.CommonSlideData.Name == layoutName);
        sPart.AddPart<SlideLayoutPart>(slPart);
        sPart.Slide.CommonSlideData = (CommonSlideData)slPart.SlideLayout.CommonSlideData.Clone();
        using (Stream stream = slPart.GetStream())
        {
            sPart.SlideLayoutPart.FeedData(stream);
        }

        SlideId newSlideId = pPart.Presentation.SlideIdList.AppendChild<SlideId>(new SlideId());
        newSlideId.Id = slideId;
        newSlideId.RelationshipId = pPart.GetIdOfPart(sPart);
    }

Since the presentation started with only a slide master in the template and no slides, a SlideIdList must be added to the PresentationPart.  Then start adding slides, using the layout names from the slide master.  Notice that the slide IDs start at 256; that’s not a typo.  Slide IDs must be >= 256.

    using (PresentationDocument pDoc = PresentationDocument.Open(newFileCopiedFromTemplate, true))
    {
        PresentationPart pPart = pDoc.PresentationPart;

        pPart.Presentation.SlideIdList = new SlideIdList();
        InsertSlide(pPart, "Layout1", 256);
        InsertSlide(pPart, "Layout3", 257);
        InsertSlide(pPart, "Layout3", 258);
        InsertSlide(pPart, "Layout2", 259);
        pPart.Presentation.Save();
        // The using block closes the document when it goes out of scope.
    }
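Hard-coding 256, 257 and so on works for an empty deck, but once slides already exist the next ID has to be computed.  Below is a small sketch of one way to do that; the class and method names are hypothetical, and it assumes the DocumentFormat.OpenXml NuGet package is referenced.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Presentation;

// Hypothetical helper; not part of the Open XML SDK.
public static class SlideIdSketch
{
    // Slide IDs must be >= 256.
    private const uint MinSlideId = 256;

    // Returns an ID one higher than the largest slide ID in the list,
    // or 256 when the list has no slides yet.
    public static uint GetNextSlideId(SlideIdList slideIdList)
    {
        uint maxId = slideIdList.Elements<SlideId>()
            .Select(s => s.Id.Value)
            .DefaultIfEmpty(MinSlideId - 1)
            .Max();
        return maxId + 1;
    }
}
```

With this in place each call could read InsertSlide(pPart, "Layout1", SlideIdSketch.GetNextSlideId(pPart.Presentation.SlideIdList)), so the caller no longer has to track IDs by hand.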

Open XML SDK 2 Released

March 24, 2010

This post is a little late since the SDK was released about a week ago.  At PSC we have been using the Open XML SDK 2 since its earliest beta.  It is a very powerful tool for generating documents without using the Office DLLs.  It is also the main technology that I have been working with for the last six months.  I would suggest giving it a try. 

Stay tuned here.  In the near future I will be presenting at different locations on this and other document generation technologies.

Download the Open XML SDK here.

Copying A Slide From One Presentation To Another

March 20, 2010

There are many ways to generate a PowerPoint presentation using Open XML.  The first way is to build it by hand strictly using the SDK.  Alternately you can modify a copy of a base presentation in place.  The third approach to generate a presentation is to build a new presentation from the parts of an existing presentation by copying slides as needed.  This post will focus on the third option.

In order to make this solution a little more elegant I am going to create a VSTO add-in as I did in my previous post.  This one is going to insert Tags to identify slides instead of NonVisualDrawingProperties which I used to identify charts, tables and images.  The code itself is fairly short.

    SlideNameForm dialog = new SlideNameForm();
    Selection selection = Globals.ThisAddIn.Application.ActiveWindow.Selection;

    if (dialog.ShowDialog() == DialogResult.OK)
    {
        // Tag the selected slide with the name entered in the dialog.
    }

Zeyad Rajabi has a good post here on combining slides from two presentations.  The example he gives is great if you are doing a straight merge.  But what if you want to use your source file almost as a supermarket where you pick and choose slides and may even insert them repeatedly?  The following code uses the tags we created in the previous step to pick a particular slide and copy it to a destination file.

    using (PresentationDocument newDocument = PresentationDocument.Open(OutputFileText.Text, true))
    using (PresentationDocument templateDocument = PresentationDocument.Open(FileNameText.Text, false))
    {
        // Highest IDs already in use in the destination presentation
        uniqueId = GetMaxIdFromChild(newDocument.PresentationPart.Presentation.SlideMasterIdList);
        uint maxId = GetMaxIdFromChild(newDocument.PresentationPart.Presentation.SlideIdList);

        // Find the tagged slide in the template and copy it with its layout and master
        SlidePart oldPart = GetSlidePartByTagName(templateDocument, SlideToCopyText.Text);
        SlidePart newPart = newDocument.PresentationPart.AddPart<SlidePart>(oldPart, "sourceId1");
        SlideMasterPart newMasterPart = newDocument.PresentationPart.AddPart(newPart.SlideLayoutPart.SlideMasterPart);

        SlideIdList idList = newDocument.PresentationPart.Presentation.SlideIdList;

        // create new slide ID
        maxId++;
        SlideId newId = new SlideId();
        newId.Id = maxId;
        newId.RelationshipId = "sourceId1";
        idList.Append(newId);

        // Create new master slide ID
        uniqueId++;
        SlideMasterId newMasterId = new SlideMasterId();
        newMasterId.Id = uniqueId;
        newMasterId.RelationshipId = newDocument.PresentationPart.GetIdOfPart(newMasterPart);
        newDocument.PresentationPart.Presentation.SlideMasterIdList.Append(newMasterId);

        // change slide layout ID (argument assumed)
        FixSlideLayoutID(newDocument.PresentationPart);

        newDocument.PresentationPart.Presentation.Save();
    }

The GetMaxIdFromChild and FixSlideLayoutID methods are borrowed from Zeyad’s article.  The GetSlidePartByTagName method is listed below.  It is really one LINQ query that finds SlideParts with child Tags that have the requested Name.

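    private SlidePart GetSlidePartByTagName(PresentationDocument templateDocument, string tagName)
    {
        // Return the first slide whose first user-defined tag has the requested name
        return (from p in templateDocument.PresentationPart.SlideParts
                where p.UserDefinedTagsParts.Any() &&
                      p.UserDefinedTagsParts.First().TagList.Elements
                        <DocumentFormat.OpenXml.Presentation.Tag>().First().Name ==
                      tagName
                select p).First();
    }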


This is what really makes the difference from what Zeyad posted.  The most powerful thing you can have when generating documents from templates is a consistent way of naming items to be manipulated.  I will be showing more approaches like this in upcoming posts.

Bolding and Underlining Text In Word Documents

February 16, 2010

In the templates that I have processed with Open XML there are usually a number of tables.  Sometimes we have to add an extra paragraph to a cell and we want to keep the formatting of the text already in the cell.  In this post I will go over how to apply bold and underline formatting to text as well as how to steal it from existing text and apply it to a new paragraph or run.

In order to apply an underline format to a paragraph by hand you have to start with the ParagraphProperties.  To that you append a ParagraphMarkRunProperties object to which you have appended an Underline object.  It isn’t that complicated.  There are just a lot of objects that have to be instantiated to format a paragraph.  Below is an example of what I have just described.

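    Paragraph newParagraph = new Paragraph();
    ParagraphProperties newProperties = new ParagraphProperties();

    ParagraphMarkRunProperties markRunProperties = new ParagraphMarkRunProperties();
    Underline newUnderline = new Underline { Val = UnderlineValues.Single };

    // Underline -> ParagraphMarkRunProperties -> ParagraphProperties -> Paragraph
    markRunProperties.Append(newUnderline);
    newProperties.Append(markRunProperties);
    newParagraph.Append(newProperties);

    wordprocessing.Run newRun = new wordprocessing.Run();
    wordprocessing.Text newText = new wordprocessing.Text("Text for the new paragraph");
    newRun.Append(newText);
    newParagraph.Append(newRun);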


In order to make a paragraph or run bold you append a Bold object instead of or as well as the Underline object.
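One detail worth keeping in mind is that ParagraphMarkRunProperties describes the formatting of the paragraph mark, while the visible text of a run takes its formatting from the run’s own RunProperties.  The sketch below shows the run-level version with both Bold and Underline applied.  The class and method names are hypothetical, and it assumes the DocumentFormat.OpenXml NuGet package is referenced.

```csharp
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper; the names are illustrative only.
public static class RunFormattingSketch
{
    // Builds a paragraph with a single run whose text is bold and underlined.
    public static Paragraph CreateBoldUnderlinedParagraph(string text)
    {
        // Run properties come first in the run, followed by the text.
        RunProperties runProperties = new RunProperties(
            new Bold(),
            new Underline { Val = UnderlineValues.Single });

        Run run = new Run(runProperties, new Text(text));
        return new Paragraph(run);
    }
}
```

The returned paragraph can be appended to a table cell or to the document body like any other paragraph.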

If you have an existing paragraph that you can steal from the code gets a lot easier.  The code below does just that by cloning the ParagraphProperties from the existing paragraph and appending it to the new paragraph.

    ParagraphProperties properties = (ParagraphProperties)oldParagraph.ParagraphProperties.Clone();
    newParagraph.Append(properties);


Of course, which approach you take will depend on what your situation is.  If you are allowing a user to define formatting through an interface other than a template you will have to use the first example.  This is one more reason to use a base file as a template.  Either way you should be able to accomplish the goal of formatting your text.

How Does Simple Text Markup Differ Across The Office 2007 Suite

February 4, 2010


Our theme recently is things that need to be made more consistent in the Office products in order to make document generation development more efficient for developers.  This time around we will focus on the differences between the way text is marked up in Word and PowerPoint.

I have found that there are a number of subtle but important differences in the way text is written to the Open XML standard.  This is then reflected in the Open XML SDK’s API.

Examples of these differences are apparent in features such as text color, bolding, underlining and bulleted lists.  The main difference seems to be that the Word team has taken a more object-based approach, while the PowerPoint team seems to favor attributes.  The result is that the PowerPoint definition ends up with more moving parts.  To illustrate this let’s take a look at setting the text color and bolding a group of text.

Text color is handled in Word simply by applying a Color object to the run properties.  PowerPoint requires a SolidFill object and child SchemeColor and LuminanceModulation objects to get the same effect.

The difference in the way that text is bolded is very similar.  In Word the run is assigned a Bold object, but in PowerPoint it is a boolean attribute of the run properties.
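A side-by-side sketch makes the contrast easier to see.  The class and method names, the colors and the luminance value are all made up for illustration, and the code assumes the DocumentFormat.OpenXml NuGet package is referenced.

```csharp
using A = DocumentFormat.OpenXml.Drawing;
using W = DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical comparison helper; the names are illustrative only.
public static class TextMarkupComparison
{
    // Word: bold and color are both child objects of the run properties.
    public static W.Run CreateWordRun(string text)
    {
        return new W.Run(
            new W.RunProperties(
                new W.Bold(),
                new W.Color { Val = "FF0000" }),
            new W.Text(text));
    }

    // PowerPoint: bold is an attribute of the run properties, while color
    // needs a SolidFill with a SchemeColor and a LuminanceModulation child.
    public static A.Run CreatePowerPointRun(string text)
    {
        A.RunProperties props = new A.RunProperties { Bold = true };
        props.Append(
            new A.SolidFill(
                new A.SchemeColor(new A.LuminanceModulation { Val = 75000 })
                {
                    Val = A.SchemeColorValues.Accent1
                }));

        return new A.Run(props, new A.Text(text));
    }
}
```

The two methods produce the same visual intent, yet they share no types beyond the base element classes, which is why so little code carries over between document and presentation generators.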

So what is the impact to the developer?  Code reuse for those of us who have to generate both documents and presentations is next to nothing.  On top of that the learning curve is practically doubled.

I realize that the two products have evolved through separate paths and isolated teams, but the time for that type of development is long past.  The Open XML standard should be unified wherever possible across the Office applications and allow for greater interaction between the products.  Ultimately the synchronizing of these tools will lead to greater adoption.