Alternate OOXML Document Generation Approach

March 31, 2011 by

Eric White has put out a document generation example which uses XPath and Word Content Controls.  I applaud Eric for the amount of work he has done exploring different ways to perform template-based generation.  This is a challenging subject and we need as many ideas as we can get.  There are a couple of areas where I see room for improvement in this XPath design that I would like to bring up.

The first is that Eric has chosen to put his document generation in the document itself.  I see this as a maintenance and reusability issue.  Architecturally I would prefer to have my code external to the document so that I can write and maintain it centrally in a generic fashion and tie it to a rules engine.

Another place I see this approach falling down is that it is good for simple text replacement, but it doesn’t handle formatting, replacing images or working with charts.  This doesn’t mean that it can’t handle them, but I think it would lose the simplicity which looks to be its appeal.

Lastly, Content Controls are currently a Word-only feature.  It would be great if we could come up with a markup technique that was universal to all Office document types.  Hey, we can dream, right?

Personally, I prefer a more metadata-driven approach, based on my experience with solutions whose output was closer to marketing-material quality.  That being said, this approach is an interesting idea to add to the design arsenal.  Thanks for the thoughts, Eric.

Update Since Microsoft/PSC Office Open XML Case Study

December 16, 2010 by

In 2009 Microsoft released a case study about a project that we had done using the OOXML SDK 1.0 for Research Directors Inc.  Since that time Microsoft has released version 2.0 of the SDK and PSC has done significant development with it.  Below are some of the milestones we have reached since the original case study.

At the time of the original case study two report types had been automated to output as PowerPoint presentations.  Now that all the main products have been delivered we have added three reports with Word document outputs and five more reports with PowerPoint outputs.

One improvement we made over the original application was to create a PowerPoint Add-In which allows the users to tag a slide.  These tags, along with the strongly typed SDK 2.0, allow the code to use LINQ to easily search for slides in the template files.  This allows for a more flexible architecture based on assembling a presentation from slides copied out of the template.

The new library we created also enabled us to create two new Word-based reports in two weeks.  The library abstracts the generation of the documents from the business logic and the data retrieval.  The key to this is the markup.  Content Controls are a good method for identifying sections of a template to be modified or replaced.  Join this with the concept of all data being generically either scalar or two-dimensional and the code becomes more generic.
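To make the scalar versus two-dimensional idea a little more concrete, here is a minimal sketch.  It is not the library described above; the ReportData and TemplateFiller names are hypothetical, and the method only reports what a generator might do for a tagged region instead of editing a real content control.

```csharp
using System.Collections.Generic;

// Hypothetical model: every value a template can ask for is either a
// single scalar or a two-dimensional table of cells.
public sealed class ReportData
{
    public Dictionary<string, string> Scalars { get; } =
        new Dictionary<string, string>();

    public Dictionary<string, string[,]> Tables { get; } =
        new Dictionary<string, string[,]>();
}

public static class TemplateFiller
{
    // Describes the action for one tagged region of the template.
    // A real generator would modify the document at this point.
    public static string Describe(string tag, ReportData data)
    {
        if (data.Scalars.TryGetValue(tag, out string text))
        {
            return "replace text with '" + text + "'";
        }

        if (data.Tables.TryGetValue(tag, out string[,] table))
        {
            return "fill " + table.GetLength(0) + "x" + table.GetLength(1) + " table";
        }

        return "leave unchanged";
    }
}
```

Because the generator only ever sees a tag and one of two data shapes, the same code path can serve any report whose template uses those tags.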

In the end we found the OOXML SDK 2.0 to be a great tool for accelerating document generation development and creating happy clients. 

Creating New Presentations from Slide Masters Using OpenXML (Revisited)

May 4, 2010 by

Here’s an update to a prior post, Creating New Presentations from Slide Masters Using OpenXML, about creating new presentation slides from the slide master.  A reader commented about an error with layouts that had images in them and it came to my attention that my code completely ignored the images.  Here’s an update:

private static void InsertSlide(PresentationPart pPart, string layoutName, UInt32 slideId)
{
    Slide slide = new Slide(new CommonSlideData(new ShapeTree()));
    SlidePart sPart = pPart.AddNewPart<SlidePart>();
    slide.Save(sPart);

    SlideMasterPart smPart = pPart.SlideMasterParts.First();
    SlideLayoutPart slPart = smPart.SlideLayoutParts.Single(layout => layout.SlideLayout.CommonSlideData.Name == layoutName);

    // Add the layout part to the new slide from the slide master
    sPart.AddPart<SlideLayoutPart>(slPart);
    sPart.Slide.CommonSlideData = (CommonSlideData)slPart.SlideLayout.CommonSlideData.Clone();

    using (Stream stream = slPart.GetStream())
    {
        sPart.SlideLayoutPart.FeedData(stream);
    }

    // UPDATED: Copy the images from the slide master layout to the new slide
    foreach (ImagePart iPart in slPart.ImageParts)
    {
        ImagePart newImagePart = sPart.AddImagePart(iPart.ContentType, slPart.GetIdOfPart(iPart));
        using (Stream imageStream = iPart.GetStream())
        {
            newImagePart.FeedData(imageStream);
        }
    }

    SlideId newSlideId = pPart.Presentation.SlideIdList.AppendChild<SlideId>(new SlideId());
    newSlideId.Id = slideId;
    newSlideId.RelationshipId = pPart.GetIdOfPart(sPart);
}

After feeding the slide layout part data into the new slide’s layout part, the code then adds the new image parts to the slide from the slide master layout.

Experience OOXML In Person

April 21, 2010 by

The Chicago Code Camp is coming up on May 1st.  I will be presenting on the essentials of document generation with OOXML.  The code I will be showing will leverage the 2.0 SDK.  Join us and come with your questions about Open XML.

http://www.chicagocodecamp.com

Dealing With Table Borders In OOXML

April 5, 2010 by

Formatting tables in a document programmatically can be a very complex task.  This is the main reason we start our document generation projects with templates instead of building components in a document by hand.

Borders are one aspect of a table that you may want to format.  Borders are used to make certain content in a table stand out.  If you need to conditionally set and remove borders there is something that you need to be aware of.  Even in OOXML you have the concepts of styles, inheriting styles and overriding styles.

When Word defines a table it will reference a global style such as “TableGrid”.  This style will include the borders for the table.  Specifically the InsideHorizontalBorder and InsideVerticalBorder define the borders for the cells.  These can be overridden by the TableCellBorders collection of a particular cell.  Adding a double right border on a cell is as easy as the couple of lines of code below.

wordprocessing.TableCellBorders borders = new wordprocessing.TableCellBorders();
borders.RightBorder = new RightBorder() { Val = BorderValues.Double, Color = "000000", ThemeColor = ThemeColorValues.Text1, Size = (UInt32Value)4U, Space = (UInt32Value)0U };
cell.TableCellProperties.Append(borders);

If I want to revert back to the table’s style for cell borders I simply need to remove all children from the TableCellBorders collection.  It is like removing a class identifier from a TD tag in HTML.  The style in the parent object takes back over.

With the knowledge of how the borders work you can take the concept and apply it to other effects of styles.
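As a rough illustration of reverting a cell to the table style, the sketch below works on the raw WordprocessingML with LINQ to XML rather than the SDK classes, so it can be tried without the Open XML SDK.  The CellBorders class and RevertToTableStyle method are names made up for this example.

```csharp
using System.Xml.Linq;

public static class CellBorders
{
    // WordprocessingML main namespace.
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Empties the w:tcBorders element of the given w:tc cell so that
    // the borders defined by the table style apply again.
    public static void RevertToTableStyle(XElement cell)
    {
        XElement borders = cell.Element(W + "tcPr")?.Element(W + "tcBorders");
        borders?.RemoveNodes();
    }
}
```

With the SDK object model, the equivalent step would likely be a call to RemoveAllChildren() on the cell's TableCellBorders object.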

Creating New Presentations from Slide Masters Using OpenXML

March 24, 2010 by

The slide master in PowerPoint provides a powerful way for end-users to easily control the appearance and layouts of a presentation.  A slide master contains a set of layouts that are subsequently used by the slides in the presentation.

A common approach to constructing a new presentation is to have a template with slides that are then copied/merged into the new presentation.  The approach I will be demonstrating creates slides in the new presentation based on the slide master layouts in the template.  This approach still requires a template, but does not require slides to already exist.

The Template

The template I used contained a few layouts in the slide master, each arranged with some placeholder objects.  A great benefit of layouts in the slide master is that they can be renamed through the UI.  The layout name is what will be used in the code to construct the slide deck in the new presentation.

The Code

Now the fun part.  The InsertSlide method takes a PresentationPart, layout name from the slide master, and ID for the new slide.  It creates the new Slide and adds the associated parts to the PresentationPart, copying all the required layout and common slide data from the slide master layout.

private static void InsertSlide(PresentationPart pPart, string layoutName, UInt32 slideId)
{
    Slide slide = new Slide(new CommonSlideData(new ShapeTree()));

    SlidePart sPart = pPart.AddNewPart<SlidePart>();
    slide.Save(sPart);

    SlideMasterPart smPart = pPart.SlideMasterParts.First();
    SlideLayoutPart slPart = smPart.SlideLayoutParts.Single(kaark => kaark.SlideLayout.CommonSlideData.Name == layoutName);
    sPart.AddPart<SlideLayoutPart>(slPart);
    sPart.Slide.CommonSlideData = (CommonSlideData)smPart.SlideLayoutParts.Single(kaark => kaark.SlideLayout.CommonSlideData.Name == layoutName).SlideLayout.CommonSlideData.Clone();
    using (Stream stream = slPart.GetStream())
    {
        sPart.SlideLayoutPart.FeedData(stream);
    }

    SlideId newSlideId = pPart.Presentation.SlideIdList.AppendChild<SlideId>(new SlideId());
    newSlideId.Id = slideId;
    newSlideId.RelationshipId = pPart.GetIdOfPart(sPart);
}

Since the presentation started with only a slide master in the template and no slides, a SlideIdList must be added to the PresentationPart.  Then start adding slides, using the layout names from the slide master.  Notice that the slide IDs start at 256; that’s not a typo.  Slide IDs must be >= 256.

using (PresentationDocument pDoc = PresentationDocument.Open(newFileCopiedFromTemplate, true))
{
    PresentationPart pPart = pDoc.PresentationPart;

    pPart.Presentation.SlideIdList = new SlideIdList();
    InsertSlide(pPart, "Layout1", 256);
    InsertSlide(pPart, "Layout3", 257);
    InsertSlide(pPart, "Layout3", 258);
    InsertSlide(pPart, "Layout2", 259);
    pPart.Presentation.Save();
    pDoc.Close();
}
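Hard-coding 256, 257 and so on works for a fresh file, but a small helper can pick the next ID when slides may already exist.  The following is only a sketch; SlideIds and Next are hypothetical names, and the helper takes plain numbers so it does not depend on the SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SlideIds
{
    // Slide IDs must be at least 256.
    public const uint MinSlideId = 256;

    // Returns one more than the largest existing ID, or MinSlideId
    // when the list is empty or holds only smaller values.
    public static uint Next(IEnumerable<uint> existingIds)
    {
        uint max = existingIds.DefaultIfEmpty(MinSlideId - 1).Max();
        return Math.Max(max + 1, MinSlideId);
    }
}
```

The existing IDs could be gathered from the SlideId children of the presentation's SlideIdList and the result passed to InsertSlide as the slideId argument.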

Open XML SDK 2 Released

March 24, 2010 by

This post is a little late since the SDK was released about a week ago.  At PSC we have been using the Open XML SDK 2 since its earliest beta.  It is a very powerful tool for generating documents without using the Office DLLs.  It is also the main technology that I have been working with for the last six months.  I would suggest giving it a try. 

Stay tuned here.  In the near future I will be presenting at different locations on this and other document generation technologies.

Download the Open XML SDK here.

Copying A Slide From One Presentation To Another

March 20, 2010 by

There are many ways to generate a PowerPoint presentation using Open XML.  The first way is to build it by hand strictly using the SDK.  Alternatively you can modify a copy of a base presentation in place.  The third approach is to build a new presentation from the parts of an existing presentation by copying slides as needed.  This post will focus on the third option.

In order to make this solution a little more elegant I am going to create a VSTO add-in as I did in my previous post.  This one is going to insert Tags to identify slides instead of NonVisualDrawingProperties which I used to identify charts, tables and images.  The code itself is fairly short.

SlideNameForm dialog = new SlideNameForm();
Selection selection = Globals.ThisAddIn.Application.ActiveWindow.Selection;

if (dialog.ShowDialog() == DialogResult.OK)
{
    selection.SlideRange.Tags.Add(dialog.slideName, dialog.slideName);
}

Zeyad Rajabi has a good post here on combining slides from two presentations.  The example he gives is great if you are doing a straight merge.  But what if you want to use your source file as almost a supermarket where you pick and choose slides and may even insert them repeatedly?  The following code uses the tags we created in the previous step to pick a particular slide and copy it to a destination file.

using (PresentationDocument newDocument = PresentationDocument.Open(OutputFileText.Text, true))
using (PresentationDocument templateDocument = PresentationDocument.Open(FileNameText.Text, false))
{
    // uniqueId is declared outside this snippet
    uniqueId = GetMaxIdFromChild(newDocument.PresentationPart.Presentation.SlideMasterIdList);
    uint maxId = GetMaxIdFromChild(newDocument.PresentationPart.Presentation.SlideIdList);

    SlidePart oldPart = GetSlidePartByTagName(templateDocument, SlideToCopyText.Text);

    // "sourceId1" must be unique within the presentation part;
    // use a different relationship ID for each slide you copy
    SlidePart newPart = newDocument.PresentationPart.AddPart<SlidePart>(oldPart, "sourceId1");

    SlideMasterPart newMasterPart = newDocument.PresentationPart.AddPart(newPart.SlideLayoutPart.SlideMasterPart);

    SlideIdList idList = newDocument.PresentationPart.Presentation.SlideIdList;

    // create new slide ID
    maxId++;
    SlideId newId = new SlideId();
    newId.Id = maxId;
    newId.RelationshipId = "sourceId1";
    idList.Append(newId);

    // Create new master slide ID
    uniqueId++;
    SlideMasterId newMasterId = new SlideMasterId();
    newMasterId.Id = uniqueId;
    newMasterId.RelationshipId = newDocument.PresentationPart.GetIdOfPart(newMasterPart);
    newDocument.PresentationPart.Presentation.SlideMasterIdList.Append(newMasterId);

    // change slide layout ID
    FixSlideLayoutIds(newDocument.PresentationPart);

    //newPart.Slide.Save();
    newDocument.PresentationPart.Presentation.Save();
}

The GetMaxIdFromChild and FixSlideLayoutIds methods are borrowed from Zeyad’s article.  The GetSlidePartByTagName method is listed below.  It is really one LINQ query that finds SlideParts with child Tags that have the requested Name.

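private SlidePart GetSlidePartByTagName(PresentationDocument templateDocument, string tagName)
{
    // Tag names are compared in upper case, which is how PowerPoint stores them
    string upperName = tagName.ToUpper();

    // Any() skips slides without tags instead of throwing, as First() would
    return (from p in templateDocument.PresentationPart.SlideParts
            where p.UserDefinedTagsParts
                   .SelectMany(t => t.TagList.Descendants<DocumentFormat.OpenXml.Presentation.Tag>())
                   .Any(tag => tag.Name == upperName)
            select p).First();
}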

This is what really makes the difference from what Zeyad posted.  The most powerful thing you can have when generating documents from templates is a consistent way of naming items to be manipulated.  I will show more approaches like this in upcoming posts.

Naming PowerPoint Components With A VSTO Add-In

March 11, 2010 by

Sometimes in order to work with Open XML we need a little help from other tools.  In this post I am going to describe a fairly simple solution for marking up PowerPoint presentations so that they can be used as templates and processed using the Open XML SDK.

Add-ins are tools on which information can be hard to find.  I am going to up the obscurity by adding a Ribbon button.  For my example I am using Visual Studio 2008 and creating a PowerPoint 2007 Add-in project.  To that, add a Ribbon Visual Designer.  The new ribbon by default will show up on the Add-Ins tab.

Add a button to the ribbon.  Also add a WinForm to collect a new name for the object selected.  Make sure to set the OK button’s DialogResult to OK. In the ribbon button click event add the following code.

ObjectNameForm dialog = new ObjectNameForm();
Selection selection = Globals.ThisAddIn.Application.ActiveWindow.Selection;

dialog.objectName = selection.ShapeRange.Name;

if (dialog.ShowDialog() == DialogResult.OK)
{
    selection.ShapeRange.Name = dialog.objectName;
}

This code will first read the current Name attribute of the Shape object.  If the user clicks OK on the dialog it saves the string value back to the same place.

Once that is done you can identify the control through Open XML via the NonVisualDrawingProperties objects.  The only problem is that this object is a child of several different classes.  This means that there isn’t just one way to retrieve the value.  Below are a couple of pieces of code to identify the container that you have named.

The first example is if you are naming placeholders in a layout slide.

foreach (var slideMasterPart in slideMasterParts)
{
    var layoutParts = slideMasterPart.SlideLayoutParts;
    foreach (SlideLayoutPart slideLayoutPart in layoutParts)
    {
        foreach (assmPresentation.Shape shape in slideLayoutPart.SlideLayout.CommonSlideData.ShapeTree.Descendants<assmPresentation.Shape>())
        {
            var slideMasterProperties =
                from p in shape.Descendants<assmPresentation.NonVisualDrawingProperties>()
                where p.Name == TokenText.Text
                select p;

            if (slideMasterProperties.Count() > 0)
                tokenFound = true;
        }
    }
}

The second example allows you to find charts that you have named with the add-in.

foreach (var slidePart in slideParts)
{
    // Charts live in GraphicFrame elements, so query them directly
    // instead of looping over the shapes on the slide
    var slideProperties = from g in slidePart.Slide.Descendants<GraphicFrame>()
        where g.NonVisualGraphicFrameProperties.NonVisualDrawingProperties.Name == TokenText.Text
        select g;

    if (slideProperties.Any())
    {
        tokenFound = true;
    }
}
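Because the name lives in the same kind of element no matter which object owns it, one lookup can cover shapes, charts and pictures at once.  The sketch below is an assumption-laden alternative that reads the raw slide XML with LINQ to XML instead of the SDK classes; NamedObjects and FindByName are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class NamedObjects
{
    // PresentationML main namespace.
    static readonly XNamespace P =
        "http://schemas.openxmlformats.org/presentationml/2006/main";

    // Returns the owning elements (p:sp, p:graphicFrame, p:pic, ...)
    // whose non-visual drawing properties (p:cNvPr) carry the name.
    public static IEnumerable<XElement> FindByName(XElement slideRoot, string name)
    {
        return slideRoot.Descendants(P + "cNvPr")
            .Where(e => (string)e.Attribute("name") == name)
            // p:cNvPr sits inside p:nvSpPr, p:nvGraphicFramePr and so on;
            // the grandparent is the shape, frame or picture itself.
            .Select(e => e.Parent.Parent)
            .Where(owner => owner != null);
    }
}
```

The local name of each returned element tells you what kind of object was tagged, which is handy when the template mixes placeholders and charts.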

Together, Open XML and VSTO add-ins make a powerful combination for maintaining a template and generating documents from it.

Bolding and Underlining Text In Word Documents

February 16, 2010 by

In the templates that I have processed with Open XML there are usually a number of tables.  Sometimes we have to add an extra paragraph to a cell and we want to keep the formatting of the text already in the cell.  In this post I will go over how to apply bold and underline formatting to text as well as how to steal it from existing text and apply it to a new paragraph or run.

In order to apply an underline format to a paragraph by hand you have to start with the ParagraphProperties.  To that you append a ParagraphMarkRunProperties object to which you have appended an Underline object.  Note that ParagraphMarkRunProperties formats the paragraph mark; for the text itself to be underlined, the run needs its own RunProperties with an Underline.  It isn’t that complicated.  There are just a lot of objects that have to be instantiated to format a paragraph.  Below is an example of what I have just described.

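Paragraph newParagraph = new Paragraph();
ParagraphProperties newProperties = new ParagraphProperties();

// Underline the paragraph mark
ParagraphMarkRunProperties markRunProperties = new ParagraphMarkRunProperties();
Underline newUnderline = new Underline { Val = UnderlineValues.Single };
markRunProperties.AppendChild(newUnderline);
newProperties.AppendChild(markRunProperties);

newParagraph.AppendChild(newProperties);

// Underline the text of the run itself
wordprocessing.Run newRun = new wordprocessing.Run();
wordprocessing.RunProperties runProperties = new wordprocessing.RunProperties();
runProperties.AppendChild(new Underline { Val = UnderlineValues.Single });
newRun.AppendChild(runProperties);
wordprocessing.Text newText = new wordprocessing.Text("Text for the new paragraph");
newRun.AppendChild(newText);

// Attach the run to the paragraph
newParagraph.AppendChild(newRun);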

In order to make a paragraph or run bold you append a Bold object instead of or as well as the Underline object.
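To show what the resulting markup looks like when both formats are applied, here is a small sketch that builds the run as raw WordprocessingML with LINQ to XML instead of the SDK classes.  RunMarkup and FormattedRun are hypothetical names used only for this illustration.

```csharp
using System.Xml.Linq;

public static class RunMarkup
{
    // WordprocessingML main namespace.
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Builds a w:r element whose text is bold, underlined, or both.
    public static XElement FormattedRun(string text, bool bold, bool underline)
    {
        var properties = new XElement(W + "rPr");

        // The schema orders w:b before w:u inside w:rPr.
        if (bold)
        {
            properties.Add(new XElement(W + "b"));
        }

        if (underline)
        {
            properties.Add(new XElement(W + "u", new XAttribute(W + "val", "single")));
        }

        // Run properties come first, followed by the text element.
        return new XElement(W + "r", properties, new XElement(W + "t", text));
    }
}
```

In SDK terms the w:b element corresponds to the Bold object and w:u to the Underline object, both appended to the same properties container.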

If you have an existing paragraph that you can steal from, the code gets a lot easier.  The code below does just that by cloning the ParagraphProperties from the existing paragraph and appending it to the new paragraph.

ParagraphProperties properties = (ParagraphProperties)oldParagraph.ParagraphProperties.Clone();
newParagraph.AppendChild(properties);

Of course, which approach you take will depend on what your situation is.  If you are allowing a user to define formatting through an interface other than a template you will have to use the first example.  This is one more reason to use a base file as a template.  Either way you should be able to accomplish the goal of formatting your text.