  • Jan Kees Velthoven 25 posts 46 karma points
    Oct 15, 2010 @ 12:23

    Pdf Document Media Type

    I've just created a custom media type for PDF document files, because I wanted more information about a document than just its extension and file size. Searching the net I found a project named iTextSharp, a port of the open source iText Java library for PDF generation, written entirely in C# for the .NET platform.

    The best way I found to do this is to hook into the BeforeSave event for Media. When the umbracoFile has the extension .pdf, the iTextSharp library is invoked to read properties such as the number of pages, title, PDF version, author and table of contents.

    My solution has three parts:

    1. Create a custom Media Type for a PDF Document with the extended properties
    2. Change the database datatype for the "Label" Data Type to Ntext
    3. Create an event handler for the BeforeSave event for Media

    Create a Media Type like the one in the image below. The name of the Media Type is "PdfDocument". Its properties (name, alias, data type) are:

    • Upload File, umbracoFile, Upload
    • Type, umbracoExtension, Label
    • Size, umbracoBytes, Label
    • Title, umbracoPdfTitle, Label
    • Author, umbracoPdfAuthor, Label
    • Version, umbracoPdfVersion, Label
    • Number of pages, umbracoPdfNumberOfPages, Label
    • Table of contents, umbracoPdfTableOfContent, Label

    The following event handler is registered for the BeforeSave event on Media:

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Text;
    using System.Web;
    using iTextSharp.text.pdf;
    using umbraco.BusinessLogic;
    using umbraco.cms.businesslogic;
    using umbraco.cms.businesslogic.media; // needed for the Media class

    namespace UmbracoWebApplication.CustomEvents
    {
        // Classes deriving from ApplicationBase are picked up by Umbraco at startup
        public class PdfDocumentSaveEvent : ApplicationBase
        {
            public PdfDocumentSaveEvent()
            {
                Media.BeforeSave += Media_BeforeSave;
            }

            private void Media_BeforeSave(Media sender, SaveEventArgs e)
            {
                if (sender.ContentType.Alias == "PdfDocument")
                {
                    string pdfPath = Convert.ToString(sender.getProperty("umbracoFile").Value);

                    //Check if we're dealing with a pdf document (".PDF" is accepted as well)
                    if (String.Equals(Path.GetExtension(pdfPath), ".pdf", StringComparison.OrdinalIgnoreCase))
                    {
                        //Get the mapped path for the pdf document
                        string mappedFilePath = HttpContext.Current.Server.MapPath(pdfPath);

                        var reader = new PdfReader(mappedFilePath);

                        //Fill the additional properties
                        sender.getProperty("umbracoPdfTitle").Value =
                            reader.Info.Where(p => p.Key == "Title").FirstOrDefault().Value;
                        sender.getProperty("umbracoPdfAuthor").Value =
                            reader.Info.Where(p => p.Key == "Author").FirstOrDefault().Value;
                        sender.getProperty("umbracoPdfVersion").Value = reader.PdfVersion;
                        sender.getProperty("umbracoPdfNumberOfPages").Value = reader.NumberOfPages;
                        sender.getProperty("umbracoPdfTableOfContent").Value = GenerateTextBookmark(reader);
                    }
                    else
                    {
                        //If the extension is not pdf we clear the custom fields
                        //This is also a solution for removing the file
                        sender.getProperty("umbracoPdfTitle").Value = String.Empty;
                        sender.getProperty("umbracoPdfAuthor").Value = String.Empty;
                        sender.getProperty("umbracoPdfVersion").Value = String.Empty;
                        sender.getProperty("umbracoPdfNumberOfPages").Value = String.Empty;
                        sender.getProperty("umbracoPdfTableOfContent").Value = String.Empty;
                    }
                }
            }

            private static string GenerateTextBookmark(PdfReader reader)
            {
                IEnumerable<Dictionary<string, object>> bookmarks = SimpleBookmark.GetBookmark(reader);

                var builder = new StringBuilder();

                return GenerateBookmarkSection(bookmarks, builder);
            }

            private static string GenerateBookmarkSection(IEnumerable<Dictionary<string, object>> section,
                                                          StringBuilder builder)
            {
                if (section == null) return String.Empty;

                foreach (var bookmark in section)
                {
                    object value;
                    bookmark.TryGetValue("Title", out value);

                    //Assumed step (not present in the post as published): write the title on its own line
                    builder.AppendLine(Convert.ToString(value));

                    //A bookmark with five entries is treated as having child bookmarks in its last value
                    if (bookmark.Count == 5)
                    {
                        GenerateBookmarkSection((IEnumerable<Dictionary<string, object>>) bookmark.Values.Last(), builder);
                    }
                }

                return builder.ToString();
            }
        }
    }
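
    The two lookups in the handler (reading a metadata entry and walking the bookmark tree) can be tried out without Umbraco or a PDF file. The following is only a minimal sketch under a few assumptions: the class and method names (PdfMetadataSketch, GetInfoOrEmpty, FlattenBookmarks) are hypothetical and not part of iTextSharp, the metadata is taken to be a plain string dictionary, and child bookmarks are assumed to sit under a "Kids" key, which should be checked against the iTextSharp version in use. Unlike the handler above, it indents nested titles by two spaces per level.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Text;

    public static class PdfMetadataSketch
    {
        // Hypothetical helper: returns the entry for key, or an empty string when it is missing or null
        public static string GetInfoOrEmpty(IDictionary<string, string> info, string key)
        {
            string value;
            if (info != null && info.TryGetValue(key, out value) && value != null)
            {
                return value;
            }
            return String.Empty;
        }

        // Hypothetical helper: one bookmark title per line, nested levels indented by two spaces
        public static string FlattenBookmarks(IEnumerable<Dictionary<string, object>> section)
        {
            var builder = new StringBuilder();
            AppendSection(section, 0, builder);
            return builder.ToString();
        }

        private static void AppendSection(IEnumerable<Dictionary<string, object>> section,
                                          int depth, StringBuilder builder)
        {
            if (section == null) return;

            foreach (var bookmark in section)
            {
                object title;
                if (bookmark.TryGetValue("Title", out title))
                {
                    builder.Append(' ', depth * 2).Append(title).Append('\n');
                }

                // Assumption: children are stored under "Kids" as a list of the same dictionary type
                object kids;
                if (bookmark.TryGetValue("Kids", out kids))
                {
                    AppendSection(kids as IEnumerable<Dictionary<string, object>>, depth + 1, builder);
                }
            }
        }
    }
    ```

    Passing a hand-built list of dictionaries to FlattenBookmarks is enough to see the nesting. Looking entries up by key avoids depending on the number or order of entries in each bookmark dictionary.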

    More about iTextSharp can be found on the project's website. It is also a nice library for indexing PDF documents by the text they contain.

    The result of my custom event handler is shown in the following image

    So have fun and see you in Gent (BE) on the level 2 training.

    Greetings, Jan Kees Velthoven

  • Connie DeCinko 931 posts 1159 karma points
    Jan 12, 2011 @ 17:58


    This looks really good.  Is it still working well for you?  What version of Umbraco?



  • AValor 32 posts 56 karma points
    Aug 09, 2011 @ 14:03

    Works great in my 4.7 installation! Thank you!!!

  • AValor 32 posts 56 karma points
    Aug 16, 2011 @ 13:03

    Based on this, I have created a Media Factory to be used in DesktopMediaUploader. You can find it here:,-feedback-and-suggestions/22949-Upload-PDF-as-'pdfDocument'-custom-media-type

