  • CB 2 posts 22 karma points
    Aug 20, 2009 @ 16:04

    Windows/temp folder filling up with pdfbox files

    We're currently having space issues due to a large number of pdfbox files being stored in the Windows temp directory on our web server. The only reference we have to PDFBox is a pdfbox.dll sitting in the Umbraco bin folder.

    Ideally we'd like to store these files on a different drive where we can be sure that space will not be an issue.  Could you advise as to how we go about this?  Or let me know if I'm completely barking up the wrong tree?

    Thanks!

  • Chris Larson 48 posts 63 karma points
    Aug 20, 2009 @ 16:21

    PDFBox doesn't ship with Umbraco as part of the default installation. The first place I would look is the SourceForge page for that product, http://sourceforge.net/projects/pdfbox, to see if there is some reference there. It sounds like either a control, macro or other function that was added to the Umbraco installation to support PDFBox is not cleaning up its resources at the end of the process, or PDFBox itself creates these temporary files and does not get rid of them.

    Check with the development team responsible for your Umbraco installation there and find out what they are doing with PDFBox and if there is a way to schedule the removal of those files if it is a bug/problem with the PDFBox software. The development team may not even be aware that it is an issue right now.
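
    As a rough illustration of what a scheduled clean-up could look like, here is a minimal C# sketch that deletes files older than a given age from a directory. The class name TempFileCleaner, the method DeleteStaleFiles, and the "pdfbox*" file pattern used in the note below are assumptions for illustration only; check what the leftover files are actually called on your server before running anything like this.

    ```csharp
    using System;
    using System.IO;

    // Hypothetical helper, not part of Umbraco or PDFBox.
    public static class TempFileCleaner
    {
        // Deletes files in 'directory' that match 'pattern' and were last
        // written more than 'maxAge' ago. Returns the number of files deleted.
        public static int DeleteStaleFiles(string directory, string pattern, TimeSpan maxAge)
        {
            int deleted = 0;
            DateTime cutoff = DateTime.UtcNow - maxAge;

            foreach (string path in Directory.GetFiles(directory, pattern))
            {
                try
                {
                    if (File.GetLastWriteTimeUtc(path) < cutoff)
                    {
                        File.Delete(path);
                        deleted++;
                    }
                }
                catch (IOException)
                {
                    // The file is still in use (for example by a running indexer); skip it.
                }
                catch (UnauthorizedAccessException)
                {
                    // No permission to delete this file; skip it.
                }
            }

            return deleted;
        }
    }
    ```

    It could be called from a scheduled task with something like TempFileCleaner.DeleteStaleFiles(Path.GetTempPath(), "pdfbox*", TimeSpan.FromHours(24)). The age threshold is there so that files belonging to an indexing run still in progress are left alone.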

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Aug 21, 2009 @ 15:30

    CB,

    I came across this issue when I hacked the umbSearch code. When you use PDFBox, it creates a temp file to extract the content. Are you using umbSearch or umbracoUtilities? If so, check the code wherever you do anything with PDFBox: you need to make sure you close the document once you are done, e.g.

        public string returnText(string FullPathToFile)
        {
            PDDocument doc = null;
            // Try the IFilter first; fall back to PDFBox only if it returns nothing.
            string res = getTextUsingIFilter(FullPathToFile);
            if (res.Length == 0)
            {
                try
                {
                    doc = PDDocument.load(FullPathToFile);
                    PDFTextStripper stripper = new PDFTextStripper();
                    res = stripper.getText(doc);
                    logMessage(FullPathToFile + " indexed using pdfbox", umbraco.BusinessLogic.LogTypes.Debug);
                }
                catch (Exception ePdf)
                {
                    logMessage("Error indexing pdf '" + FullPathToFile + "': " + ePdf.ToString(), umbraco.BusinessLogic.LogTypes.Error);
                }
                finally
                {
                    // Always close the document so its temp file is released.
                    // doc is still null if PDDocument.load threw.
                    if (doc != null)
                    {
                        doc.close();
                    }
                }
            }
            return res;
        }

    I actually almost brought a host down before I figured out the issue; they were not best pleased :=}

    Regards

    Ismail

  • CB 2 posts 22 karma points
    Aug 25, 2009 @ 16:29

    Thanks for your help Chris and Ismail!

    The issue looks to be with the Lucene search used by Umbraco. Lucene uses PDFBox to index PDF docs. The indexes are written to the Windows temp directory by default.

    A solution is to explicitly set the directory where the indexes are written by adding the line below to the web.config (in the appSettings section):

    <add key="Lucene.Net.lockdir" value="enter directory to write to here" />

    Thanks for the help again - wouldn't have sorted that otherwise!

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Aug 25, 2009 @ 16:43

    Chris,

    Would that not just give you the same issue except in a different directory?

    Regards

    Ismail

  • Umair 1 post 21 karma points
    Dec 29, 2009 @ 22:26

    Ismail, thanks for your post, that was exactly what I needed!

    Cheers,

    Umair
