Windows/temp folder filling up with pdfbox files - API Questions

CB 2 posts 22 karma points

Aug 20, 2009 @ 16:04

Windows/temp folder filling up with pdfbox files

We're currently having space issues due to a large number of pdfbox files being stored in the windows temp directory on our web server. The only reference we have to PDFBox is a pdfbox.dll sitting in the Umbraco bin folder.

Ideally we'd like to store these files on a different drive where we can be sure that space will not be an issue. Could you advise as to how we go about this? Or let me know if I'm completely barking up the wrong tree?

Thanks!

Chris Larson 48 posts 63 karma points

Aug 20, 2009 @ 16:21

PDFBox doesn't ship with Umbraco as part of the default installation. The first place I would look is the SourceForge page for that product, http://sourceforge.net/projects/pdfbox, to see if there is some reference there. It sounds like either a control, macro or other function has been added to the Umbraco installation that uses PDFBox and is not cleaning up resources at the end of the process, or PDFBox itself creates these temporary files and does not get rid of them.

Check with the development team responsible for your Umbraco installation there and find out what they are doing with PDFBox and if there is a way to schedule the removal of those files if it is a bug/problem with the PDFBox software. The development team may not even be aware that it is an issue right now.
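
If removing the files on a schedule turns out to be the stopgap, something along these lines could be called from a scheduled task. This is only a minimal sketch: the `TempFileCleaner` class, its method name, and the `pdfbox*` search pattern in the usage note are assumptions for illustration, so check how the files on your server are actually named before using it.

```csharp
using System;
using System.IO;

public static class TempFileCleaner
{
    // Deletes files in 'directory' that match 'searchPattern' and whose last
    // write time is older than 'maxAge'. Returns the number of files deleted.
    public static int DeleteStaleFiles(string directory, string searchPattern, TimeSpan maxAge)
    {
        int deleted = 0;
        DateTime cutoff = DateTime.UtcNow - maxAge;

        foreach (string path in Directory.GetFiles(directory, searchPattern))
        {
            try
            {
                if (File.GetLastWriteTimeUtc(path) < cutoff)
                {
                    File.Delete(path);
                    deleted++;
                }
            }
            catch (IOException)
            {
                // The file is probably still in use; leave it for the next run.
            }
            catch (UnauthorizedAccessException)
            {
                // No permission to delete this file; skip it.
            }
        }

        return deleted;
    }
}
```

For example, `TempFileCleaner.DeleteStaleFiles(@"C:\Windows\Temp", "pdfbox*", TimeSpan.FromHours(24))` would remove matching files older than a day, assuming that is the directory and naming pattern in question. This only treats the symptom; the code that leaves the files behind still needs fixing.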

Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib

Aug 21, 2009 @ 15:30

CB,

I came across this issue when I hacked the umbSearch code. When you use PDFBox, it creates a temp file to extract the content. Are you using umbSearch or umbracoUtilities? If so, check the code wherever you do anything with PDFBox: you need to make sure you close the document once you are done, e.g.

        public string returnText(string FullPathToFile)
        {
            PDDocument doc = null;
            string res = "";
            res = getTextUsingIFilter(FullPathToFile);
            // The IFilter returned nothing, so fall back to PDFBox.
            if (res.Length == 0)
            {
                try
                {
                    doc = PDDocument.load(FullPathToFile);
                    PDFTextStripper stripper = new PDFTextStripper();
                    res = stripper.getText(doc);
                    logMessage(FullPathToFile + " indexed using pdfbox", umbraco.BusinessLogic.LogTypes.Debug);
                }
                catch (Exception ePdf)
                {
                    logMessage("Error indexing pdf '" + FullPathToFile + "': " + ePdf.ToString(), umbraco.BusinessLogic.LogTypes.Error);
                }
                finally
                {
                    // Always close the document so PDFBox can clean up after itself.
                    // doc is still null if PDDocument.load threw, so guard the call.
                    if (doc != null)
                    {
                        doc.close();
                    }
                }
            }
            return res;
        }

I actually almost brought a host down before I figured out the issue. They were not best pleased :=}

Regards

Ismail

CB 2 posts 22 karma points

Aug 25, 2009 @ 16:29

Thanks for your help Chris and Ismail!

The issue looks to be with the Lucene search used by Umbraco. Lucene uses PDFBox to index PDF docs. The indexes are built to the windows temp directory by default.

A solution is to explicitly set the directory where the indexes are to be written by adding the line below to the web.config (in the appSettings section):

<add key="Lucene.Net.lockdir" value="enter directory to write to here" />

Thanks for the help again - wouldn't have sorted that otherwise!

Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib

Aug 25, 2009 @ 16:43

Chris,

Would that not just give you the same issue except in a different directory?

Regards

Ismail

Umair 1 post 21 karma points

Dec 29, 2009 @ 22:26

Ismail, thanks for your post, that was exactly what I needed!

Cheers,

Umair
