  • Thomas Kahn 602 posts 506 karma points
    Nov 24, 2009 @ 13:00
    Thomas Kahn
    0

    Cannot handle images with Swedish characters in the file name

    Hi!

    If I upload an image with a Swedish character in the file name (for example "Skärgårdsö.jpg"), select the region I want to crop and click "crop", the image is not cropped. The path printed on the screen reads "sk?rg?rds?.jpg".

    I know that it's safer to avoid such file names, but users very often ignore this rule, and images with Swedish characters work fine in browsers when I upload them the ordinary way, so I guess it's OK(?)

    I have only tested with Swedish characters, but I expect other non-ASCII characters to cause the same trouble.

    Regards,
    Thomas Kahn

  • Jannik Nilsson 38 posts 81 karma points
    Nov 24, 2009 @ 14:39
    Jannik Nilsson
    0

    I don't know if it will cause problems in IE6, for example, but ideally Umbraco would either refuse to upload files with non-standard characters or rename them.

    One problem I've encountered with files that have non-standard characters in their names arises when they are uploaded on a development server and then copied to a production server with a different installed language. Umbraco then has a hard time finding the file.

  • Thomas Kahn 602 posts 506 karma points
    Nov 26, 2009 @ 15:25
    Thomas Kahn
    0

    I've never had this problem with Umbraco's standard upload functions, only when I use this plugin. It would be great if it could be fixed.

    Can I find the source code anywhere so I could look into the problem myself?

    Regards,
    Thomas

  • Murray Roke 502 posts 965 karma points c-trib
    Dec 03, 2009 @ 04:17
    Murray Roke
    0

    Looks like my problem is the escape / unescape process.

    The javascript does this:

    $(imageToCrop).attr("src")

    which gives:

    /umbraco/../media/149/Blue hills ampersand & Sweedish Skärgårdsö.jpg

    Which looks ok, but if I encode it like so:

    escape($(imageToCrop).attr("src"))

    This gives:

    /umbraco/../media/149/Blue%20hills%20ampersand%20%26%20Sweedish%20Sk%E4rg%E5rds%F6.jpg
    

    and the C# does this:

    Request["src"]  

    which gives:

    /umbraco/../media/149/Blue hills ampersand & Sweedish Sk�rg�rds�.jpg

    which causes a FileNotFoundException

    Obviously something is going wrong in the encoding and decoding. If you have any ideas on how to resolve this, let me know.

    Note: this example also has an & in the image name, which we find users do quite often; that is why I'm encoding and decoding the file name.
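
    A possible explanation, sketched in JavaScript: escape() writes each of these characters as a single Latin-1 style byte (%E4 for ä), whereas encodeURIComponent() writes the UTF-8 bytes (%C3%A4). Assuming the server decodes the request as UTF-8, a lone %E4 is not a valid sequence, which would account for the broken characters in Request["src"]. The helper name buildCropQuery is made up for illustration and is not part of the cropper.

```javascript
// The file name from the example above.
const src = "/umbraco/../media/149/Blue hills ampersand & Sweedish Skärgårdsö.jpg";

// escape() encodes code points below 256 as one %XX byte (Latin-1 style).
const legacy = escape(src);

// encodeURIComponent() percent-encodes the UTF-8 bytes of each character.
const utf8 = encodeURIComponent(src);

// Hypothetical helper: builds the query string the cropper might send.
function buildCropQuery(imageSrc) {
  return "src=" + encodeURIComponent(imageSrc);
}

console.log(legacy); // ends with Sk%E4rg%E5rds%F6.jpg
console.log(utf8);   // ends with Sk%C3%A4rg%C3%A5rds%C3%B6.jpg

// A UTF-8 aware decoder restores the original string, including the "&".
console.log(decodeURIComponent(utf8) === src); // true
```

    Whether the C# side then sees the right name still depends on the request encoding configured for the site, so treat this as something to try rather than a confirmed fix.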

     

    Cheers.

    Murray.

  • Thomas Höhler 1237 posts 1709 karma points MVP
    Dec 03, 2009 @ 09:10
    Thomas Höhler
    1

    I think you should never use non-ASCII or other special characters in the name of any file that is shown on the web. All you get are problems. For instance, a space character is replaced with %20. Such URLs are neither readable nor SEO-friendly.

    I also had to struggle with this problem at my company and hacked the upload datatype to replace specified characters. My solution is up on codeplex as a patch (issue: http://umbraco.codeplex.com/WorkItem/View.aspx?WorkItemId=22714 [patch #2944])

    Vote for it and perhaps it will be included in 4.1

    Thomas

  • Thomas Kahn 602 posts 506 karma points
    Dec 03, 2009 @ 11:13
    Thomas Kahn
    0

    Thanks guys!

    Thomas Höhler is of course right: avoid any weird characters in file names. The problem is that users show no interest in altering file names before uploading files, and they blame the CMS if things don't work like they should.

    I also understand that the problem is more generic and not a specific bug in Terabyte Image Cropper.

    I will take a look at Thomas Höhler's patch to see if I can implement it and if it solves our problem. I'll be sure to vote regardless, since I believe a solution for cleaning up file names should be default in Umbraco. :-)

    /Thomas Kahn

  • Thomas Kahn 602 posts 506 karma points
    Dec 03, 2009 @ 11:24
    Thomas Kahn
    0

    Thomas H: How do I apply this patch?

  • Thomas Höhler 1237 posts 1709 karma points MVP
    Dec 03, 2009 @ 11:32
    Thomas Höhler
    0

    Unfortunately you have to take the Umbraco code, hack it with my code, recompile it and upload the new DLLs plus the new config entries. You have to do this every time you upgrade Umbraco until they apply the patch to the core. Another solution is to build your own upload datatype (based on the upload field hacked with my code) and replace all the upload fields with this new upload field. I don't know if it will be included in the near future. I will ask Aaron.

    Thomas

  • Murray Roke 502 posts 965 karma points c-trib
    Dec 03, 2009 @ 21:42
    Murray Roke
    0

    Well, I think Thomas' solution will definitely solve the problem, especially if implemented with a white list rather than a black list.

    eg:

    filename = Regex.Replace(filename, @"[^a-zA-Z0-9]+","-");
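
    A minimal sketch of the white-list idea in JavaScript, not the patch itself. The function name sanitizeFileName and the fallback name "file" are made up for illustration. It handles the extension separately, because a pattern like the one above applied to the whole name would also replace the dot before "jpg".

```javascript
// Replace every run of characters outside a-z, A-Z, 0-9 with a single hyphen,
// then trim hyphens from both ends.
function cleanPart(part) {
  return part.replace(/[^a-zA-Z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

// Hypothetical helper: sanitizes the base name and the extension separately
// so the dot between them survives.
function sanitizeFileName(fileName) {
  const dot = fileName.lastIndexOf(".");
  const hasExtension = dot > 0;
  const base = hasExtension ? fileName.slice(0, dot) : fileName;
  const extension = hasExtension ? fileName.slice(dot + 1) : "";

  const safeBase = cleanPart(base) || "file"; // fallback if nothing is left
  const safeExtension = cleanPart(extension).toLowerCase();
  return safeExtension ? safeBase + "." + safeExtension : safeBase;
}

console.log(sanitizeFileName("Skärgårdsö.jpg"));
// Sk-rg-rds.jpg
console.log(sanitizeFileName("Blue hills ampersand & Sweedish Skärgårdsö.jpg"));
// Blue-hills-ampersand-Sweedish-Sk-rg-rds.jpg
```

    The Swedish letters are simply dropped here, which is the price of a strict white list; mapping them to ASCII equivalents first would keep the names more readable.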

    However, in the meantime, Thomas, you may wish to add a validation rule to the cropper so users don't upload files named with problematic characters.

    This validation expression is untested, but something like this should work. (It also checks the extension because my cropper currently only outputs jpg format.)

    Regex validation is performed against a string of this format:
    IsValidSize:true|Width:200|Height:100|Length:400|Extension:jpg|Name:blah.jpg
    Therefore, to validate that the image name does not contain problem characters, use this regex:

    .*Name\:[a-zA-Z0-9\-]+\.(?i)(jpg|jpeg)$

    Something more permissive may be better.

    If you want to validate the image size as well (which you probably do), make your regex like so:

    .*IsValidSize\:true.*Name\:[a-zA-Z0-9\-]+\.(?i)(jpg|jpeg)$

    Unfortunately the validation message will say something like "your input is not in the correct format", so the user will have no idea whether it refers to the image size or the file name. This is a limitation within Umbraco.
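
    A quick way to try such expressions outside Umbraco, sketched in JavaScript. Here the name part is matched with a positive character class, [a-zA-Z0-9\-]+, so only names consisting of those characters pass. JavaScript has no inline (?i), so the i flag is used instead, which makes the whole pattern case-insensitive. The sample strings other than the first are made up in the same format.

```javascript
// Patterns for the "key:value|key:value" string the cropper validates against.
const nameOnly = /.*Name:[a-zA-Z0-9\-]+\.(jpg|jpeg)$/i;
const sizeAndName = /.*IsValidSize:true.*Name:[a-zA-Z0-9\-]+\.(jpg|jpeg)$/i;

// The first sample is the format quoted in the post; the others are invented.
const good = "IsValidSize:true|Width:200|Height:100|Length:400|Extension:jpg|Name:blah.jpg";
const badName = "IsValidSize:true|Width:200|Height:100|Length:400|Extension:jpg|Name:Skärgårdsö.jpg";
const badSize = "IsValidSize:false|Width:200|Height:100|Length:400|Extension:jpg|Name:blah.jpg";

console.log(nameOnly.test(good));       // true
console.log(nameOnly.test(badName));    // false
console.log(sizeAndName.test(good));    // true
console.log(sizeAndName.test(badSize)); // false
```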

  • Peter Gregory 408 posts 1614 karma points MVP 3x admin c-trib
    Dec 03, 2009 @ 22:03
    Peter Gregory
    0

    I would also suggest stating in the description field that international characters are not accepted, to give users a clue that backs up the validation error.

  • Thomas Höhler 1237 posts 1709 karma points MVP
    Dec 04, 2009 @ 08:48
    Thomas Höhler
    0

    I chose a black list instead of a white list because it is the same way Umbraco handles URL replacing for the nodes. I didn't want to have different behaviours in my installation. But a white list with a regex is also a great possibility. Either of these behaviours would be a welcome addition to the core, regardless of which one.
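
    To illustrate the black-list approach, here is a rough JavaScript sketch, not the code from the patch. The replacement table and the function name replaceListedChars are invented for this example; a real table would come from configuration.

```javascript
// Hypothetical replacement table: each listed character maps to its substitute.
const replacements = new Map([
  ["å", "aa"], ["ä", "ae"], ["ö", "oe"],
  ["Å", "Aa"], ["Ä", "Ae"], ["Ö", "Oe"],
  [" ", "-"], ["&", ""],
]);

// Replace only the characters that appear in the table; leave the rest alone.
function replaceListedChars(fileName) {
  return Array.from(fileName, (ch) =>
    replacements.has(ch) ? replacements.get(ch) : ch
  ).join("");
}

console.log(replaceListedChars("Skärgårdsö.jpg")); // Skaergaardsoe.jpg
```

    Unlike a white list, anything not in the table passes through unchanged, so the result is only as safe as the table is complete.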

    Thomas

  • Murray Roke 502 posts 965 karma points c-trib
    Dec 06, 2009 @ 23:07
    Murray Roke
    0

    However, since my datatype isn't in the core, I can add it myself. I'll try to add it to the Terabyte Image Cropper at some point when I get a chance. Don't hold your breath, though: I have a baby arriving within 7 days :-)

  • Thomas Höhler 1237 posts 1709 karma points MVP
    Dec 07, 2009 @ 11:01
    Thomas Höhler
    0

    No worries Murray, all the best to your family and enjoy the time...

    Thomas
