  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 02, 2017 @ 14:01
    Anders Brohäll
    0

    Illegal characters in path.

    When gathering node data, for example when rebuilding the external index, I get an entry in the log telling me:

    System.Exception: Error indexing queue items,System.ArgumentException: Illegal characters in path.
       at System.IO.Path.CheckInvalidPathChars(String path, Boolean checkAdditional)
       at System.IO.Path.GetFileName(String path)
       at Our.Umbraco.ezSearch.ezSearchBoostrapper.OnGatheringNodeData(Object sender, IndexingNodeDataEventArgs e)
       at Examine.Providers.BaseIndexProvider.OnGatheringNodeData(IndexingNodeDataEventArgs e) in X:\Projects\Examine\Examine\src\Examine\Providers\BaseIndexProvider.cs:line 213
       at UmbracoExamine.UmbracoContentIndexer.OnGatheringNodeData(IndexingNodeDataEventArgs e)
       at Examine.LuceneEngine.Providers.LuceneIndexer.GetDataToIndex(XElement node, String type) in X:\Projects\Examine\Examine\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 1115
       at Examine.LuceneEngine.Providers.LuceneIndexer.ProcessIndexQueueItem(IndexOperation op, IndexWriter writer) in X:\Projects\Examine\Examine\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 1965
       at Examine.LuceneEngine.Providers.LuceneIndexer.ProcessQueueItem(IndexOperation item, ICollection`1 indexedNodes, IndexWriter writer) in X:\Projects\Examine\Examine\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 1676
       at Examine.LuceneEngine.Providers.LuceneIndexer.ForceProcessQueueItems(Boolean block) in X:\Projects\Examine\Examine\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 1530, IndexSet: ExternalIndexSet
    

    From what I've learned, "Illegal characters in path" is the key: something my client has uploaded has a character in the file name that Umbraco doesn't like. Or rather, one that System.IO.Path doesn't like. Or maybe it isn't an upload at all; maybe it's a node with a name that got through the requestHandler/urlReplacing handler.

    However, I can't figure out how to find the media (or node) that contains the illegal character. How would I do that?

    I tried copying all media and the actual DB from the server to my local environment, without luck.

  • Nick 34 posts 127 karma points
    Oct 03, 2017 @ 09:01
    Nick
    0

    Hey

    You could try searching the media file names in the database for any file name that contains one of the following characters, since they are invalid in a path:

    • double quote (")
    • left angle bracket (<)
    • right angle bracket (>)
    • vertical bar (|)
    • control characters (character codes below 32 decimal; 32 is the space character)
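
The list above can also be checked in code. A minimal sketch, assuming you have already pulled the suspect file names or urlNames out of the database into strings: `PathCharCheck` and the sample values are hypothetical, and the set returned by `Path.GetInvalidPathChars()` depends on the runtime and operating system, so run it on the same platform as the site.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PathCharCheck
{
    // Returns every character in 'value' that System.IO.Path reports
    // as invalid on the current runtime and operating system.
    public static List<char> FindInvalidPathChars(string value)
    {
        var invalid = new HashSet<char>(Path.GetInvalidPathChars());
        var found = new List<char>();
        foreach (char c in value ?? string.Empty)
        {
            if (invalid.Contains(c))
            {
                found.Add(c);
            }
        }
        return found;
    }

    public static void Main()
    {
        // Hypothetical sample values; replace them with names from the database.
        string[] samples = { "photo.jpg", "bad\u0000name.jpg", "fenêtre.jpg" };
        foreach (string s in samples)
        {
            foreach (char c in FindInvalidPathChars(s))
            {
                // Print the code point, because control characters are invisible.
                Console.WriteLine("'{0}' contains U+{1:X4}", s.Replace("\0", "\\0"), (int)c);
            }
        }
    }
}
```

Printing the code point rather than the character itself helps with control characters, which do not show up when you read the name in the back office or in a query result.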

    To search for the urlName of a node you could try the following script:

    SELECT TOP 1 [nodeId] FROM [db_owner].[cmsContentXml] where (cast([xml] as xml).value('(/{DOCUMENTTYPE}/@urlName)[1]', 'nvarchar(max)')) like '%>%'

    The [db_owner].[cmsPropertyData] table contains the names of the media images in the dataNtext column, so you could search that column for any invalid characters.

    Not sure if any of this helps

    Nick

  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 03, 2017 @ 09:15
    Anders Brohäll
    0

    Nice! However, I get an error in T-SQL: XQuery [value()]: ")" was expected.

    I can't really wrap my head around why!

    I'm on SQL 2016 Express.

  • Nick 34 posts 127 karma points
    Oct 03, 2017 @ 09:29
    Nick
    0

    Did you remove an extra bracket from the query?

  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 03, 2017 @ 09:34
    Anders Brohäll
    0

    No, I had to change db_owner to dbo, but apart from that nothing is changed:

    SELECT TOP 1 [nodeId] FROM dbo.[cmsContentXml] where (cast([xml] as xml).value('(/{DOCUMENTTYPE}/@urlName)[1]', 'nvarchar(max)')) like '%>%'

    Hmm.

  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 03, 2017 @ 09:36
    Anders Brohäll
    0

    Umbraco 7.5.11 if that changes anything.

  • Nick 34 posts 127 karma points
    Oct 03, 2017 @ 09:52
    Nick
    0

    Ah, whoops. You need to replace {DOCUMENTTYPE} with your document type. I don't think there is a way to do all document types at once.

  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 03, 2017 @ 11:21
    Anders Brohäll
    0

    Aaaah. Of course : )

    For future reference:

    SELECT TOP 1 [nodeId] FROM dbo.[cmsContentXml] where (cast([xml] as xml).value('(/Image/@urlName)[1]', 'nvarchar(max)')) like '%>%'

  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 03, 2017 @ 11:36
    Anders Brohäll
    0

    Oh well. I cannot find anything weird apart from an ê. But renaming the node and file didn't do anything.

    Any other suggestions?

    I figured out how to reproduce it in my local environment. Can I debug it somehow? VS won't break on anything that I can think of.

  • Nick 34 posts 127 karma points
    Oct 03, 2017 @ 12:32
    Nick
    0

    Hmm. If you can find it locally, I would try renaming the broken node and deleting the "ExamineIndexes" folder inside App_Data -> TEMP, then rebuilding the index. Do it all locally first to see whether that fixes the index, and then try it on live.

  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 03, 2017 @ 12:34
    Anders Brohäll
    0

    I can trigger the error when rebuilding the index from Developer/Examine Management.

    Renaming doesn't solve the issue, and neither does deleting the indexes :/

  • Nick 34 posts 127 karma points
    Oct 03, 2017 @ 12:58
    Nick
    0

    Ah, difficult. I think you could try using the actual source code to debug it locally, since you've got the live database.

    I believe this is the source code of what is called when you rebuild the index:

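            try
            {
                // Rebuild the index served by the "ExternalIndexer" provider.
                Examine.ExamineManager.Instance.IndexProviderCollection["ExternalIndexer"].RebuildIndex();
            }
            catch (Exception ex)
            {
                // Set a breakpoint here and inspect ex to identify the failing node.
            }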
    

    So if you run the above code somewhere and put a breakpoint in the catch block, you should be able to find the broken node and rename it.

  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 04, 2017 @ 08:29
    Anders Brohäll
    0

    Unfortunately, it seems that the exception thrown by CheckInvalidPathChars is swallowed further up the call stack, so RebuildIndex itself never throws and the catch block is never hit. Gaah.
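
One way to observe an exception that is caught before it reaches your own code is the `AppDomain.FirstChanceException` event, which fires when an exception is thrown, before any catch block runs. A minimal sketch, not something confirmed in this thread: `FirstChanceLogger` is a hypothetical helper, and the `throw` in `Main` is only a stand-in for the swallowed indexing failure. In a real site you would call `Attach()` once at startup and then rebuild the index.

```csharp
using System;
using System.Runtime.ExceptionServices;

public static class FirstChanceLogger
{
    // Number of ArgumentExceptions seen so far, swallowed or not.
    public static int ArgumentExceptionCount;

    // Subscribe once, before triggering the index rebuild.
    public static void Attach()
    {
        AppDomain.CurrentDomain.FirstChanceException += OnFirstChance;
    }

    private static void OnFirstChance(object sender, FirstChanceExceptionEventArgs e)
    {
        // Keep this handler simple: an exception thrown in here would
        // raise the event again.
        if (e.Exception is ArgumentException)
        {
            ArgumentExceptionCount++;
            Console.WriteLine(e.Exception.Message);
            // Call stack of the current thread at the moment of the throw.
            Console.WriteLine(Environment.StackTrace);
        }
    }

    public static void Main()
    {
        Attach();
        try
        {
            // Stand-in for the failure that the indexer swallows.
            throw new ArgumentException("Illegal characters in path.");
        }
        catch (ArgumentException)
        {
            // Swallowed here, but the handler has already logged it.
        }
    }
}
```

In Visual Studio, enabling "Common Language Runtime Exceptions" in the Exception Settings window makes the debugger break when an exception is thrown rather than only when it goes unhandled. Whether "Just My Code" also needs to be turned off to stop inside the ezSearch handler is an assumption worth testing.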

    I need to think about something else for now. I'll return and figure it out in a day or so.

    : )

  • Nadine Fisch 159 posts 429 karma points
    Oct 31, 2018 @ 12:52
    Nadine Fisch
    0

    I adjusted the script above and tried to find the offending node with the following SQL statement:

    SELECT * ,Len(urlName) as charlength FROM (SELECT nodeId ,  [xml], (cast([xml] as xml).value('(/*/@urlName)[1]', 'nvarchar(max)')) as urlName FROM dbo.[cmsContentXml]) as tt
    WHERE urlName IS NOT NULL 
    AND (
    urlName like '%<%'
    OR urlName like '%>%'
    OR urlName like '%|%'
    OR urlName like '%"%'
    OR urlName LIKE '%' + CHAR(32) + '%'
    )
    

    But I am still not finding any nodes apart from ones with German umlauts and long urlNames. Could these also affect RebuildIndex? I noticed that I can't run the index rebuild because of this error (illegal characters in path), and as a consequence some media files go missing from the index after running it from the Examine Management dashboard. I tried running the try/catch block from above, but I don't get an error message :(

    Does anyone have the same difficulties?

  • Nadine Fisch 159 posts 429 karma points
    Nov 05, 2018 @ 08:31
    Nadine Fisch
    0

    Does Umbraco have difficulties with German umlauts in the urlName? I adjusted my SELECT statement as shown below, and it only returned results for umlauts such as "ü":

    SELECT * ,Len(urlName) as charlength FROM (SELECT nodeId ,  [xml], (cast([xml] as xml).value('(/*/@urlName)[1]', 'nvarchar(max)')) as urlName FROM dbo.[cmsContentXml]) as tt
    WHERE urlName IS NOT NULL 
    AND (
    urlName like  '%[<>''"äüöÖÜÄ!@#$% |"]%'
    )
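
On the umlaut question: the stack trace at the top of this thread fails inside `System.IO.Path.GetFileName`, so one quick check is to pass names containing umlauts to that same call and see whether it throws. A minimal sketch; `UmlautCheck` and the sample paths are hypothetical, and which field value ezSearch actually passes to `GetFileName` is not known from this thread.

```csharp
using System;
using System.IO;

public static class UmlautCheck
{
    // True when Path.GetFileName accepts the value without throwing.
    public static bool IsAcceptedByGetFileName(string value)
    {
        try
        {
            Path.GetFileName(value);
            return true;
        }
        catch (ArgumentException)
        {
            // The exception type shown in the stack trace in this thread.
            return false;
        }
    }

    public static void Main()
    {
        // Hypothetical sample values; replace them with rows returned by the query.
        string[] samples = { "/media/1001/übersicht.jpg", "/media/1002/ÄÖÜ-äöü.pdf" };
        foreach (string s in samples)
        {
            Console.WriteLine("{0} -> accepted: {1}", s, IsAcceptedByGetFileName(s));
        }
    }
}
```

If the umlaut names are accepted, the rows matched only because of the umlauts in the LIKE pattern can probably be ruled out, and the control characters, which the pattern does not cover, are worth a separate look.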
    
  • Thomas 319 posts 606 karma points c-trib
    Oct 07, 2021 @ 09:32
    Thomas
    0

    I know it was a long time ago, but did you find a solution?

    I'm having the same issue on 7.15.7.
