  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 02, 2017 @ 14:01
    Anders Brohäll
    0

    Illegal characters in path.

    When gathering node data, for example when rebuilding the external index, I get an entry in the log telling me:

    System.Exception: Error indexing queue items,System.ArgumentException: Illegal characters in path.
       at System.IO.Path.CheckInvalidPathChars(String path, Boolean checkAdditional)
       at System.IO.Path.GetFileName(String path)
       at Our.Umbraco.ezSearch.ezSearchBoostrapper.OnGatheringNodeData(Object sender, IndexingNodeDataEventArgs e)
       at Examine.Providers.BaseIndexProvider.OnGatheringNodeData(IndexingNodeDataEventArgs e) in X:\Projects\Examine\Examine\src\Examine\Providers\BaseIndexProvider.cs:line 213
       at UmbracoExamine.UmbracoContentIndexer.OnGatheringNodeData(IndexingNodeDataEventArgs e)
       at Examine.LuceneEngine.Providers.LuceneIndexer.GetDataToIndex(XElement node, String type) in X:\Projects\Examine\Examine\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 1115
       at Examine.LuceneEngine.Providers.LuceneIndexer.ProcessIndexQueueItem(IndexOperation op, IndexWriter writer) in X:\Projects\Examine\Examine\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 1965
       at Examine.LuceneEngine.Providers.LuceneIndexer.ProcessQueueItem(IndexOperation item, ICollection`1 indexedNodes, IndexWriter writer) in X:\Projects\Examine\Examine\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 1676
       at Examine.LuceneEngine.Providers.LuceneIndexer.ForceProcessQueueItems(Boolean block) in X:\Projects\Examine\Examine\src\Examine\LuceneEngine\Providers\LuceneIndexer.cs:line 1530, IndexSet: ExternalIndexSet
    

    From what I've learned, "Illegal characters in path" is the key: something my client has uploaded has a character in the file name that Umbraco doesn't like. Or rather, that System.IO.Path doesn't like. Or maybe it isn't an upload at all; maybe it's a node with a name that got through the requestHandler/urlReplacing handler.

    However, I can't figure out how to find the media (or node) that contains the illegal character. How would I do that?

    I tried copying all the media and the actual DB from the server to my local environment, without luck.

  • Nick 34 posts 127 karma points
    Oct 03, 2017 @ 09:01
    Nick
    0

    Hey

    You could try searching the media file names in the database for any file name that contains one of the following characters, since they are invalid in paths:

    • double quote (")
    • left angle bracket (<)
    • right angle bracket (>)
    • vertical bar (|)
    • control characters (character codes below 32, i.e. anything below the space character)

    To search for the urlName of a node you could try the following script:

    SELECT TOP 1 [nodeId] FROM [db_owner].[cmsContentXml] where (cast([xml] as xml).value('(/{DOCUMENTTYPE}/@urlName)[1]', 'nvarchar(max)')) like '%>%'

    The [db_owner].[cmsPropertyData] table contains the names of the media images in the dataNtext column, so you could search that column for any invalid characters.
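
    The invalid-character list above can also be checked in code once you have pulled candidate values out of the database. This is only a minimal sketch: the class and method names (PathCharCheck, FindIllegalChars) are made up for illustration, and the character set is hard-coded from the list above rather than taken from any Umbraco API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Umbraco or Examine.
public static class PathCharCheck
{
    // The four printable characters from the list above.
    private static readonly char[] IllegalPrintable = { '"', '<', '>', '|' };

    // Returns the distinct illegal characters found in value (empty if none).
    public static IList<char> FindIllegalChars(string value)
    {
        if (string.IsNullOrEmpty(value)) return new List<char>();

        return value
            .Where(c => c < 32 || IllegalPrintable.Contains(c)) // control chars or listed chars
            .Distinct()
            .ToList();
    }
}
```

    Note that accented letters such as ê or ü are not on the list, so this helper does not flag them, and neither does a plain space (character code 32).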

    Not sure if any of this helps

    Nick

  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 03, 2017 @ 09:15
    Anders Brohäll
    0

    Nice! However, I get an error in T-SQL: XQuery [value()]: ")" was expected.

    I can't really wrap my head around why!

    I'm on SQL 2016 Express.

  • Nick 34 posts 127 karma points
    Oct 03, 2017 @ 09:29
    Nick
    0

    Did you remove an extra bracket from the query?

  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 03, 2017 @ 09:34
    Anders Brohäll
    0

    No, I had to change db_owner to dbo, but apart from that nothing is changed:

    SELECT TOP 1 [nodeId] FROM dbo.[cmsContentXml] where (cast([xml] as xml).value('(/{DOCUMENTTYPE}/@urlName)[1]', 'nvarchar(max)')) like '%>%'

    Hmm.

  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 03, 2017 @ 09:36
    Anders Brohäll
    0

    Umbraco 7.5.11 if that changes anything.

  • Nick 34 posts 127 karma points
    Oct 03, 2017 @ 09:52
    Nick
    0

    Ah, whoops. You need to replace {DOCUMENTTYPE} with your document type. I don't think there is a way to do all document types at once.

  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 03, 2017 @ 11:21
    Anders Brohäll
    0

    Aaaah. Of course : )

    For future reference:

    SELECT TOP 1 [nodeId] FROM dbo.[cmsContentXml] where (cast([xml] as xml).value('(/Image/@urlName)[1]', 'nvarchar(max)')) like '%>%'

  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 03, 2017 @ 11:36
    Anders Brohäll
    0

    Oh well. I cannot find anything weird apart from an ê. But renaming the node and file didn't do anything.

    Any other suggestions?

    I figured out how I can reproduce it in my local environment. Can I debug it somehow? Visual Studio won't break on anything that I can think of.

  • Nick 34 posts 127 karma points
    Oct 03, 2017 @ 12:32
    Nick
    0

    Hmm. If you can find it locally, I would try renaming the broken node and deleting the "ExamineIndexes" folder inside App_Data -> TEMP, then rebuilding the index. Do it all locally first to see whether that fixes your index, and then try it on live.

  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 03, 2017 @ 12:34
    Anders Brohäll
    0

    I can trigger the error when rebuilding the index from Developer/Examine Management.

    Renaming doesn't solve the issue, and neither does deleting the indexes :/

  • Nick 34 posts 127 karma points
    Oct 03, 2017 @ 12:58
    Nick
    0

    Ah, difficult. I think you could try using the actual source code to debug it locally, since you've got the live database.

    I believe this is the source code of what is called when you rebuild the index:

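            try
            {
                // Rebuild the external index from your own code so the debugger is attached
                Examine.ExamineManager.Instance.IndexProviderCollection["ExternalIndexer"].RebuildIndex();
            }
            catch (Exception ex)
            {
                // Set a breakpoint here and inspect ex
            }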
    

    So if you run the above code somewhere and put a breakpoint in the catch block, you should be able to find the broken node and rename it.
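
    Another way to narrow it down, sketched under assumptions: the stack trace shows the exception coming from Path.GetFileName inside ezSearch's OnGatheringNodeData handler, so a second handler on the same GatheringNodeData event could report which node carries a suspicious value. The class name IllegalPathCharFinder is made up for illustration, and the choice of the umbracoFile field is an assumption (it is the default alias of the media upload property, but nothing in this thread confirms which field ezSearch passes to Path.GetFileName).

```csharp
using System.Diagnostics;
using System.Linq;
using Examine;
using Umbraco.Core;

// Hypothetical class name; Umbraco 7 runs ApplicationEventHandler subclasses at startup.
public class IllegalPathCharFinder : ApplicationEventHandler
{
    protected override void ApplicationStarted(
        UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
    {
        // Same indexer name as in the snippet above.
        ExamineManager.Instance.IndexProviderCollection["ExternalIndexer"]
            .GatheringNodeData += OnGatheringNodeData;
    }

    private static void OnGatheringNodeData(object sender, IndexingNodeDataEventArgs e)
    {
        // Assumption: "umbracoFile" is the field whose value ends up in Path.GetFileName.
        string value;
        if (!e.Fields.TryGetValue("umbracoFile", out value) || value == null) return;

        // Characters rejected in paths: " < > | and control characters below 32.
        if (value.Any(c => c < 32 || c == '"' || c == '<' || c == '>' || c == '|'))
        {
            // Put a breakpoint here, or read the node ID from the debug output.
            Debug.WriteLine("Node " + e.NodeId + " has a suspicious umbracoFile value: " + value);
        }
    }
}
```

    Debug.WriteLine only writes in debug builds. Handlers on the same event run in subscription order, so if the ezSearch handler runs first and throws, this one may never be reached for that node; treat it as a starting point rather than a guaranteed catch.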

  • Anders Brohäll 295 posts 561 karma points c-trib
    Oct 04, 2017 @ 08:29
    Anders Brohäll
    0

    Unfortunately, it seems the exception from CheckInvalidPathChars is swallowed somewhere along the way, so RebuildIndex itself never throws. Gaah.

    I need to think about something else for now. I'll return and figure it out in a day or so.

    : )

  • Nadine Fisch 159 posts 429 karma points
    Oct 31, 2018 @ 12:52
    Nadine Fisch
    0

    I adjusted the script above to try to find the offending node with the following SQL statement:

    SELECT * ,Len(urlName) as charlength FROM (SELECT nodeId ,  [xml], (cast([xml] as xml).value('(/*/@urlName)[1]', 'nvarchar(max)')) as urlName FROM dbo.[cmsContentXml]) as tt
    WHERE urlName IS NOT NULL 
    AND (
    urlName like '%<%'
    OR urlName like '%>%'
    OR urlName like '%|%'
    OR urlName like '%"%'
    OR urlName LIKE '%' + CHAR(32) + '%'
    )
    

    But I am still not finding any nodes, apart from ones with German umlauts and long urlNames. Could those also affect RebuildIndex? I noticed that I can't run the index rebuild because of this error (illegal characters in path), and as a consequence some media files get lost after running it from the Examine Manager. I tried executing the try/catch block from above, but I don't get an error message :(

    Does anyone have the same difficulties?

  • Nadine Fisch 159 posts 429 karma points
    Nov 05, 2018 @ 08:31
    Nadine Fisch
    0

    Does Umbraco have difficulties with German umlauts in the urlName? I adjusted my select statement as follows, and the only results I found were for umlauts like "ü":

    SELECT * ,Len(urlName) as charlength FROM (SELECT nodeId ,  [xml], (cast([xml] as xml).value('(/*/@urlName)[1]', 'nvarchar(max)')) as urlName FROM dbo.[cmsContentXml]) as tt
    WHERE urlName IS NOT NULL 
    AND (
    urlName like  '%[<>''"äüöÖÜÄ!@#$% |"]%'
    )
    
  • Thomas 319 posts 606 karma points c-trib
    Oct 07, 2021 @ 09:32
    Thomas
    0

    I know it was a long time ago, but did you find a solution?

    I'm having the same issue on 7.15.7.
