
  • Harry Spyrou 212 posts 604 karma points
    Mar 04, 2019 @ 12:24
    Harry Spyrou
    0

    Search through Nested Content Collection that contains Rich Text Editors for a string

    Hello,

    I have a rather weird search functionality to implement and I was thinking of Examine.

    I have nested content items that have text in them. I'd like the site's search to crawl through all of them looking for the query string.

    I thought about doing it by field e.g.

    if (myField.Contains(query))
    

    But that's impractical, as there are a lot of fields in the module. Is there a way to go through the IEnumerable collection of Nested Content items and search for a string?

    I'm not sure how to approach this. Has anyone done something similar?

    Thanks

  • Hendy Racher 863 posts 3849 karma points MVP 2x admin c-trib
    Mar 04, 2019 @ 13:46
    Hendy Racher
    0

    Hi Harry,

    The package Look might be able to help as it'll index Nested Content items as full Lucene Documents, and you can set a custom Text indexer for these.

  • Harry Spyrou 212 posts 604 karma points
    Mar 04, 2019 @ 13:49
    Harry Spyrou
    0

    Oh thanks a lot Hendy! When you say full Lucene Documents, do you mean a page (as in, without modules)? I haven't heard the term before.

  • Hendy Racher 863 posts 3849 karma points MVP 2x admin c-trib
    Mar 04, 2019 @ 13:52
    Hendy Racher
    0

    No, the Examine Umbraco indexers create a Lucene document for each Umbraco Content, Media or Member being indexed, so Nested Content being a property on these will be indexed into the same Lucene document.

    Separating Nested Content items out into their own Lucene documents allows us to search on them more easily.

  • Harry Spyrou 212 posts 604 karma points
    Mar 04, 2019 @ 13:53
    Harry Spyrou
    0

    Ah I see. I didn't know that. But once you have them as Lucene documents, do they have any link to the original page? Ideally I'd like to have the Url of the page they are part of.

  • Hendy Racher 863 posts 3849 karma points MVP 2x admin c-trib
    Mar 04, 2019 @ 13:59
    Hendy Racher
    100

    E.g. to find all nested content items containing some text:

    var results = new LookQuery("MySearcher")
    {
        NodeQuery = new NodeQuery()
        {
            Aliases = new[] { "doctypeAliasOfNestedContent" },
            DetachedQuery = DetachedQuery.OnlyDetached
        },
        TextQuery = new TextQuery() { SearchText = "my text to search for" }
    }
    .Search();
    

    To get the 'host' URLs for each of the found detached content items (nested content):

    var pages = results.Matches.Select(x => x.HostItem.Url);
    
  • Harry Spyrou 212 posts 604 karma points
    Mar 04, 2019 @ 14:00
    Harry Spyrou
    0

    You're a god, that's exactly what I need. I appreciate the help a lot! Cheers.

  • Hendy Racher 863 posts 3849 karma points MVP 2x admin c-trib
    Mar 04, 2019 @ 14:04
    Hendy Racher
    0

    You'll need to configure an indexer so that the text you want searchable for each nested content item is indexed, e.g.

    LookConfiguration.TextIndexer = x =>
    {
        if (x.HostItem.DocumentTypeAlias == "docTypeAliasContainingTheNestedContent")
        {
            return x.Item.GetPropertyValue<string>("rtePropertyAliasOnNestedContent");
        }
    
        return null;
    };
    
  • Harry Spyrou 212 posts 604 karma points
    Mar 04, 2019 @ 14:07
    Harry Spyrou
    0

    I have quite a few RTE properties with different aliases. Can I do that in the .config file where the indexer is defined, by adding the RTE field aliases there instead of in code?

  • Hendy Racher 863 posts 3849 karma points MVP 2x admin c-trib
    Mar 04, 2019 @ 14:14
    Hendy Racher
    0

    You can combine all the RTEs into a single text field (assuming you just want to find the nested content item via some text, without needing to know which RTE it came from).

  • Hendy Racher 863 posts 3849 karma points MVP 2x admin c-trib
    Mar 04, 2019 @ 14:12
    Hendy Racher
    0

    Sorry, I forgot to mention you'll need to use a Look indexer (which is also an Examine indexer) rather than an Examine Umbraco indexer, as it's the Look indexer that handles detached content and creates the Lucene documents for it, e.g.

    add these into ExamineSettings.config:

    <add name="MyIndexer" type="Our.Umbraco.Look.LookIndexer, Our.Umbraco.Look" />
    
    <add name="MySearcher" type="Our.Umbraco.Look.LookSearcher, Our.Umbraco.Look" />
    

    and this into ExamineIndex.config:

      <IndexSet SetName="MyIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/MyIndex/" />
    
  • Harry Spyrou 212 posts 604 karma points
    Mar 04, 2019 @ 14:17
    Harry Spyrou
    0

    What about if I do it like this and then let it search its index? Or do I specifically have to set it with my C# code?

    [Screenshot not preserved]

  • Hendy Racher 863 posts 3849 karma points MVP 2x admin c-trib
    Mar 04, 2019 @ 14:22
    Hendy Racher
    0

    Unfortunately that feature isn't ready yet :(

  • Harry Spyrou 212 posts 604 karma points
    Mar 04, 2019 @ 14:24
    Harry Spyrou
    0

    Ah I see. You talked about combining all RTEs into a single text field. What do you mean by that?

  • Hendy Racher 863 posts 3849 karma points MVP 2x admin c-trib
    Mar 04, 2019 @ 14:34
    Hendy Racher
    0

    Hi Harry, can you describe the query you'd like to make?

    (The reason to combine all RTEs into a single field is to be able to search for text in any of the RTEs by searching only one field.)
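
    As a rough illustration of what "combining" could mean in practice, here is a minimal sketch that takes the raw HTML from several RTEs, strips the markup, and joins the results into one searchable string. The class and method names (`RichTextCombiner`, `Combine`) are hypothetical and not part of Look or Examine, and the regex-based tag removal is a crude assumption that suits simple RTE output only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;

public static class RichTextCombiner
{
    // Hypothetical helper (not a Look/Examine API): strips tags from each RTE
    // value and joins the remaining text into a single string for indexing.
    public static string Combine(IEnumerable<string> rteValues)
    {
        var parts = rteValues
            .Where(html => !string.IsNullOrWhiteSpace(html))          // skip empty or null RTEs
            .Select(html => Regex.Replace(html, "<[^>]+>", " "))       // crude tag removal
            .Select(text => WebUtility.HtmlDecode(text))               // decode entities such as &amp;
            .Select(text => Regex.Replace(text, @"\s+", " ").Trim())   // collapse whitespace
            .Where(text => text.Length > 0);

        return string.Join(" ", parts);
    }
}
```

    Something like this could then be called from the text indexer, passing in the values of each RTE property alias on the nested content item, so that a single text field covers all of them.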

  • Harry Spyrou 212 posts 604 karma points
    Mar 04, 2019 @ 14:40
    Harry Spyrou
    0

    It's just a search bar, so there's no specific query. A user searches for an article on a page, and I'd like to search the whole article (which is composed of Nested Content modules containing RTEs with different aliases) to see if the string they typed is in there.

    I'm trying out your code but for some reason whatever I type comes back as empty. I've added my nested content collection aliases in the NodeQuery etc.

  • Hendy Racher 863 posts 3849 karma points MVP 2x admin c-trib
    Mar 04, 2019 @ 14:50
    Hendy Racher
    0

    (don't forget to index, by saving the node again, or triggering a full index rebuild)

    As you want to find pages from text, it's probably a good idea to combine all the text related to a page into a single field. You could avoid using Look and do this manually by hooking into the GatheringNodeData event and pulling all the relevant text out of the nested content items, but Look would work quite well here.

    BTW, the nested content alias will be that of the document type used as the source for the nested content (rather than a property alias).
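
    For the manual route, the core of such a handler is just writing one extra combined field next to the existing ones. The sketch below shows only that step on a plain dictionary. `CombinedFieldBuilder`, `AddCombinedField` and the field aliases are hypothetical names, and it is an assumption that the event arguments expose the node's values as a string dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CombinedFieldBuilder
{
    // Hypothetical helper, not part of Examine or Look: joins the values of the
    // given source fields into a single target field on the same dictionary.
    public static void AddCombinedField(
        IDictionary<string, string> fields,
        IEnumerable<string> sourceAliases,
        string targetAlias)
    {
        var values = sourceAliases
            .Where(alias => fields.ContainsKey(alias))            // ignore aliases not present on this node
            .Select(alias => fields[alias])
            .Where(value => !string.IsNullOrWhiteSpace(value))
            .ToList();                                            // materialise before mutating the dictionary

        fields[targetAlias] = string.Join(" ", values);
    }
}
```

    In a real handler the Nested Content values would probably need to be turned into plain text first, since they are likely to arrive as serialized JSON rather than readable text.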

  • Harry Spyrou 212 posts 604 karma points
    Mar 04, 2019 @ 15:11
    Harry Spyrou
    0

    Well,

    Without using any of your code, if I go into Examine management and trigger an index rebuild on the look index that you gave me, it throws a stack overflow exception.

    I'm not sure how to proceed in this case.

  • Hendy Racher 863 posts 3849 karma points MVP 2x admin c-trib
    Mar 04, 2019 @ 15:14
    Hendy Racher
    0

    Doh! Any chance you could raise an issue here?

  • Harry Spyrou 212 posts 604 karma points
    Mar 04, 2019 @ 15:16
    Harry Spyrou
    0

    Sure but can't today, will do as soon as I'm free.

  • Hendy Racher 863 posts 3849 karma points MVP 2x admin c-trib
    Mar 13, 2019 @ 14:21
    Hendy Racher
    1

    Hi Harry, thanks for raising an issue - just posting here as a reference to it.

  • Harry Spyrou 212 posts 604 karma points
    Mar 04, 2019 @ 14:58
    Harry Spyrou
    0

    I think Look is exactly what's needed here. I'm not sure what's going wrong just yet and why the enumeration comes back as empty but I'm looking into it.

    I've added the Nested Content Aliases as the document type used and I can see that in the backoffice LookIndexer (I've named it that) has 33 documents in Index. Every time I try to trigger an index rebuild the application shuts down without telling me the error. It just says 'The application is in break mode'.

    That's odd, I'll look more into it.

    Edit: It's a Stack Overflow Exception
