  • Tom 711 posts 948 karma points
    Sep 18, 2017 @ 07:11

    Examine Add Custom Grid Plugins to Examine Index

    I've gone down the Skybrud route of creating custom converters to index grid content in an Examine index, with the aim of building a combined search index of normal and weighted content (titles etc.).

    For example: value = GridControlInfoBlockCollectionValue.Parse(control, token as JObject);

    Here GridControlInfoBlockCollectionValue indexes the heading and then a JSON collection of items, each with its own title, content etc.:

    [JsonProperty("heading")]
    public string Heading { get; protected set; }

    [JsonProperty("items")]
    public IEnumerable<GridControlInfoBlockValue> Items { get; protected set; }

    Has anyone done anything in this area, such as parsing the plugin HTML templates, to index custom grid plugins in a more generic way?

    Cheers. Tom

  • Ismail Mayat 4335 posts 9340 karma points MVP 2x admin c-trib
    Sep 18, 2017 @ 10:29

    Tom,

    I did some dirty screen scraping using Html Agility Pack on a site where we had loads of grid content. The grid elements also had macros that rendered content, so I went for the quick and dirty way of indexing.

    This involved using the GatheringNodeData event: for the current page, make a request and load the page content into Html Agility Pack, then use XPath to get the grid content and push the content of that node, minus the HTML, into the index.

    One advantage of this is that if the grid changes in future I do not have to worry about it.
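
    A minimal sketch of that flow, not the code from the site in question. It assumes Umbraco 7 with the default "ExternalIndexer" index, the HtmlAgilityPack package, and a grid rendered inside a wrapper with the class umb-grid. The class name GridScrapeIndexer and the field name gridContent are made up for illustration, and on a multilingual site the URL lookup would need more care than shown here:

    ```csharp
    using System;
    using Examine;
    using HtmlAgilityPack;
    using Umbraco.Core;
    using Umbraco.Core.Logging;
    using UmbracoExamine;

    // Hypothetical handler: scrapes the rendered page and indexes the grid text.
    public class GridScrapeIndexer : ApplicationEventHandler
    {
        protected override void ApplicationStarted(
            UmbracoApplicationBase umbracoApplication,
            ApplicationContext applicationContext)
        {
            // Assumption: the index is the default "ExternalIndexer".
            ExamineManager.Instance.IndexProviderCollection["ExternalIndexer"]
                .GatheringNodeData += OnGatheringNodeData;
        }

        private static void OnGatheringNodeData(object sender, IndexingNodeDataEventArgs e)
        {
            // Only content nodes have a page to request.
            if (e.IndexType != IndexTypes.Content) return;

            try
            {
                // Request the published page for the node being indexed.
                var url = umbraco.library.NiceUrlWithDomain(e.NodeId);
                var doc = new HtmlWeb().Load(url);

                // Assumption: the grid is wrapped in an element with class "umb-grid".
                var grid = doc.DocumentNode
                    .SelectSingleNode("//div[contains(@class, 'umb-grid')]");
                if (grid == null) return;

                // InnerText drops the markup; DeEntitize turns &amp; etc. back into text.
                e.Fields["gridContent"] = HtmlEntity.DeEntitize(grid.InnerText);
            }
            catch (Exception ex)
            {
                // A failed scrape should not stop the rest of the node being indexed.
                LogHelper.Error<GridScrapeIndexer>(
                    "Grid scrape failed for node " + e.NodeId, ex);
            }
        }
    }
    ```

    The new field also has to be included in whatever fields the searcher queries, otherwise the scraped text is indexed but never searched.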

    Regards

    Ismail

  • Tom 711 posts 948 karma points
    Sep 18, 2017 @ 21:08

    Hi Ismail, that sounds like a decent way to go in our case too, given the macros etc.

    Were there any gotchas I should be aware of from your time in the trenches?

    Did you end up using umbracoHelper.RenderTemplate, or did you stick with the scraping approach to deal with the macros?

    Also, with the scraping, did you run into "Lucene.Net.Store.AlreadyClosedException: this IndexReader is closed" exceptions?

    Is there a way to lock and release, or something similar, given that scraping is a longer-running operation?

    Thanks for the reply :)

  • Ismail Mayat 4335 posts 9340 karma points MVP 2x admin c-trib
    Sep 19, 2017 @ 07:48

    Tom,

    The site I was working on was multilingual, so there was a bit of messing around to get the correct URL to scrape. Overall it all worked fine. If you have a lot of content and do a full rebuild, it may slow things down a bit; however, on that site we have not had to do a full rebuild. You just publish a page and GatheringNodeData kicks in.

    I have it as an exercise on the Examine course under indexing complex data types. I also posted code here: https://our.umbraco.org/forum/extending-umbraco-and-using-the-api/76413-examine-indexing-rich-and-complex-property-editors

    Regards

    Ismail

  • Tom 711 posts 948 karma points
    Sep 19, 2017 @ 22:57

    Thanks Ismail, I implemented something very similar and had to remove conditional comments etc. to get the scraped text cleaned up.

    The main issue I've had since implementing it in 7.6.5 is the "IndexReader is closed" exception if the scrape takes too long.

    I worked around it by getting the scraper to send a header, then checking for that header when rendering and leaving out excess calls to tracking tags, scripts etc., but the error still crops up intermittently.

    Also, is the ExamineHelper in your example a singleton or a static class of some type?
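
    For anyone following along, a rough sketch of that kind of clean-up using only the .NET base class library. The class and method names (ScrapedHtmlCleaner, ToIndexableText) are made up for this example, and regular expressions are a blunt tool for HTML, so treat it as a starting point rather than a robust parser:

    ```csharp
    using System;
    using System.Net;
    using System.Text.RegularExpressions;

    // Hypothetical helper: reduces scraped markup to plain text for indexing.
    public static class ScrapedHtmlCleaner
    {
        // Conditional comments such as <!--[if lt IE 9]> ... <![endif]-->
        private static readonly Regex ConditionalComments = new Regex(
            @"<!--\[if[^\]]*\]>.*?<!\[endif\]-->",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);

        // Any remaining ordinary HTML comments.
        private static readonly Regex Comments = new Regex(
            @"<!--.*?-->", RegexOptions.Singleline);

        // Script, style and noscript elements, including their contents.
        private static readonly Regex ScriptsAndStyles = new Regex(
            @"<(script|style|noscript)\b[^>]*>.*?</\1\s*>",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);

        private static readonly Regex Tags = new Regex(@"<[^>]+>");
        private static readonly Regex Whitespace = new Regex(@"\s+");

        public static string ToIndexableText(string html)
        {
            if (string.IsNullOrEmpty(html)) return string.Empty;

            // Order matters: remove whole blocks before stripping individual tags.
            var text = ConditionalComments.Replace(html, " ");
            text = Comments.Replace(text, " ");
            text = ScriptsAndStyles.Replace(text, " ");
            text = Tags.Replace(text, " ");

            // Turn entities such as &amp; back into characters, then tidy whitespace.
            text = WebUtility.HtmlDecode(text);
            return Whitespace.Replace(text, " ").Trim();
        }
    }
    ```

    Passing it a fragment such as "<div><!--[if lt IE 9]><p>old</p><![endif]--><h2>Title</h2><script>track();</script><p>Body &amp; more</p></div>" gives "Title Body & more".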

    Thanks again for your time
