  • Niclas Schumacher 67 posts 87 karma points
    Aug 07, 2013 @ 13:29

    Change examine analyzer (stopwords, stemming etc)

    Hello guys!
    Lately I've been reading Lucene in Action, Second Edition, to prepare for building search functionality on a website I'm working on. The site is in Danish, so an English stopword list doesn't make much sense; I'd rather use a Danish one. I have the list, but I can't figure out how to plug my extension of StandardAnalyzer into Examine so that I can use it.
    Furthermore, I'd like to ask whether you have any experience using Examine/Lucene.Net with autocomplete while searching and a "Did you mean?" function, because both are needed in this solution.

    Now that I've read how Lucene works and know the syntax, would it make more sense to write my code against Lucene.Net directly rather than Examine?

    Hopefully you know a bit more about this subject than I do and can give me some guidance. It would be much appreciated!

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Aug 07, 2013 @ 13:49

    Niclas,

    To update stop words when using Examine, see http://our.umbraco.org/forum/developers/extending-umbraco/25600-Examine-case-insensitive-keyword-search

    With regards to autocomplete, you could write something yourself: jQuery autocomplete plus some kind of web service / REST / base endpoint, or whatever you like, to give you back the results. With regards to "did you mean", I did it ages ago on a site, but that was using Lucene.Net directly. In theory you could still use the SpellChecker contrib package. As far as I remember (see page 279 of Lucene in Action, 2nd edition), you need the word you want to spell check and the directory of the index.

    You already have the word, and getting the directory path through Examine is something like

    var index = (UmbracoContentIndexer)ExamineManager.Instance.IndexProviderCollection[myIndex];

    You can then get lower-level access to the Lucene.Net objects and the path to the Lucene directory.

    Regards

    Ismail
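
A rough illustration of the "did you mean" idea mentioned above. This is not the SpellChecker contrib package's API; it is a stand-in that uses plain Levenshtein edit distance over an in-memory list of known terms. The names `DidYouMean`, `Distance` and `Suggest` are hypothetical, and the list of known terms is assumed to come from somewhere like the index's term list.

```csharp
using System;
using System.Collections.Generic;

public static class DidYouMean
{
    // Classic dynamic-programming Levenshtein edit distance between two strings.
    public static int Distance(string a, string b)
    {
        var prev = new int[b.Length + 1];
        var curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            // Swap the two rows so "prev" always holds the last completed row.
            var tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.Length];
    }

    // Returns the closest known term within maxDistance edits.
    // Returns null when the input already is a known term or nothing is close enough.
    public static string Suggest(string input, IEnumerable<string> knownTerms, int maxDistance = 2)
    {
        string word = input.ToLowerInvariant();
        string best = null;
        int bestDistance = maxDistance + 1;
        foreach (var term in knownTerms)
        {
            int d = Distance(word, term.ToLowerInvariant());
            if (d == 0) return null; // spelled correctly, nothing to suggest
            if (d < bestDistance) { best = term; bestDistance = d; }
        }
        return best;
    }
}
```

Calling `DidYouMean.Suggest("serch", terms)` with a term list that contains "search" would propose "search". Scanning every term is only reasonable for small vocabularies; as far as I know, the contrib package avoids this by keeping a separate n-gram index.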

  • Niclas Schumacher 67 posts 87 karma points
    Aug 07, 2013 @ 14:48

    This might seem like a stupid question, but after reading the link you gave I opened global.asax, and I can't get to the .cs file behind it, so I can't edit where I'm supposed to. Why is that?

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Aug 07, 2013 @ 14:51

    Niclas,

    If you are using the latest version of Umbraco, then to wire things up see http://our.umbraco.org/wiki/reference/api-cheatsheet/using-iapplicationeventhandler-to-register-events; this is the preferred method.

    Regards

    Ismail

  • BEN AMAR 10 posts 30 karma points
    Aug 07, 2013 @ 15:48

    I installed the uComment plugin. Unfortunately it posts comments against content (PageID), whereas I want to post comments against a specific product (ProductID). Can you help me with this? Regards, Anis

  • Niclas Schumacher 67 posts 87 karma points
    Aug 07, 2013 @ 15:51

    Hey Ben Amar.

    Sadly, I don't know anything about what you are asking. I recommend you create a topic of your own explaining the problem; there will surely be people who know more than I do and can help you.
    Hope you find a way to figure it out! :)

    - Niclas Schumacher

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Aug 07, 2013 @ 16:02

    Ben,

    I have given you the new topic link via Twitter twice. Here it is again: http://our.umbraco.org/forum/core/general/NewTopic. Create your question there and then you can get help. This topic is about a particular search issue.

    Regards

    Ismail

  • Niclas Schumacher 67 posts 87 karma points
    Aug 08, 2013 @ 14:46

    Ismail,

    I've been looking over the links you sent, and I just can't figure out how to create a custom stopword list. I've tried making a class that implements IApplicationStartupHandler, and from there I can fetch the actual stop list, but I can't set it.

    public class WebsiteSearchIndexerEvents : IApplicationStartupHandler
    {
        public WebsiteSearchIndexerEvents()
        {
            /*var indexer = ExamineManager.Instance.IndexProviderCollection["WebsiteIndexer"];
            indexer.GatheringNodeData += GatheringNodeDataHandler;
            */

            var stopwords = new Hashtable();
            stopwords.Add("de", "de");
            stopwords.Add("skrevet", "skrevet");
            Lucene.Net.Analysis.StopAnalyzer.ENGLISH_STOP_WORDS_SET = stopwords; // this doesn't set it for the StandardAnalyzer
            var originalStopwordsList = Lucene.Net.Analysis.Standard.StandardAnalyzer.STOP_WORDS_SET;
        }
    }

    - Niclas Schumacher

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Aug 12, 2013 @ 10:25

    Niclas,

    Take a look at page 2 of http://our.umbraco.org/forum/developers/extending-umbraco/25600-Examine-case-insensitive-keyword-search. You need to create your own analyzer based on the standard analyzer and pass in your stop words. You will also need to update the Examine config to tell it to use that analyzer.

    Regards

    Ismail
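
A minimal sketch of the analyzer described above, offered with assumptions rather than as the confirmed solution from the linked thread. It assumes the Lucene.Net 2.9.x API, where `StandardAnalyzer` has a `(Version, Hashtable)` constructor, and it assumes Examine instantiates the configured analyzer through a parameterless constructor. The namespace, the class name `DanishStopWordAnalyzer` and every word in the list other than the two from the post above are illustrative placeholders.

```csharp
using System.Collections;
using Lucene.Net.Analysis.Standard;
using Version = Lucene.Net.Util.Version;

namespace MySite.Search // hypothetical namespace
{
    // StandardAnalyzer behaviour, but with a Danish stop word set instead of the English default.
    public class DanishStopWordAnalyzer : StandardAnalyzer
    {
        // Exposed so the list can be inspected; replace the sample words with the full Danish list.
        public static readonly Hashtable DanishStopWords = BuildStopWords();

        // Parameterless on purpose (assumption: Examine creates the analyzer without arguments).
        public DanishStopWordAnalyzer()
            : base(Version.LUCENE_29, DanishStopWords)
        {
        }

        private static Hashtable BuildStopWords()
        {
            // "de" and "skrevet" come from the post above; the rest are sample words only.
            var words = new[] { "de", "skrevet", "og", "i", "at", "det", "en", "den" };
            var set = new Hashtable();
            foreach (var word in words)
            {
                set[word] = word; // same key/value convention as the snippet in the question
            }
            return set;
        }
    }
}
```

The type would then be named in the `analyzer` attribute of the relevant indexer and searcher entries in the Examine config, in the usual "Namespace.Type, Assembly" form. The same analyzer should be used for indexing and searching, and the index needs rebuilding after the change so that existing documents are re-analyzed.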

  • Niclas Schumacher 67 posts 87 karma points
    Aug 12, 2013 @ 16:01

    Ismail,

    I'll check it out.
    Though it is outside the scope of this post: do you have any experience with autocomplete functionality for Examine/Lucene? We would really like to offer live suggestions to users, but I've scoured the web for the last couple of days and I'm still at rock bottom. I thought you might know a bit about it?

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Aug 15, 2013 @ 10:08

    Niclas,

    For the auto complete client side use http://jqueryui.com/autocomplete/

    For the server-side element that supplies the data to the client via AJAX, you have a number of options:

    1. Quick and dirty: create a new template, say for the home page, and call it autocomplete. Create a Razor macro, call it autocompleteajax, and do the search in that. Write the data back from the macro as a JSON string containing the page name and page URL.

    2. Another option is to write a base extension that sends back JSON results after performing a search. See http://our.umbraco.org/wiki/reference/umbraco-base and http://cultiv.nl/blog/2011/7/25/razor-vs-base-to-output-json-in-umbraco/

    3. The final option, if you are using .NET 4.5, is to use Umbraco WebApi and create the endpoint that way; see http://our.umbraco.org/documentation/Reference/WebApi/

    Regards

    Ismail

  • Niclas Schumacher 67 posts 87 karma points
    Aug 15, 2013 @ 10:33

    Hello Ismail.

    I worked on it yesterday and made some great progress trying it out in a console application. I used another version of Lucene, one in which Shingle is implemented, to create the terms for autocompletion. But when I wanted to merge it into my Umbraco application, there were complications with having two different versions of Lucene at the same time. There wasn't any easy way out of it, as far as the seniors knew, so now I have to create a whole new site to act as a kind of proxy, which I'll call through jQuery Autocomplete to receive the autocomplete suggestions.

    My question was more about how I could do autocomplete with Examine/Lucene in the current Umbraco version. But I've figured out how: Shingle! As mentioned earlier, though, it's from 3.0.3.0.

    I'm not sure, but wouldn't I be unable to use any of your three suggestions, knowing that the app pool would contain two versions of Lucene and break?

    Once again, a BIG thanks for being so patient and sharing your knowledge on this topic. You've helped me a great deal!

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Aug 15, 2013 @ 10:40

    Niclas,

    What is Shingle? Do you have a link for it? Also, is your Lucene data source a separate index outside Umbraco? And you can have two Lucene versions: one of the old uComponents versions needed a different version of Lucene from the default Umbraco one. You then have to update the web.config to do it; see http://ucomponents.codeplex.com/downloads/get/447122, there is a readme in that zip.

    Regards

    Ismail

  • Niclas Schumacher 67 posts 87 karma points
    Aug 15, 2013 @ 11:21

    Shingle is an analyzer. For instance, from page 267 of Lucene in Action:
    "The sentence 'please divide this sentence into shingles' might be tokenized into the shingles 'please divide', 'divide this', 'this sentence', 'sentence into', 'into shingles'."

    As far as we found out, this is the best approach to autocomplete. When a user types something, I'll do a wildcard search on the word and return the best term ("phrase") to the user, or maybe four examples that try to guess what the user wants. The issue is that the suggestions are based on content written by a user/editor, so they aren't as search-friendly as Google's. With Google I can ask a question and get a result from that, which isn't quite what we achieve at the moment. We compensate with a "best guess" field where the editor can write the best search suggestions for that specific node.
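
The lookup described above can be sketched roughly as follows. This is a stand-in, not Examine or Lucene.Net code: a plain prefix match over an in-memory table takes the place of the wildcard query, and the phrase counts are made-up sample data. `PhraseSuggester` and `Suggest` are hypothetical names; the default of four results mirrors the "maybe four examples" idea.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PhraseSuggester
{
    // Returns up to "max" phrases that start with what the user has typed,
    // most frequent first; ties are broken alphabetically for stable output.
    public static List<string> Suggest(IDictionary<string, int> phraseCounts, string typed, int max = 4)
    {
        return phraseCounts
            .Where(p => p.Key.StartsWith(typed, StringComparison.OrdinalIgnoreCase))
            .OrderByDescending(p => p.Value)
            .ThenBy(p => p.Key, StringComparer.Ordinal)
            .Take(max)
            .Select(p => p.Key)
            .ToList();
    }
}

public static class PhraseSuggesterDemo
{
    public static void Main()
    {
        // Sample data: phrase -> number of times it occurs in the indexed content.
        var counts = new Dictionary<string, int>
        {
            { "divide this", 3 },
            { "divide by", 5 },
            { "please divide", 1 }
        };

        foreach (var phrase in PhraseSuggester.Suggest(counts, "div"))
        {
            Console.WriteLine(phrase);
        }
    }
}
```

Editor-supplied "best guess" phrases could be merged into the same table with a boosted count so that they rank above phrases derived from body text.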

    So at the moment I run the shingle analyzer over the indexed content from Examine and use the resulting shingles to suggest the autocomplete terms.
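
To make the shingling step concrete, here is a self-contained sketch that produces word-level shingles from a sentence. It does not use Lucene.Net's shingle classes, which have more options than this; `Shingler` and `Shingles` are hypothetical names, and splitting on spaces is a simplification of real tokenization.

```csharp
using System;
using System.Collections.Generic;

public static class Shingler
{
    // Splits text on spaces and returns every run of "size" consecutive words.
    public static List<string> Shingles(string text, int size = 2)
    {
        if (size < 1) throw new ArgumentOutOfRangeException("size");

        string[] words = text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var result = new List<string>();
        for (int i = 0; i + size <= words.Length; i++)
        {
            result.Add(string.Join(" ", words, i, size));
        }
        return result;
    }
}

public static class ShinglerDemo
{
    public static void Main()
    {
        // Writes the five two-word shingles from the book example, one per line.
        foreach (var shingle in Shingler.Shingles("please divide this sentence into shingles"))
        {
            Console.WriteLine(shingle);
        }
    }
}
```

Counting how often each shingle occurs across all indexed documents gives a simple popularity score for ranking suggestions.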

    Before Lucene.Net got Shingle, there was a fellow who made this, which I've used as a starting point: http://codingsmith.co.za/lucene-net-phrase-suggestion/ and here is some documentation: http://lucenenet.apache.org/docs/3.0.3/d5/da5/class_lucene_1_1_net_1_1_analysis_1_1_shingle_1_1_shingle_analyzer_wrapper.html

    I couldn't get your link to work with a quick try, but my app is quite broken at the moment. I'll give it another try when I get it up and running again, as soon as I've moved all the Lucene 3.0.3.0 code out to another project.
     


     
