Change Examine analyzer (stop words, stemming, etc.)
Hello guys!
Lately I've been reading Lucene in Action, Second Edition, to prepare for building search functionality on a website I'm working on. The site is in Danish, so an English stop-word list doesn't make much sense. I'd rather use a Danish one, and I have the list, but I can't figure out how to plug my extension of StandardAnalyzer into Examine so that I can use it.
Furthermore, I'd like to ask whether any of you have experience with Examine/Lucene.Net and implementing autocomplete while searching, plus a "Did you mean?" function, because both are needed.
Would it make more sense to write my code against Lucene.Net directly rather than Examine, now that I've read how Lucene works and know the syntax?
Hopefully you know a bit more about this subject than I do and can give me some guidance. It would be much appreciated!
Niclas,
To update the stop words when using Examine, see http://our.umbraco.org/forum/developers/extending-umbraco/25600-Examine-case-insensitive-keyword-search
With regard to autocomplete, you could write something yourself: jQuery autocomplete on the client plus some kind of endpoint (web service, REST, /base or whatever you like) to give you back the results. With regard to "Did you mean?", I did it ages ago on a site, but that was using Lucene.Net directly. In theory you could still use the SpellChecker contrib package. As far as I remember (see page 279 of Lucene in Action, 2nd edition), you need the word you want to spell-check and the directory of the index.
You already have the word, and getting the directory path through Examine is something like
var index = (UmbracoContentIndexer)ExamineManager.Instance.IndexProviderCollection[myIndex];
You can then get lower-level access to the Lucene.Net objects and the path to the Lucene directory.
Regards
Ismail
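For anyone who wants to see the idea behind "Did you mean?" before wiring up the SpellChecker package, here is a minimal sketch in plain C#. It is not the SpellChecker contrib API; it only ranks a list of known terms by edit distance to the word the user typed. The names DidYouMean and Suggest, and the distance limit of 2, are made up for illustration, and where the list of terms comes from (for example the terms of your index) is left open.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DidYouMean
{
    // Classic dynamic-programming edit distance (insert, delete, substitute).
    public static int Levenshtein(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1),
                    previous[j - 1] + cost);
            }
            // The row just computed becomes the "previous" row of the next pass.
            var swap = previous; previous = current; current = swap;
        }
        return previous[b.Length];
    }

    // Returns the closest known terms, or an empty list if the word is already known.
    public static IList<string> Suggest(string word, IEnumerable<string> terms,
                                        int max = 3, int maxDistance = 2)
    {
        var known = terms.Distinct().ToList();
        if (known.Contains(word)) return new List<string>();

        return known
            .Select(t => new { Term = t, Distance = Levenshtein(word, t) })
            .Where(x => x.Distance <= maxDistance)
            .OrderBy(x => x.Distance)
            .ThenBy(x => x.Term, StringComparer.Ordinal)
            .Select(x => x.Term)
            .Take(max)
            .ToList();
    }
}
```

Comparing against every term is fine for a small list. For a real index you would want the package mentioned above, which keeps its own index so it does not have to scan everything.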
This might seem like a stupid question, but after reading the link you gave, I opened global.asax and can't get to the .cs file behind it, so I can't edit where I'm supposed to. How come?
Niclas,
If you are using the latest version of Umbraco, then to wire stuff up see http://our.umbraco.org/wiki/reference/api-cheatsheet/using-iapplicationeventhandler-to-register-events. This is the preferred method.
Regards
Ismail
I installed the UComment plugin. Unfortunately it posts comments on content (PageID), whereas I want to post comments on a specific product (ProductID). Can you help me with this?
Regards
Anis
Hey Ben Amar.
Sadly, I don't know anything about what you are asking. I recommend you create a topic of your own explaining the problem; there will surely be people who know more than I do and can help you.
Hope you find a way to figure it out! :)
- Niclas Schumacher
Ben,
I have given you the new topic link via Twitter twice. Here it is again: http://our.umbraco.org/forum/core/general/NewTopic. Create your question there and then you can get help. This topic is about a particular search issue.
Regards
Ismail
Ismail,
I've been looking over the links you sent, and I just can't figure out how to create a custom stop-word list. I've tried making a class that implements IApplicationStartupHandler, and from there I can fetch the current stop list, but I can't set it.
public class WebsiteSearchIndexerEvents : IApplicationStartupHandler
{
    public WebsiteSearchIndexerEvents()
    {
        /*var indexer = ExamineManager.Instance.IndexProviderCollection["WebsiteIndexer"];
        indexer.GatheringNodeData += GatheringNodeDataHandler;
        */
        var stopwords = new Hashtable();
        stopwords.Add("de", "de");
        stopwords.Add("skrevet", "skrevet");
        Lucene.Net.Analysis.StopAnalyzer.ENGLISH_STOP_WORDS_SET = stopwords; // this doesn't set it for the StandardAnalyzer
        var originalStopwordsList = Lucene.Net.Analysis.Standard.StandardAnalyzer.STOP_WORDS_SET;
    }
}
- Niclas Schumacher
Niclas,
Take a look at page 2 of http://our.umbraco.org/forum/developers/extending-umbraco/25600-Examine-case-insensitive-keyword-search. You need to create your own analyzer based on the standard analyzer and pass in the stop words. You will also need to update the Examine config to tell it to use that analyzer.
Regards
Ismail
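To make that advice concrete, here is a rough sketch of what such an analyzer could look like. Treat it as an assumption-laden starting point rather than the code from the linked thread: it assumes the Lucene.Net 2.9.x build where StandardAnalyzer accepts a Hashtable of stop words (later versions take a different set type), and it assumes Examine can create the analyzer through a parameterless constructor. The namespace, class name and assembly name are hypothetical, and the two stop words are just the ones from the post above.

```csharp
using System.Collections;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;

namespace MySite.Search // hypothetical namespace
{
    // Wraps StandardAnalyzer so tokenising stays the same and only the stop-word list changes.
    public class DanishStopWordAnalyzer : Analyzer
    {
        private readonly StandardAnalyzer inner;

        // Parameterless constructor: assumed to be what Examine needs
        // in order to create the analyzer from its config file.
        public DanishStopWordAnalyzer()
        {
            // StandardAnalyzer lowercases tokens before removing stop words,
            // so the list should be lowercase as well.
            var stopWords = new Hashtable();
            foreach (var word in new[] { "de", "skrevet" })
            {
                stopWords.Add(word, word);
            }

            // Assumption: the (Version, Hashtable) constructor of Lucene.Net 2.9.x.
            inner = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29, stopWords);
        }

        public override TokenStream TokenStream(string fieldName, TextReader reader)
        {
            return inner.TokenStream(fieldName, reader);
        }
    }
}

// In ExamineSettings.config the indexer and searcher entries would then name this
// class in their analyzer attribute, for example
// analyzer="MySite.Search.DanishStopWordAnalyzer, MySite" (assembly name is hypothetical).
```

After changing the analyzer the index has to be rebuilt, because content that is already indexed was analyzed with the old stop-word list.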
Ismail,
I'll check it out.
Though it is outside the scope of this post: do you have any experience with autocomplete functionality for Examine/Lucene? We would really like to offer live suggestions to users, but I've searched the web thin over the last couple of days and I'm still at rock bottom. I thought you might know a bit about it?
Niclas,
For the autocomplete client side, use http://jqueryui.com/autocomplete/
For the server-side element that supplies the data to the client via AJAX, you have a number of options:
Quick and dirty: create a new template, say for the home page, and call it autocomplete. Create a Razor macro, call it autocompleteajax, do the search in it, and write the data back from the macro as a JSON string containing page name and page URL.
Another option is to write a /base extension that sends back JSON results after performing a search. See http://our.umbraco.org/wiki/reference/umbraco-base and http://cultiv.nl/blog/2011/7/25/razor-vs-base-to-output-json-in-umbraco/
The final option, if you are using .NET 4.5, is to use the Umbraco WebApi and create the endpoint that way. See http://our.umbraco.org/documentation/Reference/WebApi/
Regards
Ismail
Hello Ismail.
I worked on it yesterday and made some great progress trying it out in a console application. I used another version of Lucene, one where Shingle is implemented, to create the terms for autocompletion. But when I wanted to merge it into my Umbraco application, there were complications with having two different versions of Lucene at the same time. There wasn't any easy way out of it as far as the seniors knew, so now I have to create a whole new site that will act as a proxy of some kind, which I'll call through jQuery Autocomplete to receive the autocomplete suggestions.
The question was more about how I could build autocomplete with Examine/Lucene in the current Umbraco version. But I've figured out how :) Shingle! Though, as mentioned earlier, it's from 3.0.3.0.
I'm not sure, but I wouldn't be able to use any of your three suggestions, would I, knowing that the app pool would contain two versions of Lucene and break?
Once again, a BIG thanks for being so patient and sharing your knowledge on this topic. You've helped me a great deal!
Niclas,
What is shingles? Have you got a link for it? Also, is your Lucene data source a separate index outside Umbraco? Also, you can have two Lucene versions: one of the old uComponents versions needed a different version of Lucene from the default Umbraco one. You then have to update the web.config to do it. See http://ucomponents.codeplex.com/downloads/get/447122; there is a readme in that zip.
Regards
Ismail
Shingle is an analyzer. For instance, from page 267 of Lucene in Action:
"The sentence 'please divide this sentence into shingles' might be tokenized into the shingles 'please divide', 'divide this', 'this sentence', 'sentence into', 'into shingles'."
As far as we found out, this is the best approach to autocomplete. When a user searches for something, I'll do a wildcard search on the word and return the best term "phrase" to the user, or maybe four examples guessing what the user wants. The issue is that this is based on content written by a user/editor, so the suggestions aren't as search-friendly as Google's. On Google I can ask a question and get a result from it; that isn't quite what we achieve at the moment. We compensate with a "best guess" field where the editor can write the best suggestions for searching on that specific node.
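The lookup side of that can be sketched in plain C# without any Lucene calls. This is only an illustration of the idea, assuming the shingle terms and how often each occurs have already been collected into a dictionary; the names Autocomplete and Suggest are made up, and a simple prefix match stands in for the wildcard query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Autocomplete
{
    // termCounts maps a shingle (phrase) to how often it occurs in the indexed content.
    // Returns up to `max` phrases that start with what the user has typed, most frequent first.
    public static List<string> Suggest(IDictionary<string, int> termCounts, string typed, int max = 4)
    {
        if (string.IsNullOrWhiteSpace(typed)) return new List<string>();

        // Keep trailing spaces: "search " should only match phrases with a second word.
        var prefix = typed.TrimStart();

        return termCounts
            .Where(t => t.Key.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            .OrderByDescending(t => t.Value)
            .ThenBy(t => t.Key, StringComparer.Ordinal)
            .Select(t => t.Key)
            .Take(max)
            .ToList();
    }
}
```

Entries from the editor's "best guess" field could be merged into the same dictionary with a higher count so that they come out on top.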
So at the moment I shingle-analyze the indexed content from Examine and use the results to suggest the autocomplete terms.
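For readers who have not seen shingles before, here is a small plain C# sketch of the idea. It is not Lucene's shingle analyzer, just a whitespace split that joins neighbouring words, and the names Shingles, FromText and Count are made up for the example. Lucene's shingle filter can also emit the single words, which this sketch leaves out.

```csharp
using System;
using System.Collections.Generic;

public static class Shingles
{
    // Splits on whitespace and joins each run of `size` adjacent words into one shingle.
    public static List<string> FromText(string text, int size = 2)
    {
        var words = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        var shingles = new List<string>();
        for (int i = 0; i + size <= words.Length; i++)
        {
            shingles.Add(string.Join(" ", words, i, size));
        }
        return shingles;
    }

    // Counts how often each shingle occurs across several pieces of content, ignoring case.
    public static Dictionary<string, int> Count(IEnumerable<string> contents, int size = 2)
    {
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (var content in contents)
        {
            foreach (var shingle in FromText(content, size))
            {
                int seen;
                counts.TryGetValue(shingle, out seen); // seen stays 0 for a new shingle
                counts[shingle] = seen + 1;
            }
        }
        return counts;
    }
}
```

Calling Shingles.FromText("please divide this sentence into shingles") gives the five two-word shingles from the book quote above.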
Before Lucene.Net got Shingle, there was a fellow who made this, which I've used as a starting point: http://codingsmith.co.za/lucene-net-phrase-suggestion/ and here is some documentation: http://lucenenet.apache.org/docs/3.0.3/d5/da5/class_lucene_1_1_net_1_1_analysis_1_1_shingle_1_1_shingle_analyzer_wrapper.html
I couldn't get your link to work on a quick try, though my app is quite broken at the moment. I'll give it another try when I get it up and running again, as soon as I've moved all the Lucene 3.0.3.0 code out to another project.