Searching by name not including some nodes in results
Hi,
I'm having issues with search results. It picks up most things, but the big problem is searching for events. Some terms display the event and others don't. See below for tests and a link.
http://www.fernehamhall.co.uk/search
Search term tests:
Other terms work absolutely fine though e.g.
Does anyone have any suggestions on what could be the issue?
P.S. I've already rebuilt the index, with no luck.
Thanks in advance
Hi Kev,
It's hard to say without looking into the index to see what is going on. The best I can suggest is to install Luke (https://code.google.com/p/luke/), debug into ezSearch.cshtml to just before the query runs, then copy the generated query into Luke and see which part is failing.
If you can do that, maybe copy the Lucene query here and I'll see if I can spot anything obvious.
Matt
Not sure if you have seen this patch to better support phrases in ezSearch (added by user DougMac):
https://github.com/DougMac/ezSearch/commit/a74644d5dbebba1772f3eebe8a4d1a669f8a32f9
Hi Keilo, I don't think Doug's fix would make much of a difference here. As Kev's searches are not phrase searches, i.e. he isn't wrapping the phrases in quotes, the fix by Doug wouldn't apply.
Ultimately, ezSearch splits multi-term searches not wrapped in quotes into OR tokens, so what Kev is searching for should yield results.
Matt
Hi Matt
You have a point. As he stated, other multi-term searches do return results whereas certain ones don't, all without quotes.
I was playing around with ezSearch as I saw the post and, recently, Doug's updates. Off topic, but I was trying to figure out if I can add a simple dropdown where the user can choose to sort results by date (most recent) or by relevance. Not sure if you have implemented something similar with it?
cheers
I think the issue here is that the Umbraco ExternalSearcher, based on the Lucene standard analyser, has a default set of stop words.
As ezSearch is doing a grouped OR on the keywords, if a keyword is also a default stop word then no results will be returned.
Currently testing this theory. I think the easiest thing to do is for ezSearch to strip these words from a keyword search (they're fine in a phrase search). It is possible to specify the stop words in the Lucene analyser constructor, but this would require a change in the Umbraco core.
Hmm, I think you might be right, Doug. I set up a node named "The Sound of Music" on my test install, and searching for "Sound of Music" does fail, but if I change line 63 of ezSearch.cshtml to the following:
it does indeed return results (after stripping out any stop words). I guess I need to work out if stripping stop words is a good idea or not. I think the only extra thing to do would be to check after tokenization to make sure we still have a search term to search for (i.e. make sure all tokens weren't stop words) and, if it's now empty, not show any results (basically one more big if statement round everything).
Anyway, I think the code above should give Kev the results he needs for now though.
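For anyone following along, the stop-word stripping and the "nothing left to search for" check described above can be sketched roughly like this. This is an illustration in Python, not the actual ezSearch.cshtml change (that isn't reproduced in this thread). The `tokenize` function and `STOP_WORDS` names are made up for the example, and the word list is assumed to match Lucene's default English stop words, so check it against your Lucene version.

```python
import re

# Assumption: this mirrors Lucene's default English stop word set.
STOP_WORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that",
    "the", "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
})

# Matches either a double-quoted phrase or a run of non-space characters.
_TOKEN_RE = re.compile(r'"([^"]+)"|(\S+)')


def tokenize(search_term):
    """Split a search string into phrases and keywords (hypothetical helper).

    Quoted phrases are kept whole; standalone keywords that are stop words
    are dropped, since a phrase never equals a single entry in the set.
    """
    tokens = []
    for phrase, word in _TOKEN_RE.findall(search_term):
        if phrase.strip():
            tokens.append(phrase.strip())
        elif word and word.lower() not in STOP_WORDS:
            tokens.append(word)
    return tokens


if __name__ == "__main__":
    for term in ['Sound of Music', '"The Sound of Music"', 'the of']:
        tokens = tokenize(term)
        if tokens:
            print(term, "->", tokens)
        else:
            # Every token was a stop word, so skip the query entirely.
            print(term, "-> nothing left to search for, show no results")
```

Filtering inside the tokenizer keeps phrase searches intact while still guarding against a query made up entirely of stop words.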
Matt
Worked perfectly for me!
I think that will break phrase searches that contain stop words.
I've put my changes here https://github.com/DougMac/ezSearch/compare/patch-3?expand=1. Needs a bit more testing before a pull request.
Doug, would it? The stop words are only stop words if they are on their own. Passing a phrase into the where statement would not match an entry in the set so wouldn't get removed and so should still go through.
I'll have to take a look at your patch a bit later as it's showing quite a lot of changes and I haven't got much time to look through it at the moment.
Matt
Yes, you're right. For some reason I was thinking the contains in the where statement was checking the opposite way.
Ignore my patch (there were a lot of changes reported due to the additional if statement, but the main change was in the Tokenize method).
Hey Doug,
Cool. Yeah, I've moved the where statement into the Tokenize method now, so it sounds like we are pretty much in line with one another. All good.
If I get some time tonight, I'll pull down your pull requests and push these all out as an update.
Many thanks
Matt
Thank you so much for this fix, guys!
I really appreciate the time & effort.
Kev