This problem is still occurring if I try to boost search terms. In my example, I'm trying to boost results where the search term matches the node name. I still get the exception:
string[] searchTerms = "the search term".Split(' ');
var provider = ExamineManager.Instance.SearchProviderCollection["WebsiteSearcher"];
ISearchCriteria searchCriteria = provider.CreateSearchCriteria(BooleanOperation.Or);
foreach (var term in searchTerms)
{
    // Boost the results that match the title of the page.
    searchCriteria.Field("nodeName", term.Boost(_matchNodeNameBoostValue)).Or();
}
Lucene fails when searching on common words
Hi, does anyone know of any workarounds to prevent Lucene from crashing when searching on any of the common words that are ignored by Lucene?
For example, try any of those words (an, and, are...) in the search box above and the following error will occur:
[NullReferenceException: Object reference not set to an instance of an object.]
Lucene.Net.Search.BooleanQuery.Rewrite(IndexReader reader) +312
Lucene.Net.Search.BooleanQuery.Rewrite(IndexReader reader) +319
Lucene.Net.Search.IndexSearcher.Rewrite(Query original) +24
Lucene.Net.Search.Query.Weight(Searcher searcher) +24
Lucene.Net.Search.Searcher.CreateWeight(Query query) +11
Lucene.Net.Search.Searcher.Search(Query query, Filter filter, Int32 n, Sort sort) +15
Examine.LuceneEngine.SearchResults.DoSearch(Query query, IEnumerable`1 sortField) +191
Examine.LuceneEngine.SearchResults..ctor(Query query, IEnumerable`1 sortField, IndexSearcher searcher) +82
Examine.LuceneEngine.Providers.LuceneSearcher.Search(ISearchCriteria searchParams) +104
TIA,
Hendy
I am considering stripping out known words from the search before executing the query, but a more robust solution would be better :)
It's more just the fact that Lucene doesn't like two-letter anything.
Also, the word 'and' may indicate a boolean operation to Lucene. Generally this term is removed from the search, depending on which analyzer you are using. Which analyzer are you using to index/search?
You should not allow people to search on words that are less than 3 chars... it will fail. Or, if you are searching a phrase, you can surround it in quotes. But again, the analyzer should strip out these words, depending on which one you are using.
It's interesting. Both points appear to be valid here. If the text contains a word consisting of 2 alphabetic characters, such as "ag", it fails, as does putting the word "with" into a query. Here is an extension method I've just written which seems to do the job:
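A minimal sketch of that kind of extension method, for anyone who needs a stopgap. It only covers the short-word case, and the names SearchQueryExtensions and RemoveShortWords are hypothetical (they are not part of Examine or Lucene.Net, and this is not the method referred to above):

```csharp
using System;
using System.Linq;

public static class SearchQueryExtensions
{
    // Hypothetical helper: drops whitespace-separated words that are
    // shorter than minLength before the text is handed to the searcher.
    public static string RemoveShortWords(this string query, int minLength = 3)
    {
        if (string.IsNullOrWhiteSpace(query))
        {
            return string.Empty;
        }

        var kept = query
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(word => word.Length >= minLength);

        return string.Join(" ", kept);
    }
}
```

Calling "ag products with care".RemoveShortWords() drops "ag" but keeps "with", since "with" is four characters long. A length check on its own therefore does not deal with stop words.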
Hi Shannon,
Thanks for the suggestion. It would be a quick fix to strip out all words < 3 chars, but there are other words that cause Lucene to fall over, like:
We are using the Lucene.Net.Analysis.Standard.StandardAnalyzer.
Thanks,
Hendy
Very strange... TBH I've not seen this issue before, but I can replicate it. Can you log a bug for this at examine.codeplex.com? Perhaps there's a newer Lucene version that has this fixed!
Doh! Just realized the stack trace ends with:
Examine.LuceneEngine.SearchCriteria.LuceneSearchCriteria.GetFieldInternalQuery(String fieldName, IExamineValue fieldValue, Boolean useQueryParser) +1583
So it might just be something weird going on in the Examine codebase. If you could log a bug, that'd be fantastic.
@Richard - thanks for that extension method :) looks like that's the quick fix to ensure search doesn't bomb.
@Shannon - sure, I'll log a bug now.
I've fixed this with the latest changeset:
http://examine.codeplex.com/workitem/10326
Hopefully will get a new version out in the coming weeks.
The beta is released:
http://examine.codeplex.com/releases/view/67118
Please test if you can.
This problem is still occurring if I try to boost search terms. In my example, I'm trying to boost results where the search term matches the node name. I still get the exception:
What I ended up doing is tapping into Lucene.Net.Analysis.StopAnalyzer.ENGLISH_STOP_WORDS_SET and checking each of my search words against it, filtering out any that are stop words. This seems to fix the problem:
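A rough sketch of that kind of stop-word filtering, written so it runs without a Lucene.Net reference. The class and method names are hypothetical, and the hard-coded list is an assumption meant to mirror Lucene's default English stop words; in a real project you would read StopAnalyzer.ENGLISH_STOP_WORDS_SET from your Lucene.Net version instead:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StopWordFilter
{
    // Assumption: a local copy of the default English stop words, used here
    // in place of Lucene.Net.Analysis.StopAnalyzer.ENGLISH_STOP_WORDS_SET
    // so the sketch stays self-contained.
    private static readonly HashSet<string> StopWords =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "a", "an", "and", "are", "as", "at", "be", "but", "by", "for",
            "if", "in", "into", "is", "it", "no", "not", "of", "on", "or",
            "such", "that", "the", "their", "then", "there", "these", "they",
            "this", "to", "was", "will", "with"
        };

    // Splits the query on spaces and returns only the terms that are not stop words.
    public static string[] RemoveStopWords(string query)
    {
        if (string.IsNullOrWhiteSpace(query))
        {
            return new string[0];
        }

        return query
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(term => !StopWords.Contains(term))
            .ToArray();
    }
}
```

With the query from the boost example, StopWordFilter.RemoveStopWords("the search term") leaves "search" and "term" to loop over. If nothing is left after filtering (a query of just "and", say), it is probably safest to skip the search altogether rather than build empty criteria.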
Hope this helps anyone in a similar predicament.
Hi All,
My meta keyword field contains alphanumeric values (e.g. test1).
The meta keyword field is boosted. When I search for the term "test1", the search returns nothing. Does anyone know a fix for this in Examine search, so that users can search for alphanumeric words as well?
Regards,
Pushpendra singh
Pushpendra,
Which analyzer are you using when indexing and searching? (You can determine this by looking at your Examine config files; look at which index you are using.) Also, can you look in the index and see if that term is there? If you are using Umbraco 6 you can use the Examine Inspector package, or if you are using v7 you can use the search tools provided in the Umbraco backoffice.
Regards
Ismail
Ismail,
I am using two analyzers, WhitespaceAnalyzer as well as StandardAnalyzer, in my Examine settings config.
My field is present in ExamineIndex.config. The problem only occurs for alphanumeric terms, not for alphabetic ones.
My Umbraco version is 4.11.8.
Regards,
Pushpendra singh
Which index are you searching on, External? Also, when you say you are using 2 analyzers, is one for searching and the other for indexing? They need to be the same for each index for both indexing and searching, else you will get unexpected search results. PS: please do not cross-post; you have added a reply in another post.
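To illustrate one way a mismatch can lose results, here is a small sketch that does not use Lucene at all. The two tokenizers are deliberately simplified stand-ins (an assumption of this example): the real WhitespaceAnalyzer keeps the original case, and the real StandardAnalyzer lower-cases tokens and does more besides.

```csharp
using System;
using System.Linq;

public static class AnalyzerMismatchDemo
{
    // Simplified stand-in for WhitespaceAnalyzer: split on whitespace, keep case.
    public static string[] WhitespaceTokens(string text)
    {
        return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }

    // Simplified stand-in for StandardAnalyzer: split on whitespace, lower-case.
    // The real analyzer also handles punctuation and stop words.
    public static string[] LowercasedTokens(string text)
    {
        return WhitespaceTokens(text).Select(t => t.ToLowerInvariant()).ToArray();
    }

    // A query matches only if every query token is present, exactly, in the index.
    public static bool Matches(string[] indexedTokens, string[] queryTokens)
    {
        return queryTokens.All(q => indexedTokens.Contains(q, StringComparer.Ordinal));
    }
}
```

If "Test1 keyword" is indexed with WhitespaceTokens and the search for "Test1" goes through LowercasedTokens, the index holds "Test1" while the query asks for "test1", so Matches returns false. Using the same tokenizer on both sides makes it return true.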
Ismail,
Sorry for the cross-post. Could you please delete it? I am unable to delete it myself, as there is a known issue with deleting posts from the forum.
I am searching in a custom index set (MySiteSearchIndexSet), which is for my site, not internal or external:
Ismail,
I am trying to put a default provider in "ExamineIndexProviders" but it's throwing an error, and if I remove the default provider from "ExamineSearchProviders" it also throws an error. I am still trying; if I have any success I will let you know.
Regards,
Pushpendra