  • Connie DeCinko 931 posts 1160 karma points
    Sep 22, 2015 @ 22:26

    Examine Search Fails for two character words

    We use Examine to index a database with rule definitions. I recently noticed that a basic search phrase such as "admission motion" works fine, but as soon as I add a two-character word in the middle, as in "admission on motion", we get a null reference error:

    [NullReferenceException: Object reference not set to an instance of an object.]
       Lucene.Net.Search.BooleanQuery.Rewrite(IndexReader reader) +353
       Lucene.Net.Search.BooleanQuery.Rewrite(IndexReader reader) +367
       Lucene.Net.Search.IndexSearcher.Rewrite(Query original) +29
       Lucene.Net.Search.Query.Weight(Searcher searcher) +52
       Lucene.Net.Search.Searcher.Search(Query query, Filter filter, Int32 n, Sort sort) +29
       Examine.LuceneEngine.SearchResults.DoSearch(Query query, IEnumerable`1 sortField) +258
       Examine.LuceneEngine.Providers.BaseLuceneSearcher.Search(ISearchCriteria searchParams) +181
       Ethics.DAL.EthicsDB.getSearchedOpinions(String searchPhrase) +3109
       Ethics.Opinions.Page_Load(Object sender, EventArgs e) +682
    

    At first look, I cannot see anything wrong with my code. Thoughts?

        public static DataTable getSearchedOpinions(string searchPhrase)
        {
            searchPhrase = Regex.Replace(searchPhrase, @"[^a-zA-Z0-9'.\s/\*""]{1,40}", String.Empty);
            searchPhrase = searchPhrase.Replace(" & ", " & ");
            searchPhrase = searchPhrase.Replace(" and ", " ");
            searchPhrase = Regex.Replace(searchPhrase, @"\s+", " ");
            string[] splitSearchPhrase = Utilities.Split(searchPhrase, " ", "\"");
    
            DataTable dt = new DataTable();
            dt.Columns.Add("opinion_ID", typeof(Int32));
            dt.Columns.Add("title", typeof(string));
            dt.Columns.Add("summary", typeof(string));
            dt.Columns.Add("bodytext", typeof(string));
            dt.Columns.Add("opinionDate", typeof(DateTime));
            dt.Columns.Add("smallOpinionDate", typeof(string));
            dt.DefaultView.Sort = "opinion_ID DESC";
    
            Examine.SearchCriteria.IBooleanOperation filter = null;
    
            int i = 0;
    
            var criteria2 = ExamineManager.Instance
            .SearchProviderCollection["EthicsOpinionsSearcher"]
            .CreateSearchCriteria();
    
            filter = null;
    
            i = 0;
            for (i = 0; i < splitSearchPhrase.Length; i++)
            {
                if (filter == null)
                {
                    filter = criteria2
                    .GroupedOr(new string[] { "opinion_Num", "title", "summary", "bodytext", "opiniondate", "smallOpinionDate" }, splitSearchPhrase[i].Escape())
                    .Or()
                    .GroupedOr(new string[] { "title", "summary", "bodytext" }, splitSearchPhrase[i].Fuzzy(0.7f));
                }
                else
                {
                    filter = filter
                    .And()
                    .GroupedOr(new string[] { "title", "summary", "bodytext", "opiniondate" }, splitSearchPhrase[i].Escape())
                    .Or()
                    .GroupedOr(new string[] { "title", "summary", "bodytext" }, splitSearchPhrase[i].Fuzzy(0.7f));
                }
            }
            ISearchResults SearchResults2 = ExamineManager.Instance.SearchProviderCollection["EthicsOpinionsSearcher"].Search(filter.Compile());
    
            foreach (var row in SearchResults2)
            {
                DataRow dr = dt.NewRow();
                dr["opinion_ID"] = row.Fields["opinion_ID"];
                dr["title"] = row.Fields["title"];
                dr["summary"] = row.Fields["summary"];
                dr["bodytext"] = row.Fields["bodytext"];
                dr["opinionDate"] = row.Fields["opinionDate"];
                dr["smallOpinionDate"] = row.Fields["smallOpinionDate"];
                dt.Rows.Add(dr);
            }
    
            return dt;
        }
    
  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Sep 23, 2015 @ 09:24

    Connie,

    Which analyser are you using to create the index? Your Examine config file will have the definition. If you are using the standard analyser, all English stop words will be removed.
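
    For reference, a sketch of where the analyser is usually declared: an `analyzer` attribute on the provider entry in the Examine settings. The searcher name is taken from the code in the question. The attribute value assumes the standard analyser is wanted explicitly. As far as I know, Examine falls back to the standard analyser when the attribute is omitted, but check this against your Examine version.

    ```xml
    <!-- Assumption: explicit analyzer attribute on the existing searcher entry. -->
    <add name="EthicsOpinionsSearcher"
         type="Examine.LuceneEngine.Providers.LuceneSearcher, Examine"
         analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" />
    ```

    The indexer entry normally carries the same attribute, and the two should match so that terms are analysed the same way at index time and at search time.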

    If you are getting a null reference error, it may be because one of the search results is missing a field value, so the loop that fills the DataTable throws. I would step through the code to see if that is the case.

    Also, I would get a copy of the generated Lucene query and run it in Luke to see what you get back; this may also indicate what the problem is.

    Regards

    Ismail

  • Connie DeCinko 931 posts 1160 karma points
    Sep 23, 2015 @ 20:38

    It looks like I am not specifying an analyser, so I must be using the default one?

      <add name="EthicsOpinionsSearcher"
           type="Examine.LuceneEngine.Providers.LuceneSearcher, Examine" />
    

    The error occurs when the filter is passed to the searcher, before the results are even added to the table.

  • Connie DeCinko 931 posts 1160 karma points
    Sep 23, 2015 @ 21:02

    It seems to be narrowed down to one line...

    .GroupedOr(new string[] { "opinion_Num", "title", "summary", "bodytext", "opiniondate", "smallOpinionDate" }, splitSearchPhrase[i].Escape())
    

    of the else portion of...

            for (i = 0; i < splitSearchPhrase.Length; i++)
            {
                if (filter == null)
                {
                    filter = criteria2
                    .GroupedOr(new string[] { "opinion_Num", "title", "summary", "bodytext", "opiniondate", "smallOpinionDate" }, splitSearchPhrase[i].Escape())
                    .Or()
                    .GroupedOr(new string[] { "title", "summary", "bodytext" }, splitSearchPhrase[i].Fuzzy(0.7f));
                }
                else
                {
                    filter = filter
                    .And()
                    .GroupedOr(new string[] { "opinion_Num", "title", "summary", "bodytext", "opiniondate", "smallOpinionDate" }, splitSearchPhrase[i].Escape())
                    .Or()
                    .GroupedOr(new string[] { "title", "summary", "bodytext" }, splitSearchPhrase[i].Fuzzy(0.7f));
                }
            }
    

    That line fails on noise words such as "for", "and", "on", and "of".

  • Squazz 35 posts 111 karma points
    Sep 24, 2015 @ 13:05

    I had this problem myself. Try looking at your Lucene query; you might find entries like "()" in some places.

    In my scenario, when Lucene found a "()", it died.

    These empty parentheses occurred for me when the standard analyzer was working with words like "for", "and", "on", and "of" (the stop words in the standard analyzer).

    In Global.asax, try pasting the following code into your OnApplicationStarted() method.

    Lucene.Net.Analysis.StopAnalyzer.ENGLISH_STOP_WORDS_SET = new System.Collections.Hashtable();
    

    It empties the stop-word list, so the analyzer no longer strips any words. That was one of the keys to solving my problem back then.
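
    A different approach, sketched here rather than tested against Examine, is to leave the analyzer alone and drop stop words from the split search phrase before the loop builds the filter. `SearchTermFilter` and `RemoveStopWords` are hypothetical names, not part of Examine or Lucene.Net. The word list is my recollection of Lucene's default English stop words, so verify it against your Lucene.Net version.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical helper; not part of Examine or Lucene.Net.
    public static class SearchTermFilter
    {
        // Assumed copy of the default English stop words; verify against your Lucene.Net version.
        private static readonly HashSet<string> StopWords = new HashSet<string>(
            new[]
            {
                "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
                "in", "into", "is", "it", "no", "not", "of", "on", "or", "such",
                "that", "the", "their", "then", "there", "these", "they", "this",
                "to", "was", "will", "with"
            },
            StringComparer.OrdinalIgnoreCase);

        // Returns the non-empty terms that are not stop words, in their original order.
        public static string[] RemoveStopWords(IEnumerable<string> terms)
        {
            return terms
                .Where(t => !string.IsNullOrWhiteSpace(t))
                .Where(t => !StopWords.Contains(t.Trim()))
                .ToArray();
        }
    }
    ```

    In the code from the question this would be called as `splitSearchPhrase = SearchTermFilter.RemoveStopWords(splitSearchPhrase);` just before the `for` loop. If every term is a stop word, the array is empty and `filter` stays null, so `filter.Compile()` would itself throw; return the empty DataTable early in that case.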

    All that being said, if you could provide the Lucene query, it would help us a lot in helping you ;)
