Hi Ismail, thanks for your reply. Here are some of the build errors I can't figure out.
On this line:
protected Dictionary QueryParsers = new Dictionary();
the compiler complains:
Using the generic type 'System.Collections.Generic.Dictionary<TKey,TValue>' requires 2 type arguments
Isn't it a generic Dictionary? Am I missing something? I do have all the usings from the example.
On this line:
var scorer = new QueryScorer(query.Rewrite(searcher.GetIndexReader()));
I get:
The type 'Lucene.Net.Search.Query' is defined in an assembly that is not referenced. You must add a reference to assembly 'Lucene.Net, Version=2.0.0.4, Culture=neutral, PublicKeyToken=null'.
The best overloaded method match for 'Lucene.Net.Highlight.QueryScorer.QueryScorer(Lucene.Net.Search.Query)' has some invalid arguments
I do have a reference to Lucene.Net, which comes with Umbraco. Do I need to replace it? I also added a reference to Lucene.Net.Highlight, which comes from the zip file on that page.
There are other similar errors, so I'm curious why it doesn't work for me.
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Examine.LuceneEngine.SearchCriteria;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Highlight;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Version = Lucene.Net.Util.Version;

namespace Domain.Helpers
{
    public class LuceneHighlightHelper
    {
        // Lucene.Net 2.9.x names this constant LUCENE_29.
        private readonly Version _luceneVersion = Version.LUCENE_29;
        protected Dictionary<string, QueryParser> QueryParsers = new Dictionary<string, QueryParser>();

        public string Separator { get; set; }
        public int MaxNumHighlights { get; set; }
        public Formatter HighlightFormatter { get; set; }
        public Analyzer HighlightAnalyzer { get; set; }

        private static readonly LuceneHighlightHelper _instance = new LuceneHighlightHelper();

        public static LuceneHighlightHelper Instance
        {
            get { return _instance; }
        }

        private LuceneHighlightHelper()
        {
            Separator = "...";
            MaxNumHighlights = 5;
            HighlightAnalyzer = new StandardAnalyzer(_luceneVersion);
            HighlightFormatter = new SimpleHTMLFormatter("<em><strong>", "</strong></em> ");
        }

        /// <summary>
        /// builds a highlighted excerpt from the given text for the given query
        /// </summary>
        /// <param name="indexField">text content to take the excerpt from</param>
        /// <param name="searchQuery">query in lucene format</param>
        /// <param name="highlightField">field to highlight</param>
        /// <param name="searcher">searcher opened on the index that was queried</param>
        /// <returns>highlighted search result, or an empty string when nothing matches</returns>
        public string GetHighlight(string indexField, string searchQuery, string highlightField, IndexSearcher searcher)
        {
            string highlightText = string.Empty;
            var highlighter = new Highlighter(HighlightFormatter, FragmentScorer(searchQuery, highlightField, searcher));
            var tokenStream = new StandardAnalyzer(_luceneVersion).TokenStream(highlightField, new StringReader(indexField));
            // up to 3 best fragments, joined with "..."
            string tmp = highlighter.GetBestFragments(tokenStream, indexField, 3, "...");
            if (tmp.Length > 0)
            {
                highlightText = tmp + "...";
            }
            return highlightText;
        }

        /// <summary>
        /// builds the scorer used to rank fragments of a search result
        /// </summary>
        /// <param name="searchQuery">query in lucene format</param>
        /// <param name="highlightField">field to highlight</param>
        /// <param name="searcher">searcher opened on the lucene index</param>
        /// <returns>scorer for the rewritten query</returns>
        private QueryScorer FragmentScorer(string searchQuery, string highlightField, IndexSearcher searcher)
        {
            // rewrite against the index reader so wildcard/prefix terms expand to concrete terms
            Query query =
                GetLuceneQueryObject(searchQuery, highlightField).Rewrite(searcher.GetIndexReader());
            return new QueryScorer(query);
        }

        /// <summary>
        /// gets the query object to be used by the highlighter
        /// </summary>
        /// <param name="q">query string</param>
        /// <param name="field">field to query on</param>
        /// <returns>lucene query object</returns>
        private Query GetLuceneQueryObject(string q, string field)
        {
            var qt = new QueryParser(field, new StandardAnalyzer(_luceneVersion));
            qt.SetMultiTermRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
            return qt.Parse(q);
        }
    }
}
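// needs: using System.IO; using Examine; using Examine.LuceneEngine.Providers;
//        using Lucene.Net.Search; using Lucene.Net.Store;
public static IndexSearcher GetExternalSearcher()
{
    var indexer = ExamineManager.Instance.IndexProviderCollection[Constants.Examine.SteinIasIndexer];
    var dir = new DirectoryInfo(((LuceneIndexer)indexer).LuceneIndexFolder.FullName);
    FSDirectory directory = FSDirectory.Open(dir);
    // second argument opens the searcher read-only
    IndexSearcher searcher = new IndexSearcher(directory, true);
    return searcher;
}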
Showing examine search results in a pretty way
Hi guys,
I've set up a site search on my v6.1.6 installation; it works and returns some results. The next step is to display them properly, with a link and a text excerpt in which the search term is highlighted.
I've seen some articles about it, e.g. http://our.umbraco.org/wiki/how-tos/how-to-highlight-text-in-examine-search-results
But I can't build the helper from that particular tutorial as it gives me tons of errors and doesn't compile.
Has anyone used Lucene.Net.Highlight successfully? Or are there any other ways to do it?
Thank you!
Zakhar,
What are the errors you are getting? With Examine/Lucene, this is the way to highlight the matched term.
Regards
Ismail
Hi Ismail, thanks for your reply. Here are some of the build errors I can't figure out.
On this line:
protected Dictionary QueryParsers = new Dictionary();
the compiler complains:
Using the generic type 'System.Collections.Generic.Dictionary<TKey,TValue>' requires 2 type arguments
Isn't it a generic Dictionary? Am I missing something? I do have all the usings from the example.
On this line:
var scorer = new QueryScorer(query.Rewrite(searcher.GetIndexReader()));
I get:
The type 'Lucene.Net.Search.Query' is defined in an assembly that is not referenced. You must add a reference to assembly 'Lucene.Net, Version=2.0.0.4, Culture=neutral, PublicKeyToken=null'.
The best overloaded method match for 'Lucene.Net.Highlight.QueryScorer.QueryScorer(Lucene.Net.Search.Query)' has some invalid arguments
I do have a reference to Lucene.Net, which comes with Umbraco. Do I need to replace it? I also added a reference to Lucene.Net.Highlight, which comes from the zip file on that page.
There are other similar errors, so I'm curious why it doesn't work for me.
Zakhar,
This was put together a while back by Darren Ferguson. Since then, later versions of Umbraco have moved to Lucene 2.9.2, which I guess is what you are referencing? Let me check my site where I use it, as it's Umbraco 4.11.7 with a later version of Lucene.
Regards
Ismail
Zakhar,
I have the following running using Lucene 2.9.4.1:
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Examine.LuceneEngine.SearchCriteria;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Highlight;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Version = Lucene.Net.Util.Version;

namespace Domain.Helpers
{
    public class LuceneHighlightHelper
    {
        private readonly Version _luceneVersion = Version.LUCENE_29;
    }
}
Thanks Ismail,
It looks better, but I think I have an old version of Lucene.Net.Highlight (2.0.0.1). Do you have a newer one? Can you share it, please?
Zakhar.
Zakhar,
You are right. To get my later version, go to https://dl.dropboxusercontent.com/u/21109333/Lucene.Net.Contrib.Highlighter.dll
Regards
Ismail
Thanks! I built it, will try the helper next week.
Zakhar.
Hi Ismail,
Thank you for your help today. I'm new to Lucene and I'm trying to figure out how to use that helper now. Can you give an example of how you use it, please?
Cheers
Zakhar,
I used it on a WebForms website, in a user control with a repeater that binds to the search results, so my call looks like
Then in my code behind:
One thing to note: I have all my fields merged into "contents", a new field that I inject into the index, and I search on that. You will want to pass in the field you are searching on. Please note that with highlighting you can only pass in one field.
Regards
Ismail
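A minimal sketch of calling the helper for one search hit, assuming the LuceneHighlightHelper class from this thread is compiled into the project. The class name HighlightUsageSketch, the method name Excerpt, the 200-character fallback and the way the query string and searcher are obtained are all illustrative assumptions, not the code from the original post:

```csharp
using Domain.Helpers;
using Lucene.Net.Search;

public static class HighlightUsageSketch
{
    // bodyText:    the text of one search hit (e.g. the value of its "contents" field)
    // luceneQuery: the raw Lucene query string that produced the hit
    // searcher:    an IndexSearcher opened on the same index
    public static string Excerpt(string bodyText, string luceneQuery, IndexSearcher searcher)
    {
        // Only one field can be highlighted, so pass the field that was searched on.
        // "contents" is the merged field described above; substitute your own field name.
        string excerpt = LuceneHighlightHelper.Instance
            .GetHighlight(bodyText, luceneQuery, "contents", searcher);

        // GetHighlight returns an empty string when no fragment matches,
        // so fall back to a plain truncated excerpt (hypothetical choice).
        if (string.IsNullOrEmpty(excerpt))
        {
            excerpt = bodyText.Length > 200 ? bodyText.Substring(0, 200) + "..." : bodyText;
        }
        return excerpt;
    }
}
```

In a repeater, a method like this would be called once per bound result, with the same query string and searcher for every item.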
Thank you very much Ismail. I'm sorry, but I'm still struggling to wrap it up and am completely lost in searchers, indexes and queries.
What's Results in your code? Is it the IEnumerable you get from the search provider? In that case I get: 'System.Collections.Generic.IEnumerable<Examine.SearchResult>' does not contain a definition for 'LuceneQuery' and no extension method 'LuceneQuery' accepting a first argument of type 'System.Collections.Generic.IEnumerable<Examine.SearchResult>' could be found (are you missing a using directive or an assembly reference?)
How do you instantiate/get it?
Same question about SearchHelper.GetExternalSearcher());
I'm looking at another answer of yours here: http://our.umbraco.org/forum/developers/extending-umbraco/13571-Umbraco-Examine-Search-Results-Highlighting?p=0
Does it get the searcher in a similar way to this?
Thank you very much for your help!
Zakhar,
GetExternalSearcher looks like
After the search I have
That will get you the query
Finally got it sorted with your help, Ismail!
Thank you so much!
Zakhar.
Zakhar,
Awesome. Are you just highlighting on the one field or do you have combined fields in one?
Regards
Ismail
Hi Ismail, I highlight on one field currently.
I use two indexes, PDF and content. I managed to inject the PDF index into my content index as discussed here: http://our.umbraco.org/forum/developers/extending-umbraco/13522-Examine-pdf-index-item-inject-into-content-index?p=0
However, I would ideally like to gather node data into one field, as you describe in your article here: http://thecogworks.co.uk/blog/posts/2012/november/examiness-hints-and-tips-from-the-trenches-part-2/ but it doesn't work for some reason. The handler is called and the new data is added to the dictionary without errors, but when I list the search results my "contents" key is not there.
Regards,
Zakhar
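The merging step itself can be sketched in plain C#, independent of Examine. This is a minimal sketch, assuming the handler receives the node's fields as a string dictionary (the "dictionary" mentioned above); ContentsFieldBuilder and AddCombinedField are hypothetical names, and it does not address why the key is missing from the results:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContentsFieldBuilder
{
    // Joins all non-empty field values, except the target key itself,
    // into one space-separated string and stores it under that key.
    public static void AddCombinedField(IDictionary<string, string> fields, string key)
    {
        var values = fields
            .Where(f => f.Key != key && !string.IsNullOrWhiteSpace(f.Value))
            .Select(f => f.Value.Trim());

        fields[key] = string.Join(" ", values);
    }
}
```

Inside the indexing handler this would be called as ContentsFieldBuilder.AddCombinedField(fields, "contents"), where fields is whatever dictionary the handler exposes. Checking the index itself, rather than only the search results, shows whether the value was actually written.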
Zakhar,
Is "contents" in the index? Can you take a look using Examine Inspector? Also, what version of Umbraco are you using? You can make use of the cogmediaindexer package, and then you can do a multi-index search over both your content and media indexes. I think I wrote that post about injecting media into the content index because older versions of Examine did not support a multi searcher.
Regards
Ismail
Thank you Ismail, I will have a look at your package later. Do I understand correctly that it's based on another library, not Lucene?
Zakhar,
The extraction is based on Apache Tika (see https://tika.apache.org/), which is written in Java, and I have an IKVM bridge in my cogumbracomediaindexer. So I wrote my own indexer that gets the media file path and passes it to Tika. Tika knows what to do with it and can handle multiple formats; it does the extraction and I get back the content and any metadata.
There is also a PDF indexer in Examine that makes use of iTextSharp to do the PDF extraction (https://github.com/Shandem/Examine/tree/master/Projects/packages/Examine). The config setup for it can be found at http://our.umbraco.org/documentation/Reference/Searching/Examine/full-configuration
Regards
Ismail
Hi Ismail,
I'm using your sample code above.
I'm using your dll from https://dl.dropboxusercontent.com/u/21109333/Lucene.Net.Contrib.Highlighter.dll
But your code above does not reference this dll; it uses Lucene.Net.Highlight.
I added your dll to the 'References' but there are many errors.
There is an error on the line using Lucene.Net.Highlight; "Highlight does not exist.."
And I could not add "using Lucene.Net.Contrib.Highlighter"; there is nothing named Contrib in Lucene.Net.
And because of the dll problem, I have many 'could not be found' errors.
Question,
Take a look at http://our.umbraco.org/projects/website-utilities/lucene-search-result-highlithing as this is more up to date.
Regards
Ismail
Ismail,
Darren Ferguson's entry is dated 10 March 2011. Maybe it's more up to date, but it is not working with version 3.0.3.0.
I'm getting a configuration error:
Parser Error Message: Field not found: 'Lucene.Net.Util.Version.LUCENE_29'.
Source Error:
Line 10: <ExamineIndexProviders>
Line 11: <providers>
Line 12: <add name="InternalIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
Line 13: supportUnpublished="true"
Line 14: supportProtected="true"
Source File: ..\config\ExamineSettings.config Line: 12
Version Information: Microsoft .NET Framework Version:4.0.30319; ASP.NET Version:4.0.30319.34212
You are using Lucene 3.0.3.0? I thought Examine shipped with Lucene 2.9.2?
Regards
Ismail
So after a bunch of wrangling I was able to get the original user control plus Ismail's helper code to compile with no errors and just one warning. However when I actually put it in Umbraco in a user control macro and try to render it all I get is "Error loading Partial View script (file: ~/Views/MacroPartials/search.cshtml)". Is there any way to debug this? I already tried "umbDebugShowTrace=true" but that doesn't show anything.
How do I figure out what's wrong with my user control?
Hi,
How can we achieve this in version 8? My code looks like
Perform Search
I can't find a way to highlight using Lucene Contrib library.
Anyone please guide me.