Examine search per language
Hello,
Currently I've got a simple Examine implementation, but when I search I also get results from another language. Is there a way to search in a specific language only? The languages are different nodes, so maybe I can set a start node somewhere?
Jeroen
Hi Jeroen,
How about handling the GatheringNodeData event and adding an extra field that specifies the language of the node being indexed?
HTH,
Hendy
That's a nice solution! I can then filter the results I get back by language. What would be the best way to store the language as an additional field? Get the home node from the current node and store its name, or something like that? Any other suggestions?
Jeroen
If each language has its own home node, then that sounds ideal: just store the ID of the language-specific home node with every node (it's guaranteed to be unique, unlike a node name).
Hendy
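A minimal sketch of the approach described above, not a definitive implementation. It assumes the language home nodes sit directly under the content root, so the home node ID is the second entry in a node's `path` attribute. The provider names `ExternalIndexer` and `ExternalSearcher`, the field name `languageRootId`, and the searched fields `nodeName` and `bodyText` are assumptions; substitute whatever your own configuration uses.

```csharp
using Examine;

public static class LanguageFieldIndexing
{
    // Hypothetical field name; any name not already used in the index will do.
    public const string LanguageRootField = "languageRootId";

    // Call once at application startup.
    // "ExternalIndexer" is an assumed provider name from ExamineSettings.config.
    public static void Register()
    {
        ExamineManager.Instance.IndexProviderCollection["ExternalIndexer"]
            .GatheringNodeData += OnGatheringNodeData;
    }

    // Umbraco paths look like "-1,1050,1062": -1 is the content root and the
    // next ID is the top-level node (assumed here to be the language home node).
    public static string GetRootIdFromPath(string path)
    {
        if (string.IsNullOrEmpty(path)) return null;
        var ids = path.Split(',');
        return ids.Length > 1 ? ids[1] : null;
    }

    // Adds the home node ID as an extra field on every node being indexed.
    private static void OnGatheringNodeData(object sender, IndexingNodeDataEventArgs e)
    {
        var pathAttribute = e.Node.Attribute("path");
        var rootId = GetRootIdFromPath(pathAttribute == null ? null : pathAttribute.Value);
        if (rootId != null)
        {
            e.Fields[LanguageRootField] = rootId;
        }
    }

    // Restricts a search to one language by requiring a match on the extra field.
    // "ExternalSearcher", "nodeName" and "bodyText" are assumed names.
    public static ISearchResults Search(string term, string languageRootId)
    {
        var searcher = ExamineManager.Instance.SearchProviderCollection["ExternalSearcher"];
        var query = searcher.CreateSearchCriteria()
            .Field(LanguageRootField, languageRootId)
            .And().GroupedOr(new[] { "nodeName", "bodyText" }, term)
            .Compile();
        return searcher.Search(query);
    }
}
```

Existing content needs to be re-indexed before the new field shows up in search results.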
Hi Jeroen,
When you have a site with multiple languages, I think it would be a better idea to use multiple indexes. It sounds like you already have a setup in Umbraco where the root nodes define the different languages, so you can simply add the IndexParentId attribute, set to the ID of your root node, to each index set in your ExamineIndex.config (one index set per site/language).
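As a rough illustration, an ExamineIndex.config with one index set per language could look like the fragment below. The set names, index paths, and the node IDs 1050 and 2050 are made up for the example; use the IDs of your own language root nodes.

```xml
<ExamineLuceneIndexSets>
  <!-- Only nodes below node 1050 (hypothetical English root) are indexed here -->
  <IndexSet SetName="EnglishIndexSet"
            IndexPath="~/App_Data/TEMP/ExamineIndexes/English/"
            IndexParentId="1050">
    <IndexAttributeFields>
      <add Name="id" />
      <add Name="nodeName" />
    </IndexAttributeFields>
    <IndexUserFields />
    <IncludeNodeTypes />
    <ExcludeNodeTypes />
  </IndexSet>

  <!-- Only nodes below node 2050 (hypothetical Danish root) are indexed here -->
  <IndexSet SetName="DanishIndexSet"
            IndexPath="~/App_Data/TEMP/ExamineIndexes/Danish/"
            IndexParentId="2050">
    <IndexAttributeFields>
      <add Name="id" />
      <add Name="nodeName" />
    </IndexAttributeFields>
    <IndexUserFields />
    <IncludeNodeTypes />
    <ExcludeNodeTypes />
  </IndexSet>
</ExamineLuceneIndexSets>
```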
This also gives you the possibility to use a Lucene analyzer that fits each language.
http://www.aaron-powell.com/lucene-analyzer
- Morten
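For reference, the analyzer is set per provider in ExamineSettings.config through the `analyzer` attribute. The fragment below is only a sketch: the provider and index set names are hypothetical, the provider type names can differ between Examine versions, and it uses the StandardAnalyzer that ships with Lucene.NET as a placeholder for whichever language-specific analyzer you end up with.

```xml
<Examine>
  <ExamineIndexProviders>
    <providers>
      <!-- "EnglishIndexer" and "EnglishIndexSet" are hypothetical names -->
      <add name="EnglishIndexer"
           type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
           indexSet="EnglishIndexSet"
           analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" />
    </providers>
  </ExamineIndexProviders>
  <ExamineSearchProviders defaultProvider="EnglishSearcher">
    <providers>
      <!-- Use the same analyzer for searching as for indexing -->
      <add name="EnglishSearcher"
           type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
           indexSet="EnglishIndexSet"
           analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" />
    </providers>
  </ExamineSearchProviders>
</Examine>
```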
Hi Morten,
That seems like a better solution. So I can set the parentId on the index and in my code I use a different index based on the current language. A simple switch case to determine which index should be used seems sufficient. Thanks!
Jeroen
Yes, exactly. I've done it for a multi-site solution and it works quite well. With a collection of indexes, we used naming conventions to look up the right search index, as we use the same control for searching/results for all sites. Something like IndexSiteName ... well, you get the idea ;)
- Morten
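A naming-convention lookup along these lines could be sketched as follows. The convention itself (site name plus a `Searcher` suffix, e.g. `EnglishSearcher`) is an assumption made for the example, not the one Morten used; the only requirement is that the resulting name matches a search provider registered in ExamineSettings.config.

```csharp
using System;
using Examine;
using Examine.Providers;

public static class SiteSearchers
{
    // Hypothetical convention: site "English" maps to searcher "EnglishSearcher".
    public static string GetSearcherName(string siteName)
    {
        if (string.IsNullOrWhiteSpace(siteName))
        {
            throw new ArgumentException("A site name is required.", "siteName");
        }
        return siteName.Trim() + "Searcher";
    }

    // Resolves the search provider for a site, failing loudly when the
    // convention does not match any configured provider.
    public static BaseSearchProvider GetSearcher(string siteName)
    {
        var name = GetSearcherName(siteName);
        var searcher = ExamineManager.Instance.SearchProviderCollection[name];
        if (searcher == null)
        {
            throw new InvalidOperationException(
                "No Examine search provider named '" + name + "' is configured.");
        }
        return searcher;
    }
}
```

A shared search control can then call `SiteSearchers.GetSearcher(currentSiteName)` instead of a switch statement, so adding a site or language only requires new config entries.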
@Morten, do you have an example of where you've used a different indexer for one of the indexes? I'm looking into this for a project I'm working on, but I'm not sure where I'd find different language analyzers!
Hi Tim,
No, I'm afraid I have only ever used the Standard and Keyword analyzers that come with Lucene.NET.
The few language-specific analyzers I have come across have only been available for the Java version, so I haven't looked further into it, and the standard analyzers have worked fine for my needs (usually Danish and sometimes English).
Maybe these posts could be of some use/inspiration:
http://our.umbraco.org/forum/developers/extending-umbraco/16396-Examine-and-accents-%28for-portuguese-language%29
http://wiki.apache.org/jakarta-lucene/LuceneFAQ#How_do_I_write_my_own_Analyzer.3F
http://read.pudn.com/downloads52/sourcecode/windows/csharp/178147/Lucene.Net/Analysis/RU/RussianAnalyzer.cs__.htm (example of a Russian analyzer)
- Morten