Using Examine in a multiple multilingual site setup

Sébastien Richer 194 posts 430 karma points

Nov 13, 2012 @ 23:22

I have this setup:

SiteA
- FR
- EN

SiteB
- FR
- EN

I'm trying to find documentation on how to define the Examine indexes so that I have 4 separate indexes in this scenario. That way my Razor search page could select the appropriate index to run the search query against.

Has anyone done that before?

Thanks a bunch!

Sébastien Richer

Stephen 767 posts 2273 karma points c-trib

Nov 14, 2012 @ 15:27

Yes. Multiple sites, multiple languages.

We've created one index per site/language, so 4 indexes in your case. Each index uses the proper Lucene analyzer (i.e. the French analyzer for the French section). On the (Razor) search page we pick the right index depending on the site/language. Works quite well.

For another setup we're currently trying to set up only one index per language and index everything in each index, then tweak the query to fetch only documents under the proper root node. It gives bigger indexes _but_ if you start having many sites, you don't have to create new indexes each time, only when a new language comes up.

Stephan

Sébastien Richer 194 posts 430 karma points

Nov 14, 2012 @ 15:35

Hi Stephen,

Great news! (I was hooking into the back-end API node events to specify the language, hehe.) So how can I "split" these indexes? Here is what I currently have.

In my ExamineSettings.config, for examine index providers:

      <add name="HomeNodeIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
           supportUnpublished="false"
           supportProtected="true"
           interval="10"
           analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net"/>

      <add name="FRNodeIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
           supportUnpublished="false"
           supportProtected="true"
           interval="10"
           analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net"/>

      <add name="ENNodeIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
           supportUnpublished="false"
           supportProtected="true"
           interval="10"
           analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net"/>

Then in search providers:

      <add name="HomeNodeSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
           analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net"/>

      <add name="FRNodeSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
           analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net"/>

      <add name="ENNodeSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
           analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net"/>

In my ExamineIndex.config:

  <!-- Home node index -->
  <IndexSet SetName="HomeNodeIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/HomeNode/">
    <IndexAttributeFields>
      <add Name="id" />
      <add Name="nodeName" />
      <add Name="updateDate" />
      <add Name="writerName" />
      <add Name="path" />
      <add Name="nodeTypeAlias" />
      <add Name="parentID" />
    </IndexAttributeFields>
    <ExcludeNodeTypes>
      <add Name="OffresDEmplois"/>
      <add Name="Recherche"/>
      <add Name="CandidatureSpontanee"/>
    </ExcludeNodeTypes>
  </IndexSet>

  <!-- FR node index -->
  <IndexSet SetName="FRNodeIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/FRNode/">
    <IndexAttributeFields>
      <add Name="id" />
      <add Name="nodeName" />
      <add Name="updateDate" />
      <add Name="writerName" />
      <add Name="path" />
      <add Name="nodeTypeAlias" />
      <add Name="parentID" />
    </IndexAttributeFields>
    <ExcludeNodeTypes>
      <add Name="OffresDEmplois"/>
      <add Name="Recherche"/>
      <add Name="CandidatureSpontanee"/>
    </ExcludeNodeTypes>
  </IndexSet>

  <!-- EN node index -->
  <IndexSet SetName="ENNodeIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/ENNode/">
    <IndexAttributeFields>
      <add Name="id" />
      <add Name="nodeName" />
      <add Name="updateDate" />
      <add Name="writerName" />
      <add Name="path" />
      <add Name="nodeTypeAlias" />
      <add Name="parentID" />
    </IndexAttributeFields>
    <ExcludeNodeTypes>
      <add Name="OffresDEmplois"/>
      <add Name="Recherche"/>
      <add Name="CandidatureSpontanee"/>
    </ExcludeNodeTypes>
  </IndexSet>

And then in my search macroscript:

        // Read the search term from the query string.
        var s = Request.QueryString["s"];
        if (!string.IsNullOrEmpty(s))
        {
            var searcher = ExamineManager.Instance.SearchProviderCollection["HomeNodeSearcher"];
            var criteria = searcher.CreateSearchCriteria(BooleanOperation.Or);
            // Match the term in any of these fields, excluding nodes hidden from navigation.
            var filter = criteria.GroupedOr(new string[] {"nodeName", "texte", "title", "nodeTypeAlias"}, s).Not().Field("umbracoNaviHide", "1").Compile();
            var results = searcher.Search(filter);
        }

Now I can easily identify my current language there and use another searcher. My problem is that I don't know how to make that searcher use an index that starts not from the site root but from my specified language node. How did you specify that?

Thanks a lot!

Sébastien

Stephen 767 posts 2273 karma points c-trib

Nov 14, 2012 @ 15:40

We have...

<IndexSet SetName="IndexGermanyDE" IndexPath="~/App_Data/ExamineIndexes/SearchGermanyDE" IndexParentId="10805">

Where IndexParentId is the identifier of the root node. It will index that node and everything below it, but not the other nodes.

Stephan
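
For the SiteA/SiteB layout from the first post, the same idea could be sketched as four index sets, one per site/language root. This is only a sketch under assumptions: the set names, the index paths and the IndexParentId values (1101, 1102, 2101, 2102) are hypothetical placeholders, to be replaced with the ids of the actual language root nodes. The field and exclusion lists are omitted for brevity.

```xml
<!-- Sketch only: set names, paths and IndexParentId values are hypothetical placeholders. -->

<!-- SiteA -->
<IndexSet SetName="SiteAFRIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/SiteAFR/" IndexParentId="1101" />
<IndexSet SetName="SiteAENIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/SiteAEN/" IndexParentId="1102" />

<!-- SiteB -->
<IndexSet SetName="SiteBFRIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/SiteBFR/" IndexParentId="2101" />
<IndexSet SetName="SiteBENIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/SiteBEN/" IndexParentId="2102" />
```

Each set would presumably be paired with an indexer and a searcher named after it, following the naming pattern the configuration earlier in this thread appears to rely on (FRNodeIndexer / FRNodeSearcher / FRNodeIndexSet), and the indexes would need rebuilding after the change.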

Sébastien Richer 194 posts 430 karma points

Nov 14, 2012 @ 16:27

Ohh silly me, I could have tried it. I thought IndexParentId would be very specific and index only children (which makes no sense). So its behavior is actually like an "IndexAncestorId": it indexes descendants from that point. OK, I'll try it right away!

Thanks a bunch Stephen!

Sébastien Richer 194 posts 430 karma points

Nov 14, 2012 @ 16:57

Ok so it seems I can't find my descendants with this setup. Only the direct children of my specified IndexParentId. :(

In your setup, Stephen, do you find nodes that are descendants?

Thanks for the help

            var currentLanguage = CurrentModel.AncestorOrSelf("AccueilLangage").Name.ToUpperInvariant();
            var searcherName = currentLanguage + "NodeSearcher";

            var searcher = ExamineManager.Instance.SearchProviderCollection[searcherName];
            var criteria = searcher.CreateSearchCriteria(BooleanOperation.Or);
            var filter = criteria.GroupedOr(new string[] {"nodeName", "texte", "title", "nodeTypeAlias"}, s).Not().Field("umbracoNaviHide", "1").Compile();
            var results = searcher.Search(filter);

Stephen 767 posts 2273 karma points c-trib

Nov 14, 2012 @ 17:00

Well, yes. Maybe you need to rebuild the indexes? Can you also try running Luke to check what you actually have in the indexes?

Sébastien Richer 194 posts 430 karma points

Nov 14, 2012 @ 17:05

Yes, OK, I'm getting results! I just need to tweak for French accents and other things like ignoring case. What analyser did you say you use for French?

Stephen 767 posts 2273 karma points c-trib

Nov 14, 2012 @ 17:09

Depends. If you're willing to rebuild Examine with Lucene.Net 2.9.4g (latest) then I *think* there's an out-of-the-box French analyzer.

If you want to stick with stock Examine then you'll have to create your own French analyser, I'm afraid. We have one, though, that I believe I could share. It does accents, stemming, stopwords, stuff like that.

Sébastien Richer 194 posts 430 karma points

Nov 14, 2012 @ 19:34

OK, well it seems that using Lucene.Net.Analysis.Standard.StandardAnalyzer solves all my problems! Alright, I think I'm all set here! Thanks again Stephen!
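
The final configuration is not shown in the thread, so the following is only a guess at what the change might look like for the FR providers posted earlier: the same entries with the analyzer attribute swapped from WhitespaceAnalyzer to StandardAnalyzer. The other language providers would presumably change the same way.

```xml
<!-- ExamineSettings.config, index providers (sketch: only the analyzer attribute differs from the earlier post) -->
<add name="FRNodeIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
     supportUnpublished="false"
     supportProtected="true"
     interval="10"
     analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"/>

<!-- ExamineSettings.config, search providers: use the same analyzer as the indexer -->
<add name="FRNodeSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
     analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"/>
```

StandardAnalyzer lowercases terms, which covers the ignore-case need. Whether it handles accents as required is worth checking against the index in Luke after a rebuild, since the indexes have to be rebuilt for an analyzer change to take effect.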
