Hi, I'm having an issue with the Lucene index. For example, suppose I have a doctype called customer, which in turn has a property called name. When I create a document and enter the name property as "John Doe", it is stored in the index as "john doe". How do I retain the correct casing?
You have to store both a case-sensitive and a case-insensitive version of the data, as Lucene isn't really designed for data retrieval.
To do this with Examine you have to attach to the UmbracoExamine.LuceneExamineIndexer.DocumentWriting event (which may have moved into the LuceneEngine with the latest check ins, I'm not 100% sure).
This event is fired in a Lucene scope and provides you with access to the Lucene Document object as it's being written to. That is where you'll need to add your un-analyzed version of the content.
Here's an example of how we did it in a recent project for showing in search results:
void indexer_DocumentWriting(object sender, DocumentWritingEventArgs e)
{
    var doc = e.Document;

    // Find the title: fall back to nodeName when PageTitle is missing or empty
    string title = !e.Fields.ContainsKey("PageTitle") || string.IsNullOrEmpty(e.Fields["PageTitle"])
        ? e.Fields["nodeName"]
        : e.Fields["PageTitle"];

    // Default content is nothing:
    string content = string.Empty;
    // Unless a description is found:
    if (e.Fields.ContainsKey("Description") && !string.IsNullOrEmpty(e.Fields["Description"]))
    {
        content = e.Fields["Description"];
    }
    // Or BodyContent is found:
    else if (e.Fields.ContainsKey("BodyContent") && !string.IsNullOrEmpty(e.Fields["BodyContent"]))
    {
        content = e.Fields["BodyContent"];
    }

    // Store the title and content with text casing unchanged
    doc.Add(new Field("__PageTitle", title, Field.Store.YES, Field.Index.NOT_ANALYZED));
    doc.Add(new Field("__Content", content, Field.Store.YES, Field.Index.NOT_ANALYZED));
}
When we display the search results, we show the __PageTitle and __Content fields, not the 'real' fields.
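For the display side, a rough sketch of reading the stored field back might look like the following. It assumes the Examine SearchResult type exposes a Fields dictionary of string values and that the stored __PageTitle field comes back in it; the SearchDisplay class and GetDisplayTitle method are names made up for illustration, so check them against the Examine build you're running.

```csharp
using Examine;

public static class SearchDisplay
{
    // Hypothetical helper: prefer the un-analyzed "__PageTitle" field
    // (original casing) and fall back to "nodeName" when it is absent.
    public static string GetDisplayTitle(SearchResult result)
    {
        string title;
        if (result.Fields.TryGetValue("__PageTitle", out title)
            && !string.IsNullOrEmpty(title))
        {
            return title;
        }

        // Fallback: this value may have been lower-cased by the analyzer.
        string nodeName;
        return result.Fields.TryGetValue("nodeName", out nodeName)
            ? nodeName
            : string.Empty;
    }
}
```

The same pattern would apply to __Content for the result snippet.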
That's cool, exactly what I was looking for :) One stupid question: how / where do I attach to the event from within my code so that it executes when the index runs?
Lucene Index and case
The answer above was originally posted here: http://our.umbraco.org/forum/developers/extending-umbraco/10999-Examine-Questions?p=0#comment42951
You can use ApplicationBase, as you would for any Umbraco event, or you can use an HttpModule and wire it up early in the life cycle.
I'd go with ApplicationBase personally.
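A minimal sketch of the ApplicationBase route, under a few assumptions: the indexer type is UmbracoExamine.LuceneExamineIndexer as named earlier in the thread (it may live elsewhere in newer check-ins), "ExternalIndexer" is a placeholder for whatever index name is in your Examine config, and ExamineEvents is just an illustrative class name. The namespace for DocumentWritingEventArgs may also differ depending on your Examine version.

```csharp
using Examine;
using UmbracoExamine;
using umbraco.BusinessLogic;

// Classes deriving from ApplicationBase are instantiated when the
// application starts, so the constructor is a convenient place to
// wire up event handlers.
public class ExamineEvents : ApplicationBase
{
    public ExamineEvents()
    {
        // "ExternalIndexer" is a placeholder: use the name from your config.
        var indexer = ExamineManager.Instance
            .IndexProviderCollection["ExternalIndexer"] as LuceneExamineIndexer;

        // The cast yields null if the provider is of a different type.
        if (indexer != null)
        {
            indexer.DocumentWriting += indexer_DocumentWriting;
        }
    }

    void indexer_DocumentWriting(object sender, DocumentWritingEventArgs e)
    {
        // Add the un-analyzed fields to e.Document here.
    }
}
```

Once this class is compiled into the site, the handler should run each time the indexer writes a document, so no extra call is needed from your own code.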