

Sam 47 posts 153 karma points

Jan 07, 2013 @ 22:32


Change how Umbraco/Examine Stores Rich Content Fields?

I have Examine up and running now.

It was probably working before, but I didn't realize that the WhitespaceAnalyzer, which is on by default, is case-sensitive. I switched to StandardAnalyzer and it almost works the way I want.

The problem is that when rich text (WYSIWYG) content is saved, all the tags are stripped out but are not replaced with whitespace. As a result, the words immediately before and after each tag are run together in the index, which makes those words unsearchable.

Is there any way to change Umbraco so it inserts a space for every tag it strips out?

Also, is there a way to override the stop words?


Nathan Woulfe 447 posts 1665 karma points MVP 5x hq c-trib

Jan 07, 2013 @ 23:49

Hi Sam

Not sure about replacing stripped tags with whitespace, but it is possible to force Examine to index the RTE content with the tags in place, if that helps. The code below adds a field to the index containing the rich content from the bodyText field:

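using System;
using System.Xml.Linq;
using Examine;
using Examine.LuceneEngine.Providers;
using umbraco.BusinessLogic;

namespace EventHandlers
{
    // Classes deriving from ApplicationBase are instantiated by Umbraco at startup.
    public class EventHandlers : ApplicationBase
    {
        public EventHandlers()
        {
            // Replace "YourIndexName" with the name of your indexer.
            var indexer = (LuceneIndexer)ExamineManager.Instance.IndexProviderCollection["YourIndexName"];
            indexer.GatheringNodeData += new EventHandler<IndexingNodeDataEventArgs>(indexer_GatheringNodeData);
        }

        // Runs for each node while Examine gathers the data to index.
        void indexer_GatheringNodeData(object sender, IndexingNodeDataEventArgs e)
        {
            XElement node = e.Node;
            XElement elementBodyText = node.Element("bodyText");
            if (elementBodyText != null)
            {
                // Add the raw bodyText value (tags included) as an extra index field.
                e.Fields.Add("BodyTextWithTags", elementBodyText.Value);
            }
        }
    }
}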
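
For the original question (a space for every stripped tag), the same handler could be taken one step further: rather than storing the raw markup, replace each tag with a space before adding the field. The helper below is only a rough sketch under a few assumptions. The class name HtmlTextHelper and the method ReplaceTagsWithSpaces are made up for illustration and are not part of Umbraco or Examine, and the regular expression is a simple pattern that should be fine for typical RTE output but is not a full HTML parser.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Umbraco or Examine.
public static class HtmlTextHelper
{
    // Matches anything that looks like a tag, e.g. <p>, </p>, <br />.
    private static readonly Regex TagPattern = new Regex("<[^>]+>");

    // Matches runs of one or more whitespace characters.
    private static readonly Regex WhitespacePattern = new Regex(@"\s+");

    public static string ReplaceTagsWithSpaces(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return string.Empty;
        }

        // Swap every tag for a single space so neighbouring words stay apart.
        string withoutTags = TagPattern.Replace(html, " ");

        // Collapse the extra whitespace that the replacement leaves behind.
        return WhitespacePattern.Replace(withoutTags, " ").Trim();
    }
}
```

Inside indexer_GatheringNodeData, the call would then look something like e.Fields.Add("BodyTextSpaced", HtmlTextHelper.ReplaceTagsWithSpaces(elementBodyText.Value)); where BodyTextSpaced is again just an example field name. The search would need to include that field for it to have any effect.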


Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib

Jan 08, 2013 @ 10:06


Sam,

With regard to stop words, see http://our.umbraco.org/forum/developers/extending-umbraco/25600-Examine-case-insensitive-keyword-search

Regards

Ismail
