Rebuild Index and index content
Ismail - is there any way to force an index of all media items, or is the index method fired only when a page that contains a media file is published? Also, what field is the indexed content from the media file stored in? I would like to combine the media index with the public webpage index into a single result set. Is this possible? Do you have any search result page samples? Many thanks in advance!
John,
There are two projects on Our Umbraco that can help you rebuild any index completely: my Mickey Mouse project http://our.umbraco.org/projects/backoffice-extensions/examine-index-admin and the more advanced http://our.umbraco.org/projects/developer-tools/examine-dashboard. The second one may not work with the latest version of Examine, but it is worth trying. If you follow the errors you will see that it breaks because of functionality that has been deprecated in Examine, so if you comment out the offending lines in the ascx it will run.
The field name, if I remember rightly, is FileTextContent, but just to make sure, download Luke and take a look inside your index: http://code.google.com/p/luke/
To combine search results from any number of indexes you need to use code like the following (PS: very old versions of Examine do not support cross-index searching; I'm not sure when it first appeared).
// Usage: search two indexes at once.
MultiIndexSearcher searcher = WebHelpers.GetMultiSearcher(new[] {"directoryIndexer", "ATGIndexer"});
var criteria = searcher.CreateSearchCriteria();
var query = criteria.NodeTypeAlias(Constants.DirectoryItemAlias);
query = query.Not().Field("umbracoNaviHide", 1.ToString());
query = query.And().OrderBy("nodeName");
var results = searcher.Search(query.Compile());

// Helper (in WebHelpers). Namespaces needed: System.Collections.Generic, System.IO,
// Examine, Examine.LuceneEngine.Providers, Lucene.Net.Analysis.Standard.
public static MultiIndexSearcher GetMultiSearcher(string[] indexes)
{
    var directories = new List<DirectoryInfo>();
    foreach (var index in indexes)
    {
        var indexer = ExamineManager.Instance.IndexProviderCollection[index];
        // Strip the trailing \Index segment so the path points at the index set folder.
        var dir = new DirectoryInfo(((LuceneIndexer)indexer).LuceneIndexFolder.FullName.Replace("\\Index", ""));
        directories.Add(dir);
    }
    var i = new MultiIndexSearcher(directories, new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29));
    return i;
}
You will obviously have to update the code to meet your needs, but I have a site where I have used the above code to search across an Umbraco index and a database index.
Regards
Ismail
Ismail - thanks so much for the feedback! I've gotten everything set up and am able to use Luke to search the media index, but when I try searching the media index via web search I'm not getting any results back for the same query I used in Luke. No matter what I search, no results are found. My content search index works fine. Below is a sample of my code...
Searcher = ExamineManager.Instance.SearchProviderCollection["MediaSearcher"];
var searchCriteria = Searcher.CreateSearchCriteria();
Examine.SearchCriteria.ISearchCriteria query = searchCriteria.Field("FileTextContent", tbx_searchTerm.Text).Or().Field("NodeName", tbx_searchTerm.Text).Compile();
Results = (SearchResults)Searcher.Search(query);
dg_searchresults.VirtualItemCount = Results.TotalItemCount;
dg_searchresults.PageSize = Convert.ToInt32(ddl_recordsPerPage.SelectedValue);
dg_searchresults.CurrentPageIndex = pageIndex;
dg_searchresults.DataSource = Results.Skip(pageIndex * dg_searchresults.PageSize);
dg_searchresults.DataBind();
John,
After the Searcher.Search call, run searchCriteria.ToString() and write out the result; it will give you the generated query. Then run that query using Luke. Also, can you paste the query here? Can you also paste the query you ran in Luke?
Regards
Ismail
Thank you Ismail!
Below is the searchCriteria.ToString() data which returns 0 records:
searchCriteria={ SearchIndexType: , LuceneQuery: +FileTextContent:law NodeName:law }
whenever I do a straight paste into Luke, I get the following error:
when I only submit the query portion, I get the following results:
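Only the part after "LuceneQuery:" in the ToString() output is an actual Lucene query, which is presumably why pasting the whole string into Luke fails. A small helper along these lines could pull out just that part before writing it to the page or a log. This is a sketch that assumes the output format shown above (it may differ between Examine versions); QueryDebug and ExtractLuceneQuery are hypothetical names, not part of Examine.

```csharp
using System;

public static class QueryDebug
{
    // Returns the text between "LuceneQuery:" and the final closing brace,
    // or null when the input does not contain the marker.
    public static string ExtractLuceneQuery(string criteriaText)
    {
        const string marker = "LuceneQuery:";
        if (criteriaText == null) return null;

        int start = criteriaText.IndexOf(marker, StringComparison.Ordinal);
        if (start < 0) return null;

        string query = criteriaText.Substring(start + marker.Length);

        // Drop the wrapper's closing brace; braces inside the query itself are kept.
        int end = query.LastIndexOf('}');
        if (end >= 0) query = query.Substring(0, end);

        return query.Trim();
    }
}
```

Called as QueryDebug.ExtractLuceneQuery(searchCriteria.ToString()), this gives a string that can be pasted straight into Luke's search tab.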
The second paste is the correct one. Run the query again, but this time change the analyser in the dropdown to standard. You currently have it set to whitespace.
Ismail - that found records as well. snapshot below:
In Luke, can you pick an item that has a match, double-click it, and then paste a screenshot?
Regards
Ismail
Ismail - is this what you need?
Does it work when you only search on node name? In the code, that is, not in Luke. I am beginning to suspect that the \r\n may have something to do with it.
Ismail - searching only on the node name worked. I'll see if I can create a custom filtered field with these characters removed and then search on that. Thank you so much for your help, Ismail. Hopefully I will have a workable solution before much longer.
John,
Just implement the GatheringNodeData event and, for the FileTextContent field, strip out \n and \r and put the content back into the field. See:
public class ExamineEvents : ApplicationBase
{
    public ExamineEvents()
    {
        ExamineManager.Instance.IndexProviderCollection["MyIndex"].GatheringNodeData += MediaIndex_GatheringNodeData;
    }

    void MediaIndex_GatheringNodeData(object sender, IndexingNodeDataEventArgs e)
    {
        // Replace line breaks with spaces and write the cleaned text back to the field.
        var filecontents = e.Fields["FileTextContent"];
        e.Fields["FileTextContent"] = filecontents.Replace("\n", " ").Replace("\r", " ");
    }
}
Regards
Ismail
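The clean-up step itself can be tried outside Umbraco. Below is a minimal sketch, assuming the event's Fields collection behaves like a Dictionary<string, string>; FieldCleaner, StripLineBreaks and CleanField are hypothetical helper names. It skips nodes that have no such field instead of throwing, which may matter for media items without extracted text.

```csharp
using System;
using System.Collections.Generic;

public static class FieldCleaner
{
    // Replaces carriage returns and line feeds with spaces.
    // Returns an empty string for null or empty input.
    public static string StripLineBreaks(string text)
    {
        if (string.IsNullOrEmpty(text)) return string.Empty;
        return text.Replace('\r', ' ').Replace('\n', ' ');
    }

    // Cleans one field in place. Returns false when the field is absent,
    // so the caller can decide whether that is worth logging.
    public static bool CleanField(IDictionary<string, string> fields, string key)
    {
        string value;
        if (!fields.TryGetValue(key, out value)) return false;
        fields[key] = StripLineBreaks(value);
        return true;
    }
}
```

Inside the handler above, the two lines that read and rewrite the field could then become FieldCleaner.CleanField(e.Fields, "FileTextContent").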
thx Ismail - had one in place already so I just modified it to also match up the field names on both indexes.
public class ExamineIndexHandler : umbraco.businesslogic.ApplicationStartupHandler
{
    public void GatheringContentNodeDataHandler(object sender, IndexingNodeDataEventArgs e)
    {
        //get rid of commas so field data is searchable
        e.Fields.Add("searchablePath", e.Fields["path"].Replace(',', ' '));
    }

    public void GatheringMediaNodeDataHandler(object sender, IndexingNodeDataEventArgs e)
    {
        //get rid of commas so field data is searchable
        e.Fields.Add("searchablePath", e.Fields["path"].Replace(',', ' '));
        e.Fields.Add("bodyText", e.Fields["FileTextContent"].Replace('\n', ' ').Replace('\r', ' '));
        e.Fields.Add("pageTitle", e.Fields["nodeName"]);
        e.Fields.Add("navTitle", e.Fields["nodeName"]);
    }

    public ExamineIndexHandler()
    {
        // code before base oninit
        var contentIndexer = ExamineManager.Instance.IndexProviderCollection["PublicIndexer"];
        var mediaIndexer = ExamineManager.Instance.IndexProviderCollection["MediaIndexer"];
        contentIndexer.GatheringNodeData += GatheringContentNodeDataHandler;
        mediaIndexer.GatheringNodeData += GatheringMediaNodeDataHandler;
    }
}
thank you again for all your help Ismail! It's very much appreciated! I'm rebuilding the index (we have over 13k media files) now and hope to have everything working in the next couple days...
Ismail - hopefully the last thing I'll need help with. I need to re-index the media index but when using the tools you recommend, it says the index has been added to the queue but it does not appear to start indexing... is there a way I can delete the actual index files and then force the index to start automatically? again, thank you for all your help!
John,
Use Darren Ferguson's index manager, see http://our.umbraco.org/projects/developer-tools/examine-dashboard; it's a better package. Just be aware that if you're using the latest Examine the dashboard control may throw errors. Follow the line numbers, comment out the offending lines in the usercontrol, and it will work.
Regards
Ismail
Ismail - below is the error I'm getting...
Could not load control: '/usercontrols/IndexStatus.ascx'.
Error message: System.Web.HttpCompileException (0x80004005): c:\inetpub\wwwroot_sandbox\usercontrols\IndexStatus.ascx(352): error CS1061: 'Examine.Providers.BaseIndexProvider' does not contain a definition for 'SupportUnpublishedContent' and no extension method 'SupportUnpublishedContent' accepting a first argument of type 'Examine.Providers.BaseIndexProvider' could be found (are you missing a using directive or an assembly reference?) at System.Web.Compilation.BuildManager.PostProcessFoundBuildResult(BuildResult result, Boolean keyFromVPP, VirtualPath virtualPath) at System.Web.Compilation.BuildManager.GetBuildResultFromCacheInternal(String cacheKey, Boolean keyFromVPP, VirtualPath virtualPath, Int64 hashCode, Boolean ensureIsUpToDate) at System.Web.Compilation.BuildManager.GetVPathBuildResultFromCacheInternal(VirtualPath virtualPath, Boolean ensureIsUpToDate) at System.Web.Compilation.BuildManager.GetVPathBuildResultInternal(VirtualPath virtualPath, Boolean noBuild, Boolean allowCrossApp, Boolean allowBuildInPrecompile, Boolean throwIfNotFound, Boolean ensureIsUpToDate) at System.Web.Compilation.BuildManager.GetVPathBuildResultWithNoAssert(HttpContext context, VirtualPath virtualPath, Boolean noBuild, Boolean allowCrossApp, Boolean allowBuildInPrecompile, Boolean throwIfNotFound, Boolean ensureIsUpToDate) at System.Web.Compilation.BuildManager.GetVPathBuildResult(HttpContext context, VirtualPath virtualPath, Boolean noBuild, Boolean allowCrossApp, Boolean allowBuildInPrecompile, Boolean ensureIsUpToDate) at System.Web.UI.TemplateControl.LoadControl(VirtualPath virtualPath) at umbraco.cms.presentation.dashboard.OnInit(EventArgs e)
Ismail - I found the Examine Dashboard source online and am updating it to support Umbraco 4.9... digging through it now...
Just comment out the offending lines in the usercontrol and it should run minus some information which is not essential.
Ismail - I cannot get an index to build correctly... all I see is the following now. Any ideas? thx!
John,
Can you check the Umbraco log table? You should have some entries there that may explain why it's not working.
Regards
Ismail
Ismail - I'm not seeing the indexer (PublicMediaIndexer) even mentioned in the custom log, so I'm assuming it's not working (no errors reported)...
ExamineSettings.config:
<?xml version="1.0"?>
<!--
Umbraco examine is an extensible indexer and search engine.
This configuration file can be extended to add your own search/index providers.
Index sets can be defined in the ExamineIndex.config if you're using the standard provider model.
More information and documentation can be found on CodePlex: http://umbracoexamine.codeplex.com
-->
<Examine>
<ExamineIndexProviders>
<providers>
<add name="InternalIndexer" type="UmbracoExamine.LuceneExamineIndexer, UmbracoExamine" runAsync="true" supportUnpublished="true" supportProtected="true" interval="10" analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net" />
<add name="InternalMemberIndexer" type="UmbracoExamine.UmbracoMemberIndexer, UmbracoExamine" runAsync="true" supportUnpublished="false" supportProtected="true" interval="10" analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" />
<add name="PublicIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine" indexSet="PublicIndexSet" supportUnpublished="false" supportProtected="false" runAsync="true" interval="5" analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" enableDefaultEventHandler="true" />
<add name="PublicMediaIndexer" type="CogUmbracoExamineMediaIndexer.MediaIndexer, CogUmbracoExamineMediaIndexer" indexSet="PublicMediaIndexSet" supportUnpublished="false" supportProtected="false" runAsync="true" interval="5" extensions=".pdf,.doc,.docx,.xls,.ppt" umbracoFileProperty="umbracoFile" />
</providers>
</ExamineIndexProviders>
<ExamineSearchProviders defaultProvider="InternalSearcher">
<providers>
<add name="InternalSearcher" type="UmbracoExamine.LuceneExamineSearcher, UmbracoExamine" analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net" />
<add name="InternalMemberSearcher" type="UmbracoExamine.LuceneExamineSearcher, UmbracoExamine" analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net" enableLeadingWildcards="true" />
<add name="PublicSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine" indexSet="PublicIndexSet" analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net" enableLeadingWildcards="true" />
<add name="PublicMediaSearcher" type="UmbracoExamine.LuceneExamineSearcher, UmbracoExamine" indexSet="PublicMediaIndexSet" analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" />
</providers>
</ExamineSearchProviders>
</Examine>
ExamineIndex.config:
<?xml version="1.0"?>
<!--
Umbraco examine is an extensible indexer and search engine.
This configuration file can be extended to create your own index sets.
Index/Search providers can be defined in the UmbracoSettings.config
More information and documentation can be found on CodePlex: http://umbracoexamine.codeplex.com
-->
<ExamineLuceneIndexSets>
<!-- The internal index set used by Umbraco back-office - DO NOT REMOVE -->
<IndexSet SetName="InternalIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/Internal/">
<IndexAttributeFields>
<add Name="id" />
<add Name="nodeName" />
<add Name="updateDate" />
<add Name="writerName" />
<add Name="path" />
<add Name="nodeTypeAlias" />
<add Name="parentID" />
</IndexAttributeFields>
<IndexUserFields />
<IncludeNodeTypes />
<ExcludeNodeTypes />
</IndexSet>
<!-- The internal index set used by Umbraco back-office for indexing members - DO NOT REMOVE -->
<IndexSet SetName="InternalMemberIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/InternalMember/">
<IndexAttributeFields>
<add Name="id" />
<add Name="nodeName" />
<add Name="updateDate" />
<add Name="writerName" />
<add Name="loginName" />
<add Name="email" />
<add Name="nodeTypeAlias" />
</IndexAttributeFields>
<IndexUserFields />
<IncludeNodeTypes />
<ExcludeNodeTypes />
</IndexSet>
<IndexSet SetName="PublicIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/Public/">
<IndexAttributeFields>
<add Name="id" />
<add Name="nodeName" />
<add Name="updateDate" />
<add Name="writerName" />
<add Name="path" />
<add Name="nodeTypeAlias" />
<add Name="parentID" />
<add Name="urlName" />
</IndexAttributeFields>
<IndexUserFields>
<add Name="bodyText" />
<add Name="pageTitle" />
<add Name="navTitle" />
<add Name="programID" />
<add Name="bookID" />
</IndexUserFields>
<IncludeNodeTypes>
</IncludeNodeTypes>
<ExcludeNodeTypes>
<add Name="doc_cleBooks_item" />
<add Name="doc_cleBooks_referralContainer" />
<add Name="doc_cleBooks_referralContainer_item" />
<add Name="doc_clePrograms_item" />
<add Name="doc_cle_programs_location_container" />
<add Name="doc_cle_programs_location_container_item" />
<add Name="doc_cle_programs_override_container" />
<add Name="doc_cle_programs_override_container_item" />
<add Name="doc_clePrograms_referralContainer" />
<add Name="doc_clePrograms_referralContainer_item" />
<add Name="doc_clePrograms_video_pilot_container" />
<add Name="doc_clePrograms_video_pilot_container_item" />
<add Name="doc_clePrograms_video_pilot_container_item_videoSegment" />
<add Name="doc_featured_container" />
<add Name="doc_featured_container_item" />
<add Name="doc_featured_container_item_contentpicker" />
<add Name="doc_featured_container_item_mediapicker" />
<add Name="doc_featured_container_item_rss" />
<add Name="doc_featured_rss_container" />
<add Name="doc_featured_scroller_container" />
<add Name="doc_featured_scroller_container_item" />
<add Name="doc_featured_scroller_container_item_contentPicker" />
<add Name="doc_featured_scroller_container_item_mediaPicker" />
<add Name="doc_featured_video_container" />
<add Name="doc_featured_container" />
<add Name="doc_featured_container_item" />
<add Name="doc_jpe_directory_item" />
<add Name="doc_featured_container_item" />
<add Name="doc_govAffairs_billTrack_item" />
<add Name="doc_inMemorium_item" />
<add Name="doc_judicialDirectory_item" />
<add Name="doc_lawyerLegislators_item" />
<add Name="doc_testimonial_Item" />
<add Name="doc_user_comment_and_rating" />
<add Name="doc_user_comment" />
<add Name="User Comment Basic" />
<add Name="doc_webpage_category" />
</ExcludeNodeTypes>
</IndexSet>
<IndexSet SetName="PublicMediaIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/PublicMediaIndexSet/">
<IndexAttributeFields>
<add Name="id" />
<add Name="nodeName" />
<add Name="updateDate" />
<add Name="writerName" />
<add Name="path" />
<add Name="nodeTypeAlias" />
<add Name="parentID" />
</IndexAttributeFields>
<IncludeNodeTypes>
<add Name="File" />
</IncludeNodeTypes>
</IndexSet>
</ExamineLuceneIndexSets>
What happens when you add a new media item? Does that end up in the index?
No... it tries to access the index but nothing is added. The index also appears to be missing files (when comparing indexes in Luke)...
I've tried creating new indexes 3 or 4 times now and they all look like above.
what version of umbraco are you using?
Umbraco 4.9
can u jump on skype my id is ismail_mayat
unfortunately I believe my work blocks skype...
Do the other indexes build, or is it just this one that's not rebuilding?
yes. they build and are searchable.
Hmm, OK. Stop the app pool, delete the Index folder in PublicMediaIndexSet, then start the app pool. See if you get anything then. If that doesn't work, paste a screenshot of the log table again, but this time filtered for Error only.
Regards
Ismail
ok. nothing specific to Examine/Lucene in the error log. Also same empty folder structure created again.
Ismail - I've located the problem. It's the FileTextContent field. It appears to be crashing the index handler. Any idea how to deal with this field and store it in another field? I've tried trapping it in the handler to no avail...
public void GatheringMediaNodeDataHandler(object sender, IndexingNodeDataEventArgs e)
{
    //get rid of commas so field data is searchable
    e.Fields.Add("searchablePath", Convert.ToString(e.Fields["path"]).Replace(",", " "));
    //e.Fields.Add("bodyText", e.Fields["FileTextContent"].Replace("\n", " ").Replace("\r", " "));
    e.Fields.Add("pageTitle", Convert.ToString(e.Fields["nodeName"]));
    e.Fields.Add("navTitle", Convert.ToString(e.Fields["nodeName"]));
}
John,
It would seem that one or more of the files is crashing the indexing. So wrap the e.Fields.Add("bodyText", e.Fields["FileTextContent"].Replace("\n", " ").Replace("\r", " ")); call in a try/catch, and in the catch log the error together with the media item it's trying to process. This way everything should index minus the file contents of the problematic files, and you will know which ones are the problem and why. Then we can look at how to get round it.
Regards
Ismail
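The try/catch suggestion could look roughly like the sketch below. It is written against a plain IDictionary<string, string> and a logging callback so it can be run on its own; SafeFieldMapper and TryMapField are hypothetical names. Two .NET behaviours are worth keeping in mind here: the dictionary indexer throws KeyNotFoundException when the source field is absent, and Add throws ArgumentException when the target key already exists. Whether either of those is what crashes this particular index is only a guess.

```csharp
using System;
using System.Collections.Generic;

public static class SafeFieldMapper
{
    // Copies fields[source] into fields[target] with line breaks replaced by spaces.
    // Never throws: problems are reported through the log callback, tagged with
    // the node id, and the method returns false.
    public static bool TryMapField(IDictionary<string, string> fields, string source,
                                   string target, int nodeId, Action<string> log)
    {
        try
        {
            string raw;
            if (!fields.TryGetValue(source, out raw) || raw == null)
            {
                log("Node " + nodeId + ": field '" + source + "' is missing or empty");
                return false;
            }

            // Indexer assignment overwrites an existing key instead of throwing like Add.
            fields[target] = raw.Replace("\n", " ").Replace("\r", " ");
            return true;
        }
        catch (Exception ex)
        {
            log("Node " + nodeId + ": could not map '" + source + "': " + ex.Message);
            return false;
        }
    }
}
```

In the media handler this might be called as SafeFieldMapper.TryMapField(e.Fields, "FileTextContent", "bodyText", e.NodeId, msg => Log.Add(LogTypes.Error, e.NodeId, msg)), assuming e.NodeId and the umbraco.BusinessLogic.Log.Add(LogTypes, int, string) overload are available in the Examine and Umbraco versions in use.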
Ismail - this is bizarre... I added error trapping and no errors are reported, but now FileTextContent is not storing anything. I will delete all the mapping fields and start from scratch... Also, there were a ton of other fields being indexed, and it doesn't appear to be indexing those now...
In regards to the fields, I bet it's because I added the indexSet property and it's setting up the index based on the defined properties... FileTextContent is still not indexed... I'm going to remove the indexSet property and try again...
<add name="PublicMediaIndexer" type="CogUmbracoExamineMediaIndexer.MediaIndexer, CogUmbracoExamineMediaIndexer" indexSet="PublicMediaIndexSet" extensions=".pdf,.doc,.docx,.xsl,.ppt" umbracoFileProperty="umbracoFile" />
Ah, you don't need those properties, as it's a media index and the media indexer does not put them in. However, you can still add them via GatheringNodeData.
I removed the indexSet property from the config file and stopped/restarted the app pool and website, and the defined fields are still being used. This has to be cached somewhere. I even tried changing the index location and it's still not working... any idea? thx!
starting from scratch and not inserting an indexSet value throws this error message in the error log:
15. <add name="PublicIndexerMedia" type="CogUmbracoExamineMediaIndexer.MediaIndexer, CogUmbracoExamineMediaIndexer" extensions=".pdf,.doc,.docx,.xsl,.ppt" umbracoFileProperty="umbracoFile" />
Error loading application startup handler: System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.TypeInitializationException: The type initializer for 'Examine.ExamineManager' threw an exception. ---> System.Configuration.ConfigurationErrorsException: Value cannot be null. Parameter name: indexSet on LuceneExamineIndexer provider has not been set in configuration and/or the IndexerData property has not been explicitly set (C:\inetpub\wwwroot_sandbox\config\ExamineSettings.config line 15) ---> System.ArgumentNullException: Value cannot be null. Parameter name: indexSet on LuceneExamineIndexer provider has not been set in configuration and/or the IndexerData property has not been explicitly set at Examine.LuceneEngine.Providers.LuceneIndexer.Initialize(String name, NameValueCollection config) at UmbracoExamine.BaseUmbracoIndexer.Initialize(String name, NameValueCollection config) at CogUmbracoExamineMediaIndexer.MediaIndexer.Initialize(String name, NameValueCollection config) at System.Web.Configuration.ProvidersHelper.InstantiateProvider(ProviderSettings providerSettings, Type providerType) --- End of inner exception stack trace --- at System.Web.Configuration.ProvidersHelper.InstantiateProvider(ProviderSettings providerSettings, Type providerType) at System.Web.Configuration.ProvidersHelper.InstantiateProviders(ProviderSettingsCollection configProviders, ProviderCollection providers, Type providerType) at Examine.ExamineManager.LoadProviders() at Examine.ExamineManager..cctor() --- End of inner exception stack trace --- at UmbracoExamine.UmbracoEventManager..ctor() --- End of inner exception stack trace --- at System.RuntimeTypeHandle.CreateInstance(RuntimeType type, Boolean publicOnly, Boolean noCheck, Boolean& canBeCached, RuntimeMethodHandleInternal& ctor, Boolean& bNeedSecurityCheck) at System.RuntimeType.CreateInstanceSlow(Boolean publicOnly, Boolean skipCheckThis, Boolean fillCache) at 
System.RuntimeType.CreateInstanceDefaultCtor(Boolean publicOnly, Boolean skipVisibilityChecks, Boolean skipCheckThis, Boolean fillCache) at System.Activator.CreateInstance(Type type, Boolean nonPublic) at umbraco.businesslogic.ApplicationStartupHandler..cctor()
after I add the indexSet property, FileTextContent is still empty and not being indexed...
Put the indexSet back. I was referring to
Ismail - it won't index anything without any IndexAttributeFields added.
Ismail - incredibly bizarre, but I went ahead and rolled everything out to our production environment, copied over the Examine config settings exactly in all files, rebuilt the new indexes, and everything is working... It seems very strange that our sandbox environment went crazy like that...