Examine Custom Index Scheduling
I have searched the forum and I may be being very dumb, but I can't find a definitive answer to this question:
If you create a completely custom, non-Umbraco-content indexer, using a configuration file as below, will it be rebuilt by the built-in scheduler, or do I have to write my own process for reindexing when I need to?
Or, in other words, can I ensure my index is regularly rebuilt simply by adding:
...to the above settings?
Rob,
I created a console application that called a web service which made the reindex call, then ran that using Windows Task Scheduler.
The above code is the HTTP handler; it would be better to do it with Web API or the like.
Regards
Ismail
Thanks for that Ismail, I will give that a go if my current idea fails!
I have a new problem now; I am trying currently to hook it into an existing task queue that runs a separate console process to process tasks from across the system.
The problem is that I don't have an HttpContext available, so although I am nearly there, I am stymied by this error:
Reindex members failed - TypeInitializationException: The type initializer for 'Examine.ExamineManager' threw an exception.(UnauthorizedAccessException: Access to the path '~/App_Data/TEMP/ExamineIndexes/InternalMember/' is denied.)
I've set the permissions up as per a site where the task queue works fine for processing temp files, so I'm assuming this is because it can't resolve the tilde path; have you ever tried to run Examine from a separate process?
I'd really like to get this going as it a) ties in better with the rest of the application, b) is easier to secure, and c) I don't have to worry about script timeouts if the database gets too big.
Right, I think I'm getting somewhere!
The offending code is in the Examine package, in Projects\Examine\LuceneEngine\Config\IndexSet.cs:
public DirectoryInfo IndexDirectory
{
    get
    {
        //TODO: Get this out of the index set. We need to use the Indexer's DataService to lookup the folder so it can be unit tested. Probably need DataServices on the searcher then too
        //we need to de-couple the context
        if (HttpContext.Current != null)
            return new DirectoryInfo(HttpContext.Current.Server.MapPath(this.IndexPath));
        else if (HostingEnvironment.ApplicationID != null)
            return new DirectoryInfo(HostingEnvironment.MapPath(this.IndexPath));
        else
            return new DirectoryInfo(this.IndexPath);
    }
}
...which basically means that you can either provide a ~/ app-rooted path, OR a physical path, but that's it - you can't provide a custom resolver - which would be GREAT, if any Examine core devs read this ;)
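To make the "custom resolver" idea concrete, the fallback could be pulled out into something like the sketch below. This is not anything Examine provides: `IndexPathResolver`, `Resolve` and the `appRoot` parameter are made-up names, and it assumes the console app is told the site's physical root.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of Examine: maps an index path to a physical
// directory without needing HttpContext or HostingEnvironment.
public static class IndexPathResolver
{
    // appRoot is an assumed parameter: the physical folder that "~/" stands for.
    public static DirectoryInfo Resolve(string indexPath, string appRoot)
    {
        if (string.IsNullOrEmpty(indexPath))
            throw new ArgumentException("An index path is required.", "indexPath");

        if (indexPath.StartsWith("~/", StringComparison.Ordinal))
        {
            // Strip the "~/" prefix and switch to the platform's directory separator.
            string relative = indexPath.Substring(2)
                .Replace('/', Path.DirectorySeparatorChar);
            return new DirectoryInfo(Path.Combine(appRoot, relative));
        }

        // Anything else is treated as a physical path already.
        return new DirectoryInfo(indexPath);
    }
}
```

With something like this, the same `~/App_Data/TEMP/ExamineIndexes/InternalMember/` value could be used by both the site and the console app, with only the root folder differing between them.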
I don't really like providing physical paths, but in order to limit this to your reindexing service, and therefore hopefully keep the site itself working if moved, you can create a duplicate of your ExamineIndex.config, replace all paths in there with physical ones, and then use that instead in the config file for the console app.
Note that you do need the Umbraco indexes in the new file as well as your custom one, as the manager always creates the default indexes. Or one of them, anyway; now it's working I've kind of lost the desire to look any more.
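As a rough illustration of the duplicated file: the first three set names follow the stock Umbraco defaults, but the file name, `CustomMemberIndexSet` and the `D:\Sites\MySite` root are placeholders, so treat this as a sketch rather than a drop-in file.

```xml
<?xml version="1.0"?>
<!-- ExamineIndex.Console.config (hypothetical name): a copy of
     ExamineIndex.config used only by the console app.
     Field definitions from the original file are omitted here. -->
<ExamineLuceneIndexSets>
  <!-- Default Umbraco sets, with "~/" swapped for an assumed physical root -->
  <IndexSet SetName="InternalIndexSet"
            IndexPath="D:\Sites\MySite\App_Data\TEMP\ExamineIndexes\Internal\" />
  <IndexSet SetName="InternalMemberIndexSet"
            IndexPath="D:\Sites\MySite\App_Data\TEMP\ExamineIndexes\InternalMember\" />
  <IndexSet SetName="ExternalIndexSet"
            IndexPath="D:\Sites\MySite\App_Data\TEMP\ExamineIndexes\External\" />
  <!-- Placeholder for the custom index set -->
  <IndexSet SetName="CustomMemberIndexSet"
            IndexPath="D:\Sites\MySite\App_Data\TEMP\ExamineIndexes\CustomMember\" />
</ExamineLuceneIndexSets>
```

The website keeps its original file with the `~/` paths, so only the console app is tied to a particular drive and folder.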
The way it works is the console app sits in the application root along with a configuration file that loads all required dependencies including the new physically rooted ExamineIndex.config; this can then be called via a System.Diagnostics.Process to start asynchronous processing from a web page - in which case it runs as the AppPool Identity - or you can run it from the Task Scheduler with an identity of your choosing.
Here is the console application config file; you can see it configuring required settings for my application and task queue, connection strings, everything needed to get Examine working, everything to make it look in the website /bin folder for support assemblies, and because I am indexing members, the system.web section to allow the membership framework to function when I load members to index. You can safely ignore this whole section unless you need to set up custom membership providers yourself.
Long post, but I'm hoping this will prove useful to someone!
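Kicking the console app off from a web page could look roughly like the following. Only `System.Diagnostics.Process` comes from the post; `ReindexLauncher`, `BuildStartInfo` and the executable name passed in are hypothetical.

```csharp
using System.Diagnostics;
using System.IO;

// Hypothetical launcher: starts the reindexing console app without blocking the caller.
public static class ReindexLauncher
{
    // appRoot: physical root of the site, where the console app is assumed to live.
    public static ProcessStartInfo BuildStartInfo(string appRoot, string exeName)
    {
        return new ProcessStartInfo
        {
            FileName = Path.Combine(appRoot, exeName),
            WorkingDirectory = appRoot,   // run with the site root as the current directory
            UseShellExecute = false,      // start the executable directly
            CreateNoWindow = true         // nothing to display on a server
        };
    }

    public static void Start(string appRoot, string exeName)
    {
        // Process.Start returns once the process has been launched; it does not
        // wait for the rebuild to finish. Disposing the handle does not stop
        // the child process.
        using (Process.Start(BuildStartInfo(appRoot, exeName))) { }
    }
}
```

When called from a page, the child process runs under whatever identity the web application itself is using, which is why folder permissions on the index directory matter.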
Rob,
When you say "I am trying currently to hook it into an existing task queue" what do you mean exactly?
Regards
Ismail
There is already a proprietary queue system that is used for managing various other long winded events; there is a simple call to enqueue a task defined as a set of parameters, and then a scheduled task on the server runs a queue processing console app that processes any outstanding requests.
I was trying to include reindexing the custom members in that system so it was all dependent on one thing, and all the administration for scheduling was done from one place.
I was a bit premature above - it is actually not running yet; everything seems to be good, the indexer is loading members and everything, but for some reason the returned dataset is not being added to the Lucene index, no errors or anything - and it can write to the files because it's clearing out the index!
It works fine if I use your dashboard plugin to rebuild the index, so something is wrong with the way I'm calling it from a console app.
Bit annoying, it's looking like I'll have to stop doing it the tricky way and just use your much simpler web service call so all the contexts are correct; spent far too much time on this already.
Rob,
OK, no worries. I have just been in the guts of Examine, looking at the queueing method there, as I am brewing something search related; hopefully more information once I have something working.
Regards
Ismail
Cool, let me know, this is the first time I've used Examine and I'm keen to use it more.
I've actually now solved my problem - after looking in the source to see where RebuildIndex() went I realised I needed to add runAsync="false" to my custom provider in ExamineSettings.config.
I presume this is because the async stuff uses something within Umbraco that my console app does not have available.
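For reference, the attribute sits on the provider's `<add>` element. The sketch below assumes a `SimpleDataIndexer`-based provider; the provider name, data service type, index type and index set are all placeholders rather than the settings actually used here.

```xml
<Examine>
  <ExamineIndexProviders>
    <providers>
      <!-- Placeholder names throughout; runAsync="false" is the relevant part -->
      <add name="CustomMemberIndexer"
           type="Examine.LuceneEngine.Providers.SimpleDataIndexer, Examine"
           dataService="MyApp.Search.MemberDataService, MyApp"
           indexTypes="CustomMember"
           indexSet="CustomMemberIndexSet"
           runAsync="false" />
    </providers>
  </ExamineIndexProviders>
</Examine>
```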
I think enabling it on the website itself would massively slow down the app starting up, because it looks like Examine rebuilds all indexes on application start, so I think I'll create a duplicate of ExamineSettings.config, like I did for the index sets, just for the console app.
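Pointing the console app at its own copies of both files might look something like this. It is not the config file mentioned earlier, which isn't reproduced in this thread; the `.Console.config` file names are hypothetical, and the section type names are as I'd expect them in a stock Umbraco web.config, so check them against your own.

```xml
<?xml version="1.0"?>
<configuration>
  <configSections>
    <section name="Examine" type="Examine.Config.ExamineSettings, Examine" requirePermission="false" />
    <section name="ExamineLuceneIndexSets" type="Examine.LuceneEngine.Config.IndexSets, Examine" requirePermission="false" />
  </configSections>

  <!-- Hypothetical file names for the console-only duplicates -->
  <Examine configSource="config\ExamineSettings.Console.config" />
  <ExamineLuceneIndexSets configSource="config\ExamineIndex.Console.config" />

  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <!-- Look in the website's bin folder for Examine, Lucene.Net and friends -->
      <probing privatePath="bin" />
    </assemblyBinding>
  </runtime>
</configuration>
```

Both `configSource` and `privatePath` are relative paths, which works here because the console app sits in the application root.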
Thanks for your help!
EDIT: I have done the above and everything works perfectly now, so it is possible to use Examine outside of Umbraco and any web context when you know how!