  • Rob Watkins 369 posts 701 karma points
    Mar 21, 2014 @ 11:32
    0

    Examine Custom Index Scheduling

    I have searched the forum and I may be being very dumb, but I can't find a definitive answer to this question:

    If you create a completely custom, non-Umbraco-content indexer, using a configuration file as below, will it be rebuilt by the built-in scheduler, or do I have to write my own process for reindexing when I need to?

    <add name="UFCMemberIndexer"
        type="Examine.LuceneEngine.Providers.SimpleDataIndexer, Examine"
        dataService="UFC.Indexing.UFCMemberFinderDataService, UFC"
        analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"
        indexTypes="CustomData"
        indexSet="UFCMemberSearcherIndexSet" />
    

    Or, in other words, can I ensure my index is regularly rebuilt simply by adding:

    interval="600"
    runAsync="true"
    

    ...to the above settings?
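
    For context, the dataService attribute in the configuration above points at a class that feeds rows to SimpleDataIndexer. The following is only a rough sketch of what such a class might look like, assuming the ISimpleDataService, SimpleDataSet and IndexedNode types as I remember them from Examine's Lucene engine; check them against your Examine version. The LoadMembers helper, the MemberRow type and the field names are hypothetical placeholders, not part of the original post.

    ```csharp
    using System.Collections.Generic;
    using Examine;
    using Examine.LuceneEngine;

    namespace UFC.Indexing
    {
        // Hypothetical sketch of the class named in the dataService attribute.
        public class UFCMemberFinderDataService : ISimpleDataService
        {
            // Invoked by the indexer for each entry in indexTypes ("CustomData" here).
            public IEnumerable<SimpleDataSet> GetAllData(string indexType)
            {
                var id = 0;
                foreach (var member in LoadMembers())
                {
                    yield return new SimpleDataSet
                    {
                        // Each row needs a unique id and the index type it belongs to.
                        NodeDefinition = new IndexedNode { NodeId = ++id, Type = indexType },
                        // Assumed field names; they would need to match the index set's fields.
                        RowData = new Dictionary<string, string>
                        {
                            { "name", member.Name },
                            { "email", member.Email }
                        }
                    };
                }
            }

            // Placeholder data source; a real service would query the member store.
            private static IEnumerable<MemberRow> LoadMembers()
            {
                yield return new MemberRow { Name = "Example Member", Email = "member@example.com" };
            }

            private class MemberRow
            {
                public string Name;
                public string Email;
            }
        }
    }
    ```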

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Mar 21, 2014 @ 11:45
    0

    Rob,

    I created a console application that called a web service, which made the reindex call, and then ran that console application using Windows Task Scheduler:

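    using System;
    using System.Collections.Generic;
    using System.Web;
    using Examine;

    // HTTP handler that rebuilds every listed index, or only the one named in ?index=
    public class RebuildIndexes : IHttpHandler
    {
        readonly List<string> indexes = new List<string> { "index1", "InternalIndexer", "sqlIndexer" };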
        public void ProcessRequest(HttpContext context)
        {
            context.Response.ContentType = "text/plain";
            try
            {
                if(string.IsNullOrEmpty(context.Request.QueryString["index"]))
                {
                    foreach (var index in indexes)
                    {
                        ExamineManager.Instance.IndexProviderCollection[index].RebuildIndex();
                    }
    
                }
                else
                {
                    ExamineManager.Instance.IndexProviderCollection[context.Request.QueryString["index"]].RebuildIndex();
                }
                context.Response.Write("done");
            }
            catch(Exception ex)
            {
                context.Response.Write(ex.ToString());
            }
        }
    
        public bool IsReusable
        {
            get
            {
                return false;
            }
        }
    }
    

    The above code is the HTTP handler; it would be better to do it with Web API or the like.
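
    The console application side of this is not shown in the post. A minimal sketch of what it might look like is below, assuming the handler is mapped to a URL such as http://localhost/RebuildIndexes.ashx (a made-up address; pass the real one as the first argument). It relies only on the handler's behaviour shown above: "done" on success, the exception text otherwise.

    ```csharp
    using System;
    using System.Net;

    // Minimal console app intended to be run from Windows Task Scheduler.
    class RebuildIndexesClient
    {
        static int Main(string[] args)
        {
            // Hypothetical default URL; append ?index=name to rebuild a single index.
            var url = args.Length > 0 ? args[0] : "http://localhost/RebuildIndexes.ashx";
            try
            {
                using (var client = new WebClient())
                {
                    var body = client.DownloadString(url);
                    Console.WriteLine(body);
                    // Non-zero exit code lets the scheduler flag a failed run.
                    return body.Trim() == "done" ? 0 : 1;
                }
            }
            catch (WebException ex)
            {
                Console.Error.WriteLine(ex.Message);
                return 2;
            }
        }
    }
    ```

    A long rebuild may run past the client's default request timeout or the site's script timeout, so this approach probably suits smaller indexes best.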

    Regards

    Ismail

  • Rob Watkins 369 posts 701 karma points
    Mar 21, 2014 @ 13:41
    0

    Thanks for that Ismail, I will give that a go if my current idea fails!

    I have a new problem now: I am currently trying to hook it into an existing task queue that runs a separate console process to handle tasks from across the system.

    The problem is that I don't have an HttpContext going, so although I am nearly there, I am stymied by this error:

    Reindex members failed - TypeInitializationException: The type initializer for 'Examine.ExamineManager' threw an exception.(UnauthorizedAccessException: Access to the path '~/App_Data/TEMP/ExamineIndexes/InternalMember/' is denied.)

    I've set the permissions up as per a site where the task queue works fine for processing temp files, so I'm assuming this is because it can't resolve the tilde path; have you ever tried to run Examine from a separate process?

    I'd really like to get this going as it a) ties in better with the rest of the application, b) is easier to secure, and c) means I don't have to worry about script timeouts if the database gets too big.

  • Rob Watkins 369 posts 701 karma points
    Mar 21, 2014 @ 16:21
    100

    Right, I think I'm getting somewhere!

    The offending code is in the Examine package Projects\Examine\LuceneEngine\Config\IndexSet.cs:

    public DirectoryInfo IndexDirectory
    {
        get
        {
            //TODO: Get this out of the index set. We need to use the Indexer's DataService to lookup the folder so it can be unit tested. Probably need DataServices on the searcher then too
    
            //we need to de-couple the context
            if (HttpContext.Current != null)
                return new DirectoryInfo(HttpContext.Current.Server.MapPath(this.IndexPath));
            else if (HostingEnvironment.ApplicationID != null)
                return new DirectoryInfo(HostingEnvironment.MapPath(this.IndexPath));
            else
                return new DirectoryInfo(this.IndexPath);
        }
    }
    

    ...which basically means that you can provide either a ~/ app-rooted path or a physical path, but that's it. You can't provide a custom resolver, which would be GREAT, if any Examine core devs read this ;)

    I don't really like providing physical paths, but in order to limit this to your reindexing service, and therefore hopefully keep the site itself working if moved, you can create a duplicate of your ExamineIndex.config, replace all paths in there with physical ones, and then use that instead in the config file for the console app.

    Note that you do need the Umbraco indexes in the new file as well as your custom one, as the manager always creates the default indexes. Or one of them, anyway; now that it's working I've kind of lost the desire to look any further.
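
    As a rough illustration of the duplicated index set file described above, something like the following could work. The physical paths (D:\Sites\MySite\...) and the field name are made up, and InternalMemberIndexSet is assumed to be the name of the Umbraco member index set in your existing ExamineIndex.config; copy the real set definitions from your own file and change only the IndexPath values.

    ```xml
    <?xml version="1.0"?>
    <!-- config\ExamineIndex_QUEUE_PROCESS.config: copy used only by the console app -->
    <ExamineLuceneIndexSets>
      <!-- Umbraco member index set (assumed name), kept so the manager can initialise it -->
      <IndexSet SetName="InternalMemberIndexSet"
                IndexPath="D:\Sites\MySite\App_Data\TEMP\ExamineIndexes\InternalMember\" />
      <!-- Custom index set; path and field are hypothetical examples -->
      <IndexSet SetName="UFCMemberSearcherIndexSet"
                IndexPath="D:\Sites\MySite\App_Data\TEMP\ExamineIndexes\UFCMember\">
        <IndexUserFields>
          <add Name="name" />
        </IndexUserFields>
      </IndexSet>
    </ExamineLuceneIndexSets>
    ```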

    The way it works is the console app sits in the application root along with a configuration file that loads all required dependencies including the new physically rooted ExamineIndex.config; this can then be called via a System.Diagnostics.Process to start asynchronous processing from a web page - in which case it runs as the AppPool Identity - or you can run it from the Task Scheduler with an identity of your choosing.
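
    Starting the console app from a web page could look roughly like the sketch below. The executable name QueueProcessor.exe is hypothetical (the post only says the app sits in the application root), and the class name is made up for illustration.

    ```csharp
    using System.Diagnostics;
    using System.IO;
    using System.Web.Hosting;

    public static class QueueProcessLauncher
    {
        // Starts the console app and returns immediately, without waiting for it to finish.
        public static void Start()
        {
            // Hypothetical executable name, resolved relative to the application root.
            var exePath = HostingEnvironment.MapPath("~/QueueProcessor.exe");

            var startInfo = new ProcessStartInfo
            {
                FileName = exePath,
                WorkingDirectory = Path.GetDirectoryName(exePath),
                UseShellExecute = false, // launch directly, no shell involved
                CreateNoWindow = true    // no console window on the server
            };

            // Disposing only releases our handle; the child process keeps running.
            using (Process.Start(startInfo)) { }
        }
    }
    ```

    Started this way the process runs under the AppPool identity, as noted above, so that identity needs write access to the physical index folders.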

    Here is the console application config file. You can see it configuring the required settings for my application and task queue, the connection strings, everything needed to get Examine working, and everything needed to make it look in the website /bin folder for support assemblies. Because I am indexing members, it also has a system.web section to allow the membership framework to function when I load members to index; you can safely ignore that whole section unless you need to set up custom membership providers yourself.

    Long post, but I'm hoping this will prove useful to someone!

    <?xml version="1.0" encoding="utf-8"?>
    <configuration>
    
        <configSections>
            <section name="MyApplicationSettings" type="MyApplication.Settings, MyApplication" />
            <section name="ActionQueueSettings" type="cFront.Scheduling.ActionQueueSettings, cfActionQueue" />
            <section name="Examine" type="Examine.Config.ExamineSettings, Examine" requirePermission="false" />
            <section name="ExamineLuceneIndexSets" type="UmbracoExamine.Config.ExamineLuceneIndexes, UmbracoExamine" requirePermission="false" />
        </configSections>
    
        <MyApplicationSettings configSource="config\MyApplicationSettings.config" />    
        <ActionQueueSettings configSource="config\ActionQueueSettings.config" />
    
        <Examine configSource="config\ExamineSettings.config" />
        <ExamineLuceneIndexSets configSource="config\ExamineIndex_QUEUE_PROCESS.config" />
    
        <appSettings configSource="config\appSettings.config" />
    
        <connectionStrings>
            <add name="AppMainConnection" connectionString="YOUR CONNECTION STRING" />
        </connectionStrings>
    
        <runtime>
            <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
                <probing privatePath="bin"/>
            </assemblyBinding>
        </runtime>
    
        <system.web>
            <!-- Membership Provider - Simplified user table with bcrypt hashing -->
            <membership defaultProvider="BCryptMembershipProvider" userIsOnlineTimeWindow="15">
                <providers>
                    <clear />
                    <add name="BCryptMembershipProvider" type="cFront.Web.Security.BCryptMembershipProvider, cfBCryptMembershipProvider" useBuiltInLogging="1" connectionStringName="AppMainConnection" />
                </providers>
            </membership>
            <!-- Roles Provider - Simplified user table with bcrypt hashing -->
            <roleManager enabled="true" defaultProvider="BCryptRoleProvider">
                <providers>
                    <clear />
                    <add name="BCryptRoleProvider" type="cFront.Web.Security.BCryptRoleProvider, cfBCryptMembershipProvider" connectionStringName="AppMainConnection" availableRoles="SuperUser,Administrator,User" />
                </providers>
            </roleManager>
    
        </system.web>
    </configuration>
    
  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Mar 21, 2014 @ 16:58
    0

    Rob,

    When you say "I am trying currently to hook it into an existing task queue" what do you mean exactly?

    Regards

    Ismail

  • Rob Watkins 369 posts 701 karma points
    Mar 21, 2014 @ 17:04
    0

    There is already a proprietary queue system that is used for managing various other long winded events; there is a simple call to enqueue a task defined as a set of parameters, and then a scheduled task on the server runs a queue processing console app that processes any outstanding requests.

    I was trying to include reindexing the custom members in that system so it was all dependent on one thing, and all the administration for scheduling was done from one place.

    I was a bit premature above - it is actually not running yet; everything seems to be good, the indexer is loading members and everything, but for some reason the returned dataset is not being added to the Lucene index, no errors or anything - and it can write to the files because it's clearing out the index!

    It works fine if I use your dashboard plugin to rebuild the index, so something is wrong with the way I'm calling it from a console app.

    Bit annoying, it's looking like I'll have to stop doing it the tricky way and just use your much simpler web service call so all the contexts are correct; spent far too much time on this already.

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Mar 21, 2014 @ 17:12
    0

    Rob,

    OK, no worries. I have just been in the guts of Examine, looking at the queueing method there, as I am looking at brewing something search-related. Hopefully I'll have more information once I have something working.

    Regards

    Ismail

  • Rob Watkins 369 posts 701 karma points
    Mar 21, 2014 @ 17:31
    0

    Cool, let me know; this is the first time I've used Examine and I'm keen to use it more.

    I've actually now solved my problem - after looking in the source to see where RebuildIndex() went I realised I needed to add runAsync="false" to my custom provider in ExamineSettings.config.
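
    For illustration, this is the provider entry from the first post with that attribute added; nothing else is changed.

    ```xml
    <add name="UFCMemberIndexer"
         type="Examine.LuceneEngine.Providers.SimpleDataIndexer, Examine"
         dataService="UFC.Indexing.UFCMemberFinderDataService, UFC"
         analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"
         indexTypes="CustomData"
         indexSet="UFCMemberSearcherIndexSet"
         runAsync="false" />
    ```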

    I presume this is because the async stuff uses something within Umbraco that my console app does not have available.

    I think setting that on the website itself would massively slow down the app starting up, because it looks like Examine rebuilds all indexes on application start, so I think I'll create a duplicate of ExamineSettings just for the console app, like I did for the index sets.

    Thanks for your help!

    EDIT: I have done the above and everything works perfectly now, so it is possible to use Examine outside of Umbraco and any web context when you know how!
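
    For anyone following along, the console side might be as small as the sketch below. It assumes the app's config file loads the physically rooted index sets and a provider with runAsync="false", and it uses the UFCMemberIndexer provider name from the first post; the class name is made up.

    ```csharp
    using System;
    using Examine;

    class ReindexMembers
    {
        static int Main()
        {
            try
            {
                // Same call as the HTTP handler earlier in the thread, but made
                // from a standalone process. With runAsync="false" the rebuild is
                // expected to finish before this call returns.
                ExamineManager.Instance.IndexProviderCollection["UFCMemberIndexer"].RebuildIndex();
                Console.WriteLine("done");
                return 0;
            }
            catch (Exception ex)
            {
                // Config or path problems surface here, typically wrapped in a
                // TypeInitializationException from ExamineManager.
                Console.Error.WriteLine("Reindex members failed - " + ex);
                return 1;
            }
        }
    }
    ```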
