  • Dan 1288 posts 3921 karma points c-trib
    Nov 12, 2017 @ 14:02

    Auto index Examine fields from custom data source, using Lucene spatial search

    Hi,

    I'm implementing a 'nearest neighbour' facility using Lucene spatial search. I have a custom database table containing properties (physical properties such as hotels, not Umbraco properties) with name, description, latitude and longitude fields. These need to be injected into the Examine index along with the Cartesian tier fields from the Lucene spatial utility. Only data from the custom table needs to be indexed; I'm not mixing in any Umbraco content. The code that does the indexing is as follows:

    public class PropertyIndexDataService : ISimpleDataService
    {
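        public const double KmsToMiles = 0.621371192;
        //Despite their names, MaxKms and MinKms hold distances in miles (kilometres * KmsToMiles), which is the unit passed to BestFit below
        public const double MaxKms = 5000 * KmsToMiles;
        public const double MinKms = 1 * KmsToMiles;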
    
        private readonly List<CartesianTierPlotter> _ctps = new List<CartesianTierPlotter>();
        private readonly IProjector _projector = new SinusoidalProjector();
    
        public PropertyIndexDataService() { }
    
        public IEnumerable<SimpleDataSet> GetAllData(string indexType)
        {
            CartesianTierPlotter ctp = new CartesianTierPlotter(0, _projector, CartesianTierPlotter.DefaltFieldPrefix);
    
            //The starting tier (the largest grid square) calculated by providing the furthest distance in miles that we want to search
            int startTier = ctp.BestFit(MaxKms);
    
            //The last tier (the smallest grid square) calculated by providing the closest distance in miles that we want to search
            int endTier = ctp.BestFit(MinKms);
    
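            //Clear any plotters left over from a previous call so the tier fields are not added twice (a duplicate key would make rowData.Add throw)
            _ctps.Clear();
    
            for (int i = startTier; i <= endTier; i++)
            {
                _ctps.Add(new CartesianTierPlotter(i, _projector, CartesianTierPlotter.DefaltFieldPrefix));
            }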
    
            List<SimpleDataSet> data = new List<SimpleDataSet>();
    
            var db = ApplicationContext.Current.DatabaseContext.Database;
    
            List<Property> Properties = db.Fetch<Property>("SELECT * FROM Property");
    
            foreach (Property property in Properties)
            {
                var rowData = new Dictionary<string, string>();
                rowData.Add("name", property.name);
                rowData.Add("description", property.description);
    
                for (int i = 0; i < _ctps.Count; i++)
                {
                    CartesianTierPlotter plotter = _ctps[i];
    
                    //Calculate this tier's grid box from the property's location
                    var boxId = plotter.GetTierBoxId(property.latitude, property.longitude);
    
                    //Add the tier data to the indexer
                    rowData.Add(plotter.GetTierFieldName(), NumericUtils.DoubleToPrefixCoded(boxId));
                }
    
                data.Add(new SimpleDataSet()
                {
                    NodeDefinition = new IndexedNode()
                    {
                        NodeId = property.id,
                        Type = "CustomData"
                    },
                    RowData = rowData
                });
            }
            return data;
        }
    }
    

    It seems like the correct values are being generated but nothing shows up in the Examine index unless I manually add all fields into IndexUserFields in the ExamineIndex.config. I was under the impression that if no IndexUserFields were specified, it would index everything.

    So the following yields no data in the index:

    <IndexSet SetName="PropertyIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/PropertyIndexSet" />
    

    ... but the following results in a fully populated index:

    <IndexSet SetName="PropertyIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/PropertyIndexSet">
      <IndexUserFields>
        <add Name="name"/>
        <add Name="description"/>
        <add Name="_tier_3"/>
        <add Name="_tier_4"/>
        <add Name="_tier_5"/>
        <add Name="_tier_6"/>
        <add Name="_tier_7"/>
        <add Name="_tier_8"/>
        <add Name="_tier_9"/>
        <add Name="_tier_10"/>
        <add Name="_tier_11"/>
        <add Name="_tier_12"/>
        <add Name="_tier_13"/>
        <add Name="_tier_14"/>
        <add Name="_tier_15"/>
      </IndexUserFields>
    </IndexSet>
    

    I don't fully understand how the Cartesian plotters work, so I'm not particularly comfortable adding the tier fields manually in the config. I'd rather it just indexed everything without any configuration.

    All of the examples I can find which cover this topic (e.g. this blog article) suggest using the low level Lucene document writing events to populate the index, but they all involve indexing Umbraco content which is a different kind of implementation to indexing custom content - I can't reconcile the two approaches.

    Could anyone suggest how I can adapt my logic to write all data to the index without requiring manual configuration?

    Many thanks.

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Nov 13, 2017 @ 17:36

    Dan,

    When building a custom indexer you have to make sure the fields are added to the config under IndexUserFields, otherwise they will not be in the index. I have this working in a sample that I run through on the Examine course. It is for a custom db table which has longitude and latitude, and my config looks like this:

    <IndexUserFields>
      <add Name="name" EnableSorting="true"/>
      <add Name="county" EnableSorting="true"/>
      <add Name="country"/>
      <add Name="grid_reference"/>
      <add Name="latitude"/>
      <add Name="longitude"/>
      <add Name="postcode_sector"/>
    </IndexUserFields>
    

    I do not add the tier fields in the config and it works. The difference between mine and yours is that you are adding the tiers in the indexer, whereas I add mine using the document writing event. See this gist: https://gist.github.com/ismailmayat/3902c660527c8b3d20b38ae724ab9892
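
    As a rough sketch of the document writing approach (an illustration, not the contents of the gist): the tier fields are written straight onto the Lucene document, so they bypass IndexUserFields. It assumes Umbraco 7 with Examine 0.1.x and the Lucene.Net contrib spatial assembly; the namespaces may differ in your versions. The class name and the indexer name "PropertyIndexer" are hypothetical, and latitude and longitude are assumed to be indexed as invariant-culture strings.

    ```csharp
    using System.Collections.Generic;
    using System.Globalization;
    using Examine;
    using Examine.LuceneEngine;
    using Examine.LuceneEngine.Providers;
    using Lucene.Net.Documents;
    using Lucene.Net.Spatial.Tier.Projectors;
    using Lucene.Net.Util;
    using Umbraco.Core;

    // Hypothetical class name; Umbraco discovers ApplicationEventHandler subclasses at startup.
    public class PropertyIndexEvents : ApplicationEventHandler
    {
        private const double KmsToMiles = 0.621371192;
        private readonly List<CartesianTierPlotter> _plotters = new List<CartesianTierPlotter>();

        protected override void ApplicationStarted(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
        {
            IProjector projector = new SinusoidalProjector();
            var probe = new CartesianTierPlotter(0, projector, CartesianTierPlotter.DefaltFieldPrefix);

            // BestFit takes miles: largest grid square first, smallest last.
            int startTier = probe.BestFit(5000 * KmsToMiles);
            int endTier = probe.BestFit(1 * KmsToMiles);
            for (int tier = startTier; tier <= endTier; tier++)
            {
                _plotters.Add(new CartesianTierPlotter(tier, projector, CartesianTierPlotter.DefaltFieldPrefix));
            }

            // "PropertyIndexer" is an assumed provider name; use the one from ExamineSettings.config.
            var indexer = (LuceneIndexer)ExamineManager.Instance.IndexProviderCollection["PropertyIndexer"];
            indexer.DocumentWriting += OnDocumentWriting;
        }

        private void OnDocumentWriting(object sender, DocumentWritingEventArgs e)
        {
            string latText, lngText;
            double lat, lng;

            // Skip rows without usable coordinates rather than failing the whole index run.
            if (!e.Fields.TryGetValue("latitude", out latText) || !e.Fields.TryGetValue("longitude", out lngText)) return;
            if (!double.TryParse(latText, NumberStyles.Float, CultureInfo.InvariantCulture, out lat)) return;
            if (!double.TryParse(lngText, NumberStyles.Float, CultureInfo.InvariantCulture, out lng)) return;

            foreach (CartesianTierPlotter plotter in _plotters)
            {
                double boxId = plotter.GetTierBoxId(lat, lng);

                // Added directly to the Lucene document, so no IndexUserFields entry is needed.
                e.Document.Add(new Field(
                    plotter.GetTierFieldName(),
                    NumericUtils.DoubleToPrefixCoded(boxId),
                    Field.Store.YES,
                    Field.Index.NOT_ANALYZED_NO_NORMS));
            }
        }
    }
    ```

    With this in place the data service only has to return the plain columns (name, description, latitude, longitude), and only those need listing under IndexUserFields.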

  • Dan 1288 posts 3921 karma points c-trib
    Nov 13, 2017 @ 22:34

    That has really connected the dots in my understanding and seems to be working nicely, thanks Ismail!

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Nov 14, 2017 @ 07:32

    Just one thing: how often does the data change? When you add or update data you will need to rebuild the index, unless you handle it at the point of change and add or update the individual row.
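
    Handling it at the point of change could look something like the sketch below. It assumes Examine 0.1.x, where the indexer exposes ReIndexNode and DeleteFromIndex, and where ToExamineXml is the extension method that turns a row dictionary into the XML Examine indexes; please check both against your version. The class name and the indexer name "PropertyIndexer" are hypothetical, and the field names match a config that lists latitude and longitude under IndexUserFields.

    ```csharp
    using System.Collections.Generic;
    using System.Globalization;
    using Examine;
    using Examine.LuceneEngine;

    // Hypothetical helper; call it from wherever rows in the custom table are saved or deleted.
    public static class PropertyIndexUpdater
    {
        private const string IndexerName = "PropertyIndexer"; // assumed provider name
        private const string IndexType = "CustomData";        // the Type used by the data service

        public static void Upsert(int id, string name, string description, double latitude, double longitude)
        {
            var rowData = new Dictionary<string, string>
            {
                { "name", name },
                { "description", description },
                { "latitude", latitude.ToString(CultureInfo.InvariantCulture) },
                { "longitude", longitude.ToString(CultureInfo.InvariantCulture) }
            };

            // Re-indexing one node replaces any existing document with the same id.
            var indexer = ExamineManager.Instance.IndexProviderCollection[IndexerName];
            indexer.ReIndexNode(rowData.ToExamineXml(id, IndexType), IndexType);
        }

        public static void Remove(int id)
        {
            var indexer = ExamineManager.Instance.IndexProviderCollection[IndexerName];
            indexer.DeleteFromIndex(id.ToString(CultureInfo.InvariantCulture));
        }
    }
    ```

    If the tier fields are added in a document writing handler, they are recalculated for the row on each Upsert, so no full rebuild is needed.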

  • Dmitry Morlender 19 posts 100 karma points
    Jul 24, 2018 @ 16:36

    Hi guys, I'm seeing some very strange behaviour: when I set the radius to 10 km I get only 2 results, but when I change the radius to 15 km I get 3 results. Have you come across this?

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Jul 26, 2018 @ 08:01

    If you increase the radius I would expect to see more results. For the 3 results you see, are you writing out their distances?

    Regards

    Ismail

  • Dmitry Morlender 19 posts 100 karma points
    Jul 26, 2018 @ 11:27

    Yes, and every one of them is less than 10 km away. I also had places showing up at 25 km and disappearing at 50 km... very strange and inconsistent behavior, I must say. My solution was to get all the places using Examine and filter them myself.
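
    Filtering the results yourself can be done with a plain great-circle distance check, for example as below. This is only a sketch: the Place class and the GeoFilter helper are hypothetical names, the Earth is treated as a sphere of radius 6371 km, and the places are assumed to have already been loaded from the Examine results.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical shape of a search result with coordinates in decimal degrees.
    public sealed class Place
    {
        public string Name { get; set; }
        public double Latitude { get; set; }
        public double Longitude { get; set; }
    }

    public static class GeoFilter
    {
        private const double EarthRadiusKm = 6371.0; // mean radius, spherical approximation

        // Haversine great-circle distance between two points, in kilometres.
        public static double HaversineKm(double lat1, double lon1, double lat2, double lon2)
        {
            double dLat = ToRadians(lat2 - lat1);
            double dLon = ToRadians(lon2 - lon1);
            double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2)
                     + Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2))
                     * Math.Sin(dLon / 2) * Math.Sin(dLon / 2);

            // Clamp to 1 to guard against rounding pushing the square root just above 1.
            return 2 * EarthRadiusKm * Math.Asin(Math.Min(1.0, Math.Sqrt(a)));
        }

        // Returns the places within radiusKm of the origin, nearest first.
        public static List<Place> WithinRadius(IEnumerable<Place> places, double originLat, double originLon, double radiusKm)
        {
            return places
                .Select(p => new { Place = p, Km = HaversineKm(originLat, originLon, p.Latitude, p.Longitude) })
                .Where(x => x.Km <= radiusKm)
                .OrderBy(x => x.Km)
                .Select(x => x.Place)
                .ToList();
        }

        private static double ToRadians(double degrees)
        {
            return degrees * Math.PI / 180.0;
        }
    }
    ```

    Calculating the distance for every row is fine for a few thousand places; for larger sets a cheap latitude/longitude bounding box check first would cut down the number of haversine calls.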

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Jul 26, 2018 @ 12:15

    Dmitry,

    Try rebuilding the index, then try the search again. I suspect it's something to do with each time a new item is added.

    Regards

    Ismail

  • Dmitry Morlender 19 posts 100 karma points
    Jul 26, 2018 @ 13:14

    Tried that :( nothing changed.
