Auto index Examine fields from custom data source, using Lucene spatial search
Hi,
I'm implementing a 'nearest neighbour' facility using Lucene spatial search. I have a custom database table containing properties (as in, physical properties like hotels rather than Umbraco properties) with name, description, latitude and longitude fields, which need to be injected into the Examine index along with the Cartesian Tier fields from the Lucene spatial utility. Note that only data from the custom table needs to be indexed; I'm not mixing in any Umbraco content. The code which does the indexing is as follows:
using System.Collections.Generic;
using Examine.LuceneEngine;
using Lucene.Net.Spatial.Tier;
using Lucene.Net.Spatial.Tier.Projectors;
using Lucene.Net.Util;
using Umbraco.Core;

public class PropertyIndexDataService : ISimpleDataService
{
    public const double KmsToMiles = 0.621371192;
    // Despite their names, these two values are in miles (kilometres * KmsToMiles),
    // which is the unit passed to BestFit below.
    public const double MaxKms = 5000 * KmsToMiles;
    public const double MinKms = 1 * KmsToMiles;

    private readonly List<CartesianTierPlotter> _ctps = new List<CartesianTierPlotter>();
    private readonly IProjector _projector = new SinusoidalProjector();

    public PropertyIndexDataService() { }

    public IEnumerable<SimpleDataSet> GetAllData(string indexType)
    {
        // Reset the plotters so a second rebuild does not add each tier field twice
        _ctps.Clear();

        CartesianTierPlotter ctp = new CartesianTierPlotter(0, _projector, CartesianTierPlotter.DefaltFieldPrefix);
        // The starting tier (the largest grid square), calculated from the furthest distance in miles that we want to search
        int startTier = ctp.BestFit(MaxKms);
        // The last tier (the smallest grid square), calculated from the closest distance in miles that we want to search
        int endTier = ctp.BestFit(MinKms);
        for (int i = startTier; i <= endTier; i++)
        {
            _ctps.Add(new CartesianTierPlotter(i, _projector, CartesianTierPlotter.DefaltFieldPrefix));
        }

        List<SimpleDataSet> data = new List<SimpleDataSet>();
        var db = ApplicationContext.Current.DatabaseContext.Database;
        List<Property> properties = db.Fetch<Property>("SELECT * FROM Property");

        foreach (Property property in properties)
        {
            var rowData = new Dictionary<string, string>();
            rowData.Add("name", property.name);
            rowData.Add("description", property.description);

            foreach (CartesianTierPlotter plotter in _ctps)
            {
                // Calculate this tier's grid box from the property's location
                var boxId = plotter.GetTierBoxId(property.latitude, property.longitude);
                // Add the tier data to the row that will be indexed
                rowData.Add(plotter.GetTierFieldName(), NumericUtils.DoubleToPrefixCoded(boxId));
            }

            data.Add(new SimpleDataSet()
            {
                NodeDefinition = new IndexedNode()
                {
                    NodeId = property.id,
                    Type = "CustomData"
                },
                RowData = rowData
            });
        }

        return data;
    }
}
It seems like the correct values are being generated but nothing shows up in the Examine index unless I manually add all fields into IndexUserFields in the ExamineIndex.config. I was under the impression that if no IndexUserFields were specified, it would index everything.
So the following yields no data in the index:
... but the following results in a fully populated index:
I don't fully understand how the Cartesian Plotters work, so I'm not particularly comfortable adding the tier fields manually in the config. I'd rather it just indexed everything without any configuration.
All of the examples I can find which cover this topic (e.g. this blog article) suggest using the low-level Lucene document writing events to populate the index, but they all involve indexing Umbraco content, which is a different kind of implementation from indexing custom content. I can't reconcile the two approaches.
Could anyone suggest how I can adapt my logic to write all data to the index without requiring manual configuration?
Many thanks.
Dan,
When building a custom indexer you have to ensure the fields are added to the config under IndexUserFields, otherwise they will not be in the index. I have this working in a sample that I run through on the Examine course. It is for a custom db table which has longitude and latitude. My config looks like:
I do not add the tier fields in the config and it works. The difference between mine and yours is that you are adding the tiers in the indexer, whereas I add mine using the document writing event. See this gist: https://gist.github.com/ismailmayat/3902c660527c8b3d20b38ae724ab9892
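For readers who cannot open the gist, here is a minimal sketch of the document writing approach described above. It is not a copy of the gist. It assumes an Umbraco 7 style ApplicationEventHandler, an indexer registered under the hypothetical name "PropertyIndexer", and that the data service writes "latitude" and "longitude" into RowData as invariant-culture strings and that both are listed under IndexUserFields. The DocumentWriting event and its Fields and Document members are used as I understand Examine's LuceneIndexer to expose them, so check them against the Examine version you are running.

```csharp
using System.Collections.Generic;
using System.Globalization;
using Examine;
using Examine.LuceneEngine;
using Examine.LuceneEngine.Providers;
using Lucene.Net.Documents;
using Lucene.Net.Spatial.Tier;
using Lucene.Net.Spatial.Tier.Projectors;
using Lucene.Net.Util;
using Umbraco.Core;

public class PropertyIndexEvents : ApplicationEventHandler
{
    // Hypothetical indexer name; use whatever is registered in ExamineSettings.config
    private const string IndexerName = "PropertyIndexer";

    private static readonly List<CartesianTierPlotter> Plotters = new List<CartesianTierPlotter>();

    protected override void ApplicationStarted(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
    {
        // Build one plotter per tier, using the same distance range as the data service
        IProjector projector = new SinusoidalProjector();
        var probe = new CartesianTierPlotter(0, projector, CartesianTierPlotter.DefaltFieldPrefix);
        int startTier = probe.BestFit(PropertyIndexDataService.MaxKms);
        int endTier = probe.BestFit(PropertyIndexDataService.MinKms);
        for (int tier = startTier; tier <= endTier; tier++)
        {
            Plotters.Add(new CartesianTierPlotter(tier, projector, CartesianTierPlotter.DefaltFieldPrefix));
        }

        var indexer = ExamineManager.Instance.IndexProviderCollection[IndexerName] as LuceneIndexer;
        if (indexer != null)
        {
            indexer.DocumentWriting += OnDocumentWriting;
        }
    }

    private static void OnDocumentWriting(object sender, DocumentWritingEventArgs e)
    {
        // Assumed field names: the data service must have put these into RowData
        string latRaw, lngRaw;
        if (!e.Fields.TryGetValue("latitude", out latRaw) || !e.Fields.TryGetValue("longitude", out lngRaw))
        {
            return;
        }

        double lat, lng;
        if (!double.TryParse(latRaw, NumberStyles.Float, CultureInfo.InvariantCulture, out lat) ||
            !double.TryParse(lngRaw, NumberStyles.Float, CultureInfo.InvariantCulture, out lng))
        {
            return;
        }

        // Write the tier fields straight onto the Lucene document, so they
        // do not need to be declared under IndexUserFields
        foreach (CartesianTierPlotter plotter in Plotters)
        {
            double boxId = plotter.GetTierBoxId(lat, lng);
            e.Document.Add(new Field(
                plotter.GetTierFieldName(),
                NumericUtils.DoubleToPrefixCoded(boxId),
                Field.Store.YES,
                Field.Index.NOT_ANALYZED_NO_NORMS));
        }
    }
}
```

With this in place the data service only needs to supply name, description, latitude and longitude, and the loop that adds tier fields to rowData can be dropped.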
That has really connected the dots in my understanding and seems to be working nicely, thanks Ismail!
Just one thing: how often does the data change? When you add new data or update existing data you will need to rebuild the index, unless you handle it at the point of change and add or update the individual row.
Hi guys, I'm facing some very strange behaviour: when I set the radius to 10 km I get only 2 results, but when I change the radius to 15 km I get 3 results. Have you come across this?
If you increase the radius I would expect to see more results. For the 3 results you see, are you writing out their distances?
Regards
Ismail
Yes, and every one of them is less than 10 km away. I had places showing for 25 km and disappearing for 50 km... very strange and inconsistent behaviour, I must say. My solution was to get all the places using Examine and filter them myself.
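The "filter them myself" workaround could look roughly like the sketch below: compute the great-circle distance from the search origin to each place with the haversine formula and keep only those inside the radius. The GeoFilter class and its method names are hypothetical, the Earth is treated as a sphere of radius 6371 km, and the sketch assumes the latitude and longitude of each result have already been read back from the Examine fields as doubles.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GeoFilter
{
    // Mean Earth radius in kilometres (spherical approximation)
    private const double EarthRadiusKm = 6371.0;

    // Great-circle distance in kilometres between two points given in degrees
    public static double HaversineKm(double lat1, double lng1, double lat2, double lng2)
    {
        double dLat = ToRadians(lat2 - lat1);
        double dLng = ToRadians(lng2 - lng1);
        double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2)
                 + Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2))
                 * Math.Sin(dLng / 2) * Math.Sin(dLng / 2);
        // Clamp to 1 to guard against rounding pushing the value out of Asin's domain
        return 2 * EarthRadiusKm * Math.Asin(Math.Min(1.0, Math.Sqrt(a)));
    }

    // Returns each item within radiusKm of the origin, paired with its distance, nearest first
    public static List<KeyValuePair<T, double>> WithinRadius<T>(
        IEnumerable<T> items,
        Func<T, double> latitude,
        Func<T, double> longitude,
        double originLat,
        double originLng,
        double radiusKm)
    {
        return items
            .Select(item => new KeyValuePair<T, double>(
                item,
                HaversineKm(originLat, originLng, latitude(item), longitude(item))))
            .Where(pair => pair.Value <= radiusKm)
            .OrderBy(pair => pair.Value)
            .ToList();
    }

    private static double ToRadians(double degrees)
    {
        return degrees * Math.PI / 180.0;
    }
}
```

Because every candidate is measured directly, a place inside a 10 km radius is always also inside a 15 km or 50 km radius. The cost is that all places are loaded on every search, which is fine for a small table but may not scale to a large one.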
Dimitry,
Try rebuilding the index, then try the search again. I suspect it's something to do with each time a new item is added.
Regards
Ismail
Tried that :( nothing changed.