  • Peter Bridger 11 posts 52 karma points
    Aug 17, 2011 @ 14:58
    Peter Bridger
    1

    Lucene with spatial.net

    Thanks to Ismail for his answer on StackOverflow about searching Lucene with spatial queries (geosearching).

    However I'm brand new to Lucene, so I've ordered Lucene in Action in order to learn more about the underlying technology and how it works.

    I've rigged up a simple search using the walkthrough on Umbraco.TV, with the hope of extending it to integrate Spatial.net. However, on the surface, the way it performs the search is completely different to the way the Spatial.net addition for Lucene.net works.

    Does anyone have hands-on experience of creating a datatype in Umbraco that stores spatial data, can be displayed in the front end and back end, and lets Lucene search it using the Spatial.net package?

    If not, it looks like I'm going to have to get my hands dirty with some heavy reading! :)

    Thanks
    Pete

  • Lee Kelleher 3876 posts 14590 karma points MVP 9x admin c-trib
    Aug 18, 2011 @ 10:38
    Lee Kelleher
    0

    Hi Pete,

    I'm no expert in spatial data (I haven't looked at Spatial.net yet either), but I'm curious how it could work with Umbraco. Definitely worth exploring.

    From the data-types perspective, do you know if it's possible to serialize spatial data? e.g. "LINESTRING" & "POLYGON" co-ords? (PostGIS, GML2/GML3 - I googled these)

    I googled around about this and found the SqlGeometryBuilder and SqlGeometry objects. If that didn't cause much overhead, then it would be possible to store the serialized data as Ntext, with the data-type's render control deserializing it.

    Just throwing ideas around...

    Cheers, Lee.

  • Peter Bridger 11 posts 52 karma points
    Sep 05, 2011 @ 13:23
    Peter Bridger
    1

    Thanks for your thoughts Lee

    In the end I decided to store the data in a dedicated table in SQL Server 2008, configured to store spatial data. This way I can easily use the powerful spatial search features built in.

    I've also rigged up event handlers so that whenever a document is added/saved/deleted, the spatial data is automatically extracted and passed into the spatial table in SQL Server.

    Pete

  • Ismail Mayat 4288 posts 9247 karma points MVP 2x admin c-trib
    Nov 03, 2011 @ 11:41
    Ismail Mayat
    0

    Guys,

    Found this test class in Spatial.net (https://svn.apache.org/repos/asf/incubator/lucene.net/tags/Lucene.Net_2_9_1/contrib/Spatial.Net/Tests/TestCartesian.cs), which covers how to index and search using spatial data. You could use the Google Maps datatype to retrieve the geocodes and store them in the document in Umbraco. Then, on publish, tap into Examine's GatheringNodeData event and index according to the code in the test class. That will give you data in the index that you can filter and sort on.

    Regards

    Ismail

  • Drew Garratt 44 posts 192 karma points
    Mar 26, 2013 @ 15:53
    Drew Garratt
    0

    Hi Guys,

    I don't know if anyone is still looking at this but I'd rather not post a new forum thread if this lies unsolved.

    I'm looking to implement spatial searching using Lucene to plug into an existing Examine-based search routine. I've already built a gathering event to index node relationship data into our index, and we already have Google providing lat/long data that is stored against our nodes.

    The reading I've done around this, however, has really had me scratching my head. The best walkthrough of the process I've found is here: http://www.mhaller.de/archives/156-Spatial-search-with-Lucene.html

    But taking it from a Java context to something we'd write in a gathering event, and indeed working out what the eventual raw Lucene query would look like, is still proving tricky.

    The only other issue I can see is that, to take advantage of Lucene.net spatial (https://nuget.org/packages/Lucene.Net.Contrib.Spatial), I'll have to recompile Umbraco Examine against the latest version of Lucene.net.

    Has anyone else actually tried all this before?

  • Ismail Mayat 4288 posts 9247 karma points MVP 2x admin c-trib
    Mar 26, 2013 @ 16:10
    Ismail Mayat
    0

    Drew,

    Spatial works with Lucene.Net 2.9.1 and Examine uses 2.9.4, so it should work without having to rebuild Examine?

    Regards

    Ismail

  • Drew Garratt 44 posts 192 karma points
    Mar 26, 2013 @ 16:17
    Drew Garratt
    0

    Hi Ismail,

    That sounds like a much less troublesome option. On NuGet, https://nuget.org/packages/Lucene.Net.Contrib.Spatial/3.0.3 seems to call for a dependency on Lucene.Net > 3.0.3

    Is there an older version of Lucene.Net.Contrib.Spatial out there, or am I simply looking at the wrong Spatial library?

    Thanks =)

  • Ismail Mayat 4288 posts 9247 karma points MVP 2x admin c-trib
    Mar 26, 2013 @ 17:22
  • Drew Garratt 44 posts 192 karma points
    May 07, 2013 @ 12:20
    Drew Garratt
    1

    Hi Ismail,

    Had to down tools on this problem to concentrate on the gathering method for geolocation. But in the intervening time it would seem that Gary H over at Leaping Gorilla has given us a tantalising walkthrough of putting this into action.

    See here http://leapinggorilla.com/Blog/Read/1010/spatial-search-in-lucenenet---worked-example

    Along with a pretty descriptive walkthrough, he also helped with one of the more basic steps I had struggled with: just pulling in all the necessary libraries. For those also looking, the following NuGet commands will solve that little problem.

    Install-Package Lucene.Net -Version 2.9.4.1
    Install-Package Lucene.Net.Contrib -Version 2.9.4.1

    My remaining head scratch is around how to implement this within an ExamineManager GatheringNodeData event.

    It's easy enough to grab the fields containing our lat and long and create our Cartesian plotter. But once the plotter data is created, what's the best way to pass it into IndexingNodeDataEventArgs?

    Obviously will pop up my code as I progress =)

  • Ismail Mayat 4288 posts 9247 karma points MVP 2x admin c-trib
    May 07, 2013 @ 12:31
    Ismail Mayat
    0

    Drew,

    Excellent find on that article. With regards to getting it into the index, don't use GatheringNodeData. You need lower-level Lucene access, so use the document writing event instead, which gives you access to the Lucene document itself. See http://thecogworks.co.uk/blog/posts/2013/april/examiness-hints-and-tips-from-the-trenches-part-10-document-writing-redux/ and then take the code from the worked example and use it in the document writing event. Hit me up on Skype if you've got any questions: ismail_mayat
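
    For anyone following along, wiring up that event might look roughly like the sketch below. It assumes Examine's DocumentWriting event and DocumentWritingEventArgs (with Document and Fields members); SpatialIndexEvents, Register, and the "geolat" / "geolat_raw" field names are illustrative placeholders, not code from the article.

    ```csharp
    using Examine;
    using Examine.LuceneEngine;
    using Lucene.Net.Documents;
    using UmbracoExamine;

    public static class SpatialIndexEvents
    {
        // Hypothetical helper: call once at application start with your own index name.
        public static void Register(string indexName)
        {
            var indexer = (UmbracoContentIndexer)ExamineManager.Instance.IndexProviderCollection[indexName];
            indexer.DocumentWriting += OnDocumentWriting;
        }

        // Runs just before the Lucene document is written, so raw Lucene fields can be added.
        private static void OnDocumentWriting(object sender, DocumentWritingEventArgs e)
        {
            string lat;
            if (e.Fields.TryGetValue("geolat", out lat))
            {
                // Placeholder field: the spatial tier fields from the worked example would go here.
                e.Document.Add(new Field("geolat_raw", lat, Field.Store.YES, Field.Index.NOT_ANALYZED));
            }
        }
    }
    ```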

    Regards

    Ismail

  • Drew Garratt 44 posts 192 karma points
    May 10, 2013 @ 11:33
    Drew Garratt
    1

    Hi Ismail,

    Think I might end up asking for your expert opinion but thought I should pop my progress up here, I do believe I'm half way there.

    Thanks to your article on thecogworks I was able to add a second event handler to my Examine extending class to give me lower-level access to the index.

    To begin with, I defined the variables needed to set up the indexer: my index name, the maximum and minimum radius, my plotter dictionary, and the location prefix for the indexed fields.

    private const string TheIndex = "TheIndexer";
    private const double MaxM = 5000;
    private const double MinM = 1;
    private static int _startTier;
    private static int _endTier;
    private static Dictionary<int, CartesianTierPlotter> Plotters { get; set; }
    public const string LocationTierPrefix = "LocationTierPrefix_";

    Next, in OnApplicationStarted, I initialised my projector and built my location tiers.

    IProjector projector = new SinusoidalProjector();
    var ctp = new CartesianTierPlotter(0, projector,LocationTierPrefix);
    _startTier = ctp.BestFit(MaxM);
    _endTier = ctp.BestFit(MinM);
    
    Plotters = new Dictionary<int, CartesianTierPlotter>();
    for (var tier = _startTier; tier <= _endTier; tier++)
    {
        Plotters.Add(tier, new CartesianTierPlotter(tier, projector, LocationTierPrefix));
    }
    
    var indexer = (UmbracoContentIndexer)ExamineManager.Instance.IndexProviderCollection[TheIndex];
    indexer.DocumentW

    Finally, the event itself is fairly simple. We define containers for our latitude and longitude values before testing for their presence in the document being indexed.
    If they are present, we work through our defined tiers using these lat/long values, encoding the lat and long values first and then adding tiers as we go.

    string _geolat;
    string _geolong;
    if (e.Fields.TryGetValue("geolat", out _geolat) && e.Fields.TryGetValue("geolong", out _geolong))
    {
        e.Document.Add(new Field("codedlat", NumericUtils.DoubleToPrefixCoded(Convert.ToDouble(_geolat)), Field.Store.YES, Field.Index.NOT_ANALYZED));
        e.Document.Add(new Field("codedlang", NumericUtils.DoubleToPrefixCoded(Convert.ToDouble(_geolong)), Field.Store.YES, Field.Index.NOT_ANALYZED));
        for (var tier = _startTier; tier <= _endTier; tier++)
        {
            var ctp = Plotters[tier];
            var boxId = ctp.GetTierBoxId(Convert.ToDouble(_geolat), Convert.ToDouble(_geolong));
            e.Document.Add(new Field(ctp.GetTierFieldName(),
                            NumericUtils.DoubleToPrefixCoded(boxId),
                            Field.Store.YES,
                            Field.Index.NOT_ANALYZED_NO_NORMS));
        }
    }

    This results in what I would expect to see in the index

    Top ranking terms

    No. Rank Field Text
    1 2 LocationTierPrefix_11 ?ntC
    2 2 LocationTierPrefix_12 ?g?tC
    3 2 LocationTierPrefix_13 ?_?tC
    4 2 LocationTierPrefix_14 ?WO~mB(
    5 2 LocationTierPrefix_2 @Lf3Le
    6 2 LocationTierPrefix_10 ?urr#Q'
    7 2 LocationTierPrefix_4 @ua#kBGV
    8 2 LocationTierPrefix_5 @ua#kBGV
    9 2 LocationTierPrefix_6 @ua#kBGV
    10 2 LocationTierPrefix_7 @~|vd-
    11 2 LocationTierPrefix_8 ?>;2CJ
    12 2 LocationTierPrefix_9 ?{~|vd-
    13 2 LocationTierPrefix_3 @Lf3Le
    14 1 codedlat @%a533
    15 1 codedlang @=$2i.c

    Now comes the tricky part of getting a BooleanQuery into my Examine search <.<

  • Ismail Mayat 4288 posts 9247 karma points MVP 2x admin c-trib
    May 10, 2013 @ 11:45
    Ismail Mayat
    0

    Drew,

    What about the standard Examine boolean Or and And operators, do they not work?

    Regards

    Ismail

  • Drew Garratt 44 posts 192 karma points
    May 10, 2013 @ 12:04
    Drew Garratt
    0

    Hi Ismail,

    So far, my first attempt at creating a searcher based on the examples I've looked at isn't throwing an error, but it isn't returning results either (not a great sign).

    My first stumbling block was that, up until this point, I'd used a StringBuilder whose output was finally passed to criteria = criteria.RawQuery(stringbuilder) in order to search the index.

    Following Leaping Gorilla's example I built a

    var masterQuery = new BooleanQuery();

    From there I grabbed the coordinates from the query string and created a distance filter with a giant radius (just to try and capture results)

    if (!string.IsNullOrEmpty(geoTerm))
    {
        string[] coordinates = geoTerm.Split(',');
    
        double _lat = Convert.ToDouble(coordinates[0]);
        double _long = Convert.ToDouble(coordinates[1]);
        double _radius = 4000;
        /*  Builder allows us to build a polygon which we will use to limit  
        * search scope on our cartesian tiers, this is like putting a grid 
        * over a map */
        var builder = new CartesianPolyFilterBuilder(LocationTierPrefix);
    
        /*  Bounding area draws the polygon, this can be thought of as working  
        * out which squares of the grid over a map to search */
        var boundingArea = builder.GetBoundingArea(_lat, _long, _radius);
    
        /*  We refine, this is the equivalent of drawing a circle on the map,  
        *  within our grid squares, ignoring the parts the squares we are  
        *  searching that aren't within the circle - ignoring extraneous corners 
        *  and such */
        var distFilter = new LatLongDistanceFilter(boundingArea,
                            _radius,
                            _lat,
                            _long,
                            "codedlat",
                            "codedlong");
    
        /*  Add our filter, this will stream through our results and determine eligibility */
        masterQuery.Add(new ConstantScoreQuery(distFilter), BooleanClause.Occur.MUST);
    }
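
    One thing worth guarding against in the snippet above: Convert.ToDouble uses the server's current culture, so "51.5" may parse differently (or throw) on a machine whose decimal separator is a comma. Below is a minimal sketch of a stricter parse. GeoTerm and TryParse are hypothetical names for illustration, and the "lat,long" input format is assumed from the Split(',') call.

    ```csharp
    using System.Globalization;

    public static class GeoTerm
    {
        // Parses "lat,long" with the invariant culture.
        // Returns false on malformed input or out-of-range coordinates.
        public static bool TryParse(string geoTerm, out double lat, out double lng)
        {
            lat = 0;
            lng = 0;
            if (string.IsNullOrEmpty(geoTerm)) return false;

            string[] parts = geoTerm.Split(',');
            if (parts.Length != 2) return false;

            const NumberStyles style = NumberStyles.Float;
            if (!double.TryParse(parts[0].Trim(), style, CultureInfo.InvariantCulture, out lat)) return false;
            if (!double.TryParse(parts[1].Trim(), style, CultureInfo.InvariantCulture, out lng)) return false;

            return lat >= -90 && lat <= 90 && lng >= -180 && lng <= 180;
        }
    }
    ```

    With this in place the if block would start with GeoTerm.TryParse(geoTerm, out _lat, out _long) instead of the Split and Convert.ToDouble calls.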

    Passing the masterQuery into the SearchProviderCollection Search meant converting that BooleanQuery into an ISearchCriteria by passing its ToString() output to RawQuery:

    Examine.SearchCriteria.ISearchCriteria criteria = ExamineManager.Instance.CreateSearchCriteria(BooleanOperation.And);
    
    criteria = criteria.RawQuery(masterQuery.ToString());

    But so far no luck getting results out the other end. There is quite a lot going on in the middle here and I'm having trouble spotting what I've missed. I'd be more comfortable if I wasn't hopping between BooleanQuery and ISearchCriteria.

  • Drew Garratt 44 posts 192 karma points
    May 14, 2013 @ 10:51
    Drew Garratt
    0

    Morning,

    More progress, although I'm not sure it's quite finished. Looking through a few Cogworks posts I spotted an example of bypassing Examine altogether in favour of using a pure Lucene search. This looks as if it's going to be the better solution, as there will be no need to pass or convert the constant score query.

        string indexRootPath = "~/App_Data/TEMP/ExamineIndexes/";

        // Define index name
        string indexName = "Test";

        string indexPath = indexRootPath + indexName + "/Index";

        // Open the Examine index folder directly with Lucene
        var indexDirectory = FSDirectory.Open(new DirectoryInfo(HttpContext.Current.Server.MapPath(indexPath)));

        Lucene.Net.Search.IndexSearcher searcher = new Lucene.Net.Search.IndexSearcher(indexDirectory);

        // Run the query built earlier (masterQuery) against the raw Lucene searcher
        Lucene.Net.Search.TopDocs results = searcher.Search(masterQuery, null, searcher.MaxDoc());
    }
    
    <div>
    @foreach (ScoreDoc scoreDoc in results.ScoreDocs)
    {
        Document doc = searcher.Doc(scoreDoc.doc);
        string myFieldVale = doc.Get("nodeName");
        <p>@myFieldVale</p>
    }
    </div>
    
    

    So essentially I created a few short strings to help define the index location, mapped the location and created a new Lucene index searcher based on that path.

    Running the earlier posted radius search through this searcher does indeed return good results, which is a significant step up from the null returned with Examine.

    My remaining issues stem from erratic results when testing the distance of my search. At large distances results seem to be returned reasonably accurately, but at distances of under 10 miles some results appear to be missing.
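
    One way to tell whether the index or the query is at fault is to compute the great-circle distance independently and compare it with what the filter lets through. The sketch below uses the standard haversine formula. GeoMath is a hypothetical helper, not part of Lucene.Net, and both the mean Earth radius of 3958.8 miles and the use of miles as the unit are assumptions.

    ```csharp
    using System;

    public static class GeoMath
    {
        private const double EarthRadiusMiles = 3958.8; // assumed mean radius

        // Great-circle distance in miles between two points given in decimal degrees.
        public static double HaversineMiles(double lat1, double lon1, double lat2, double lon2)
        {
            double dLat = ToRadians(lat2 - lat1);
            double dLon = ToRadians(lon2 - lon1);
            double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2)
                     + Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2))
                     * Math.Sin(dLon / 2) * Math.Sin(dLon / 2);
            // Clamp to 1.0 so rounding error cannot push Asin out of its domain.
            return 2 * EarthRadiusMiles * Math.Asin(Math.Min(1.0, Math.Sqrt(a)));
        }

        private static double ToRadians(double degrees)
        {
            return degrees * Math.PI / 180.0;
        }
    }
    ```

    Running every indexed node through HaversineMiles against the search origin gives a reference list to compare with the Lucene results at each radius.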

  • Drew Garratt 44 posts 192 karma points
    May 14, 2013 @ 12:41
    Drew Garratt
    0

    One day I am certain to look back at this dyslexic mistake and laugh...

    Everything is now working as intended with one tiny tweak. In my earlier example document index writer I made a mistake when adding the prefix-coded value to my index. Rather than adding codedlong to the index I accidentally added codedlang. This meant that when running my search I always got a longitude of 0, which rather skewed results.
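
    In hindsight, keeping the field names in one place shared by the indexer and the search would have made this typo much harder to make. Something like the following, where SpatialFields is just an illustrative name and the values are the ones used in this thread:

    ```csharp
    public static class SpatialFields
    {
        // Single source of truth for the field names used at index time and at search time.
        public const string Latitude = "codedlat";
        public const string Longitude = "codedlong";
        public const string TierPrefix = "LocationTierPrefix_";
    }
    ```

    Both the Field constructors in the document writing handler and the LatLongDistanceFilter constructor would then take SpatialFields.Latitude and SpatialFields.Longitude rather than string literals.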

    With all of this resolved I now have a working spatial search using Lucene and a custom document index writer.

    Would have been great to do this through Examine but it's just too restrictive.

    Oh well on to faceted search =)

  • Josh Wheelock 2 posts 22 karma points
    Jan 21, 2014 @ 18:30
    Josh Wheelock
    0

    Hi Drew,

    I've been working on a test that integrates the Leaping Gorilla example too, but I'm not getting any results. Any chance of you posting some code, or making it available for download somewhere?

  • Karl Kopp 121 posts 226 karma points
    May 21, 2014 @ 04:35
    Karl Kopp
    0

    Hi Drew,

    +1 for some sample code :)

  • Karl Kopp 121 posts 226 karma points
    May 21, 2014 @ 12:23
    Karl Kopp
    0

    OK - I got this working with Lucene 2.9.4.x, the one bundled with the latest Umbraco build (7.1.3). Hit me up if u need some code... I'll work on a blog post too, specifically for Umbraco...

  • Brendan McKenzie 3 posts 23 karma points
    Dec 03, 2014 @ 01:42
    Brendan McKenzie
    0

    Hey Karl.

    Did you ever get around to writing that blog post? I'm having trouble integrating the Leaping Gorilla post with Examine as well. Would love to see how you solved it.

    Cheers. Brendan

  • Karl Kopp 121 posts 226 karma points
    Dec 03, 2014 @ 01:48
    Karl Kopp
    0

    Hey Brendan - specifically which bits you having problems with? Do you get the Lucene index created with the relevant data (latCoded, lonCoded, LocationTierPrefix_1 - 15 etc)? Is it trouble pulling results from the index?

    Let me know, happy to help...

    Cheers!

  • Brendan McKenzie 3 posts 23 karma points
    Dec 03, 2014 @ 01:58
    Brendan McKenzie
    0

    Just querying the data out. I've managed to get the documents indexed just fine, and the lat/long/locationTier fields are all present.

  • Karl Kopp 121 posts 226 karma points
    Dec 03, 2014 @ 02:04
    Karl Kopp
    1

    The crux of the search is here. I grab the lat/lon of the user (using JavaScript location services) and search like this:

    if (double.TryParse(Request.QueryString["lat"], out _lat) && double.TryParse(Request.QueryString["lon"], out _long))
    {
    
        string indexPath = "~/App_Data/TEMP/ExamineIndexes/Venue/Index";
    
        var indexDirectory = FSDirectory.Open(new DirectoryInfo(HttpContext.Current.Server.MapPath(indexPath)));
    
        Lucene.Net.Search.IndexSearcher searcher = new Lucene.Net.Search.IndexSearcher(indexDirectory);
    
        double KmsToMiles = 0.621371192;
        double _radius = 300 * KmsToMiles;
    
        /*  Builder allows us to build a polygon which we will use to limit
        * search scope on our cartesian tiers, this is like putting a grid
        * over a map */
        var builder = new CartesianPolyFilterBuilder("LocationTierPrefix_");
    
        /*  Bounding area draws the polygon, this can be thought of as working
        * out which squares of the grid over a map to search */
        var boundingArea = builder.GetBoundingArea(_lat, _long, _radius);
    
        /*  We refine, this is the equivalent of drawing a circle on the map,
        *  within our grid squares, ignoring the parts the squares we are
        *  searching that aren't within the circle - ignoring extraneous corners
        *  and such */
        var distFilter = new LatLongDistanceFilter(boundingArea,
                            _radius,
                            _lat,
                            _long,
                            "latCoded",
                            "lonCoded");
    
        var masterQuery = new BooleanQuery();
    
        /*  Add our filter, this will stream through our results and determine eligibility */
        masterQuery.Add(new ConstantScoreQuery(distFilter), BooleanClause.Occur.MUST);
    
    
        // Run the search, then project each hit to a result carrying its distance in km
        Lucene.Net.Search.TopDocs results = searcher.Search(masterQuery, null, searcher.MaxDoc());
        sortedResults = results.ScoreDocs
            .Select(sd => new Helper.LocationSearchResult(
                sd.score,
                int.Parse(searcher.Doc(sd.doc).GetField("id").StringValue()),
                distFilter.GetDistance(sd.doc) / KmsToMiles))
            .OrderBy(x => x.DistanceInKms)
            .Take(30)
            .ToList();
    }
    

    What is the exact error you are getting?

  • Brendan McKenzie 3 posts 23 karma points
    Dec 03, 2014 @ 02:34
    Brendan McKenzie
    0

    I wasn't getting an error. I was just hoping to query through Examine directly without having to access the underlying Lucene IndexSearcher.

    Thanks for that. I guess I'll just have to bite the bullet and get it done that way.

    Cheers. Brendan

  • Karl Kopp 121 posts 226 karma points
    Dec 03, 2014 @ 03:46
    Karl Kopp
    0

    It's only a dozen lines of code, so it's not too bad. I tried everything to get it working through Examine, but no love. I believe when they update Lucene to 3.x it will be easier, so pls vote for this ticket as well :)
