
  • Simon steed 376 posts 688 karma points
    Aug 22, 2011 @ 15:12

    Tales of Woe with Examine Wildcard Searches

    I need help with this one; I'm really struggling to get it working.

    I've got a site with articles. I want to let the user search by keyword or by phrase, so:

    Searching for How would pull back articles with How in the title, body, etc.

    Searching for How To would pull back articles with How To in the title, body, etc.

    To this end I've set up the following:

    ExamineIndex.config:

    <ExamineLuceneIndexSets>
      <IndexSet SetName="KBIndexSet"
                IndexPath="~/App_Data/ExamineIndexes/KBSearcher/"
                IndexParentId="1640">
        <IndexAttributeFields>
          <add Name="id" EnableSorting="true" Type="Number" />
          <add Name="nodeName" EnableSorting="true" />
          <add Name="updateDate" EnableSorting="true" Type="DateTime" />
          <add Name="createDate" EnableSorting="true" Type="DateTime" />
          <add Name="writerName" />
          <add Name="loginName" />
          <add Name="email" />
          <add Name="nodeTypeAlias" />
          <add Name="updateDate" />
          <add Name="createDate" />
          <add Name="pageID" />
        </IndexAttributeFields>
        <IndexUserFields>
          <add Name="pageTitle" />
          <add Name="bodyText" />
          <add Name="search" />
          <add Name="articleTitle" />
          <add Name="articleType" />
          <add Name="product" />
          <add Name="category" />
          <add Name="symptoms" />
          <add Name="purpose" />
          <add Name="resolution" />
          <add Name="additionalinformation" />
          <add Name="keywordsToSearchUpon" />
        </IndexUserFields>
        <IncludeNodeTypes/>
        <ExcludeNodeTypes />
      </IndexSet>

      <!-- The internal index set used by Umbraco back-office - DO NOT REMOVE -->
      <IndexSet SetName="InternalIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/Internal/">
        <IndexAttributeFields>
          <add Name="id" />
          <add Name="nodeName" />
          <add Name="updateDate" />
          <add Name="writerName" />
          <add Name="path" />
          <add Name="nodeTypeAlias" />
          <add Name="parentID" />
        </IndexAttributeFields>
        <IndexUserFields />
        <IncludeNodeTypes/>
        <ExcludeNodeTypes />
      </IndexSet>
      <!-- Removed internal umbraco indexer for clarity of post -->
    </ExamineLuceneIndexSets>

    ExamineSettings.config:

    <ExamineSearchProviders defaultProvider="InternalSearcher">
      <providers>
        <add name="KBSearcher"
             runAsync="true"
             type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
             analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"
             enableLeadingWildcards="true"
             indexSet="KBIndexSet" />

        <add name="InternalSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
             analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net"/>

        <add name="InternalMemberSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
             analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"
             enableLeadingWildcards="true"/>
      </providers>
    </ExamineSearchProviders>


    In my class, I've got the following array of fields to search:

    private string[] searchArrayGeneral = new string[] { "articleTitle", "articleType", "symptoms", "nodeName", "pageTitle", "bodyText", "search", "product", "category", "purpose", "resolution", "additionalinformation", "keywordsToSearchUpon"};

    The method called to generate the results:

    private void SearchSiteGeneral()
    {
        var criteria = ExamineManager.Instance
            .SearchProviderCollection["KBSearcher"]
            .CreateSearchCriteria(UmbracoExamine.IndexTypes.Content);

        var filter = criteria
            .GroupedOr(searchArrayGeneral, SearchTerm.MultipleCharacterWildcard())
            .Not()
            .Field("umbracoNaviHide", "1")
            .Compile();

        SearchResults = ExamineManager.Instance.SearchProviderCollection["KBSearcher"].Search(filter);
        SearchResultListing.DataSource = SearchResults;
        SearchResultListing.DataBind();
    }

    Output string from the filter when searching for 'How To':

    { SearchIndexType: content, LuceneQuery: +(+(articleTitle:how to* articleType:how to* symptoms:how to* nodeName:how to* pageTitle:how to* bodyText:how to* search:how to* product:how to* category:how to* purpose:how to* resolution:how to* additionalinformation:how to* keywordsToSearchUpon:how to*) -umbracoNaviHide:1) +__IndexType:content }

    Output string from the filter when searching for 'How':

    { SearchIndexType: content, LuceneQuery: +(+(articleTitle:how* articleType:how* symptoms:how* nodeName:how* pageTitle:how* bodyText:how* search:how* product:how* category:how* purpose:how* resolution:how* additionalinformation:how* keywordsToSearchUpon:how*) -umbracoNaviHide:1) +__IndexType:content }

     

    The 'How' search works and pulls back results, but 'How To' does not pull back any results at all. There is a stack of articles called How To, so this should be a valid search.

    Any ideas? I'm completely stumped.

    Cheers

    Si

  • Mike Taylor 155 posts 353 karma points
    Aug 22, 2011 @ 16:10

    Hi there

    I've been doing something similar on a recent job, and I ended up writing the Lucene syntax myself and passing it to the RawQuery function rather than using the API chaining (largely because the precedence of chained calls seems ambiguous to me).

    Phrases need to be in quotes, but I don't believe you can do wildcard searches on phrases - see http://lucene.apache.org/java/2_4_0/queryparsersyntax.html

    So, to do a search for "How to", your syntax would need to be something like this:

    +( (+articleTitle:"How to") (+articleType:"How to") (+symptoms:"How to") ...etc... ) -umbracoNaviHide:1
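    A minimal sketch of building that kind of raw query string in C#, assuming the string is then handed to RawQuery as described above. The class and method names (PhraseQueryBuilder, Build) are hypothetical, it uses only plain string handling (no Examine calls), and it writes one optional clause per field inside a required group, which is a slightly simplified form of the syntax above. It has not been checked against this particular index.

```csharp
using System;
using System.Linq;

public static class PhraseQueryBuilder
{
    // Hypothetical helper: builds a raw Lucene query that requires the
    // quoted phrase to appear in at least one of the given fields.
    public static string Build(string[] fields, string phrase)
    {
        // Escape backslashes and double quotes so the phrase stays inside its quotes.
        string escaped = phrase.Replace("\\", "\\\\").Replace("\"", "\\\"");

        // One clause per field, e.g. articleTitle:"How to"
        string[] clauses = fields
            .Select(f => f + ":\"" + escaped + "\"")
            .ToArray();

        // The leading + makes the group mandatory; the - clause excludes hidden nodes.
        return "+(" + string.Join(" ", clauses) + ") -umbracoNaviHide:1";
    }

    public static void Main()
    {
        var fields = new[] { "articleTitle", "bodyText" };
        // Prints: +(articleTitle:"How to" bodyText:"How to") -umbracoNaviHide:1
        Console.WriteLine(Build(fields, "How to"));
    }
}
```

    Building the string by hand keeps the precedence explicit, which is the main reason for preferring a raw query over chained calls here.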

    Does that help at all?

    Mike

  • Simon steed 376 posts 688 karma points
    Aug 22, 2011 @ 16:20

    Thanks for the reply Mike

    Alas no, that does not seem to work. I tried both single and double quotes, but no results were returned for either search phrase. I'm sure it can be done, as I've read a few posts about it; the problem is that I've not yet found a complete implementation that actually works, only snippets of two lines or so with lots of stuff missing.

    Si

  • Simon steed 376 posts 688 karma points
    Aug 23, 2011 @ 12:43

    Anyone else got any ideas on how to implement this?

  • Simon steed 376 posts 688 karma points
    Aug 23, 2011 @ 13:05

    Sorted - after googling Lucene search syntax, I found a reference to proximity searching, and the following works:

    SearchTerm.Proximity(3) - matches when the words of the phrase appear within 3 words of each other :-)
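    For reference, Lucene's query syntax writes a proximity search as a quoted phrase followed by ~ and a distance, e.g. "how to"~3. The sketch below builds that form as a raw query string for several fields. The names (ProximityQueryBuilder, Build) are hypothetical, and it is an assumption, not something confirmed here, that SearchTerm.Proximity(3) produces an equivalent query inside Examine.

```csharp
using System;
using System.Linq;

public static class ProximityQueryBuilder
{
    // Hypothetical helper: builds a raw Lucene proximity query, i.e. a quoted
    // phrase followed by ~N, for each of the given fields.
    public static string Build(string[] fields, string phrase, int distance)
    {
        // Escape backslashes and double quotes so the phrase stays inside its quotes.
        string escaped = phrase.Replace("\\", "\\\\").Replace("\"", "\\\"");

        // One clause per field, e.g. articleTitle:"How to"~3
        string[] clauses = fields
            .Select(f => f + ":\"" + escaped + "\"~" + distance)
            .ToArray();

        // The leading + makes the group mandatory; the - clause excludes hidden nodes.
        return "+(" + string.Join(" ", clauses) + ") -umbracoNaviHide:1";
    }

    public static void Main()
    {
        var fields = new[] { "articleTitle", "bodyText" };
        // Prints: +(articleTitle:"How to"~3 bodyText:"How to"~3) -umbracoNaviHide:1
        Console.WriteLine(Build(fields, "How to", 3));
    }
}
```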

    Si
