
  • Sebastian Dammark 583 posts 1407 karma points
    Sep 23, 2015 @ 20:00
    Sebastian Dammark
    0

    Index contains no chinese nodes in frontend

    I have a website in three languages: English, German and Chinese.

    The website has one index that handles all search requests.

    My issue is that English and German both work fine, but Chinese gives no hits. The funny thing is that the same search in the backend does give hits.

    In the frontend I search like this:

    The querystring looks like this: globalsearch?query=[chinese characters] (for some reason the forum removes the Chinese characters)

    <xsl:variable name="query" select="umb:RequestQueryString('query')" />
    <xsl:variable name="result" select="ex:Search($query)" />
    

    And the output is

    <error>There were no search results.</error>
    

    But the same query in backend in the ExternalSearcher gives me 3 hits.

    Is there any logical explanation for this?

  • Chriztian Steinmeier 2800 posts 8791 karma points MVP 8x admin c-trib
    Sep 30, 2015 @ 08:59
    Chriztian Steinmeier
    1

    Hi Sebastian,

    If the ex:Search() extension is using the same search index etc. as the backend (I have no idea if that is in fact the case) - my first check would be to see if the Chinese nodes are in fact published? (XSLT can only "see" published content). Next up, I'd try to find out if the extension is written with limited encoding support (unintentionally, probably), but that's a little harder to figure out...
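
    One way to test the encoding theory is to echo the raw query back to the page before it reaches ex:Search(). A minimal debugging sketch, assuming the fragment sits inside the existing template and that the umb prefix is already bound to umbraco.library (urn:umbraco.library), which the original snippet implies:

    ```xml
    <!-- Debug fragment: shows what the XSLT actually receives from the querystring.
         Assumes the umb prefix is already declared on the stylesheet. -->
    <xsl:variable name="query" select="umb:RequestQueryString('query')" />
    <p>
      <xsl:text>Raw query: [</xsl:text>
      <xsl:value-of select="$query" />
      <xsl:text>] length: </xsl:text>
      <xsl:value-of select="string-length($query)" />
    </p>
    ```

    If the brackets show question marks, an empty string, or percent-encoded text rather than the Chinese phrase, the problem is upstream of the search extension.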

    /Chriztian

  • Douglas Robar 3570 posts 4711 karma points MVP ∞ admin c-trib
    Sep 30, 2015 @ 10:16
    Douglas Robar
    2

    Hi, Sebastian,

    Check the index itself to see if it has any of the Chinese in there. I suspect it doesn't. At least, not the index for the public website (rather than the backoffice... they are different indexes).

    1. Go to the Developer section.
    2. Select the Examine Management tab.
    3. Expand the Searchers > InternalSearcher item
    4. Search for some Chinese. I bet you find it. This is the index used by the back office.
    5. Expand the Searchers > ExternalSearcher item
    6. Search for the same Chinese. Do you find it, or is it missing? This is the index used by the website for searching.

    If it is there in the Internal index but not the External index you'll want to look at the settings for the indexer. These resources will help: https://our.umbraco.org/search?q=examine&cat=documentation

    Let us know what you find out!

    cheers,
    doug.

  • Sebastian Dammark 583 posts 1407 karma points
    Sep 30, 2015 @ 11:42
    Sebastian Dammark
    0

    Hi Doug

    The internal searcher gives me 1 hit and the external searcher gives me 3 hits, when I search on the same chinese phrase.

    But in frontend ex:Search() returns 0 hits, that's what really puzzles me.

  • Douglas Robar 3570 posts 4711 karma points MVP ∞ admin c-trib
    Sep 30, 2015 @ 11:48
    Douglas Robar
    0

    At the risk of offending our XSLT preferences... you might give ezSearch a try. It is like XSLTsearch only better because it uses Examine (and has limited media searching as well, if you want it).

    https://our.umbraco.org/projects/website-utilities/ezsearch/

    It is easy to extend and debug with Visual Studio if you want to set a breakpoint and figure out what is (or isn't) getting to your query and what is (or isn't) coming back.

    It installs a doctype, macro partial file, template, and macro. All are called ezSearch so it shouldn't get in the way of anything else in your site. Though testing on a copy of the site is always prudent :)

    It might be a way to compare if ex:Search() is doing what you think it's doing. ezSearch should return the same thing as you get via the Developer section.

    cheers,
    doug.

  • Sebastian Dammark 583 posts 1407 karma points
    Oct 13, 2015 @ 08:47
    Sebastian Dammark
    0

    I finally had time to try out ezSearch.

    Unfortunately it also returns 0 hits in the frontend.

    So right now it looks like Umbraco doesn't support searching on Chinese characters out of the box.

  • Douglas Robar 3570 posts 4711 karma points MVP ∞ admin c-trib
    Oct 15, 2015 @ 13:08
    Douglas Robar
    0

    Hi, Sebastian,

    I just tried making a simple page in a 7.3.0 site. I named the page "This is a test Chinese page" so I could find it easily in the index. It was in both the internal and external indexes immediately, as seen in the Examine Management.

    In the bodyText property of the page (a richtext editor in my case) I added a google translation of "This is a test" in both simplified and traditional Chinese, according to https://translate.google.co.uk/?ie=UTF-8&hl=en&client=tw-ob#en/zh-CN/This%20is%20a%20test

    Saved and published the page and again the characters were in the indexes as expected.

    And ezSearch had no problem at all finding them as you'd expect. For that matter, any lucene/examine search would find them.

    I wonder, do you have some odd Analyzer setting that is removing Chinese characters from your indexes? I'm using the default settings, which uses the WhitespaceAnalyzer.
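
    For reference, the analyzer is set per provider in config/ExamineSettings.config. A trimmed sketch of how the two content searchers look in a stock Umbraco 7 install, written from memory, so treat the exact attributes as an assumption and compare against your own file:

    ```xml
    <!-- Trimmed sketch of config/ExamineSettings.config (search providers only).
         Written from memory of a default Umbraco 7 install; verify against your own file. -->
    <ExamineSearchProviders defaultProvider="ExternalSearcher">
      <providers>
        <!-- Back-office searcher: analyzer set explicitly -->
        <add name="InternalSearcher"
             type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
             analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net" />
        <!-- Website searcher: no analyzer attribute, so the provider's default applies -->
        <add name="ExternalSearcher"
             type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine" />
      </providers>
    </ExamineSearchProviders>
    ```

    An analyzer attribute that differs between the indexer and the searcher for the same index would be the first thing to look for.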

    cheers,
    doug.

  • Sebastian Dammark 583 posts 1407 karma points
    Oct 21, 2015 @ 09:26
    Sebastian Dammark
    0

    Very weird. I haven't changed anything in the config files, so they should be pretty much default.

    This is how it looks in frontend (attachment 1). In the textarea you can "see" the raw output of:

    ex:Search(umb:RequestQueryString('query'))
    

    And backend (attachment 2).

    Could I have a look at your config files, just to see if something is messed up?

  • Douglas Robar 3570 posts 4711 karma points MVP ∞ admin c-trib
    Oct 22, 2015 @ 13:03
    Douglas Robar
    0

    I notice that you don't have a 'bodyText' field, which I do. I'm not an expert on the ex:Search() extension but I wonder... what fields in the index are being searched?

    At least with ezSearch, you can (and should!) specify the fields to search in, though the default for ezSearch is the nodeName, metaKeywords, metaDescription, and bodyText fields. Which is why I was able to search for the foreign characters.

    For your situation it looks as though those fields either don't exist or aren't relevant and that you'd want to specifically search in the metaKeywords, metaDescription, metaTitle, navigationTitle and header, at least given the data for these two nodes.

    For a quick test... make a new node with a nodeName of something in Chinese. Then search for that. I bet it's found even though it's in Chinese because (I'm guessing but would be very surprised if it isn't the case) the nodeName is going to be searched by default.

    Let us know what you find out.

    cheers,
    doug.

  • Sebastian Dammark 583 posts 1407 karma points
    Oct 26, 2015 @ 09:54
    Sebastian Dammark
    0

    Well, in the ExamineIndex.config right above the ExternalIndexSet I see this comment:

    <!-- Default Indexset for external searches, this indexes all fields on all types of nodes-->
    

    Which indicates to me that all fields are indexed if nothing else is specified.

    But I actually have one page whose nodeName is in Chinese, and this page I can find. So there seems to be something fishy.

    99% of the Chinese pages have an English nodeName because of the URLs, and we specify the Chinese name in another field (navigationTitle). So if I add navigationTitle to the indexed fields my problems might go away.

  • Sebastian Dammark 583 posts 1407 karma points
    Oct 26, 2015 @ 11:06
    Sebastian Dammark
    0

    Update

    When searching with chinese characters this is the case:

    1. Nodes where nodeName is chinese and navigationTitle is english are found
    2. Nodes where nodeName is english and navigationTitle is chinese are NOT found

    My index is set up like this:

    <IndexSet SetName="ExternalIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/External/">
      <IndexAttributeFields>
        <add Name="nodeName" />
        <add Name="navigationTitle" />
      </IndexAttributeFields>
    </IndexSet>
    

    So chinese nodes are only found if nodeName is in chinese. And since 99% of the nodes are named in english, but in chinese in navigationTitle, they're not found even though navigationTitle is specified in the index settings.
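
    One thing that might be worth trying: Examine index sets have two field lists, IndexAttributeFields for built-in node attributes such as nodeName, and IndexUserFields for document type properties. A sketch of the same set with navigationTitle moved, assuming navigationTitle is a doctype property on these nodes; it is untested against the site in this thread:

    ```xml
    <!-- Sketch only: navigationTitle is assumed to be a document type property,
         so it is listed under IndexUserFields instead of IndexAttributeFields. -->
    <IndexSet SetName="ExternalIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/External/">
      <IndexAttributeFields>
        <add Name="nodeName" />
      </IndexAttributeFields>
      <IndexUserFields>
        <add Name="navigationTitle" />
      </IndexUserFields>
    </IndexSet>
    ```

    The index would need rebuilding from the Examine Management tab before the change shows up in searches.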

  • Douglas Robar 3570 posts 4711 karma points MVP ∞ admin c-trib
    Oct 26, 2015 @ 11:38
    Douglas Robar
    0

    That's the behaviour I thought you might find. And it's good news that the search is working in a predictable way, even if it isn't the way you want.

    Given that you see the properties in the index (from a screenshot earlier in the thread) I don't think it is something that has to do with what is indexed. Rather, it has to do with what is searched.

    This is implementation-specific. With ezSearch, we have a macro parameter that lets you specify which doctype properties to search within. I'm not sure how your installation does that. But that's where I'd look. I suspect you aren't actually searching within the navigationTitle and other important fields.

    Some links that might be helpful if you haven't already seen them:

    https://github.com/Shazwazza/Examine
    https://github.com/Shazwazza/Examine/wiki
    https://our.umbraco.org/documentation/Reference/Searching/Examine/
    https://our.umbraco.org/documentation/Reference/Searching/Examine/overview-explanation

    That last link is probably the most important one.

    Let us know what you find out!

    cheers,
    doug.

  • Alex Lindgren 159 posts 356 karma points
    Apr 08, 2016 @ 18:04
    Alex Lindgren
    0

    Sebastian,

    Were you able to come up with a solution?

    Alex
