When I go into the back office and use the Lucene search on the external index and search for doubleextralarge, I get results. However, when I search for the name of that same attribute, XXL, which is a more typical search term, I get nothing back. Is there a better way than trying to extract the name instead of searching the alias of the variant info?
Have you done anything to the default index? or are you just querying the default data that gets put into the index?
If the latter, you might want to intercept the Lucene index being written and maybe break up some of the things you want to search on (such as the attributes) into a more search-friendly form.
I have not done anything to the default Lucene index; I am just querying the data that gets put into the index at face value.
I think I am looking for the latter here, but I am fairly new to Lucene and Examine. I've followed Paul Seal's videos on searching thus far and taken my time to make sure it all makes sense, which it does. It's always a good thing lol.
I was looking at the Vendr Demo Store and found your ConfigureExamine(); method, but it looks like you're simply ensuring that a product page is included in the index?
Do you have a sort of starting point or recommended method to get this done?
Sure, I guess the first questions though really are
What version of Vendr / Umbraco are we talking?
How are your products set up? Are we using the variants prop editor? or using child nodes as variants?
What are you actually needing to achieve here? Are you searching on the variants attributes? Or do you have some kind of category defined on them that you are wanting to search on.
Once I know a bit more, I can point you in the right direction.
Products are set up under a products node and each product uses the variants prop editor.
What I am actually trying to achieve is something like the following:
When a user searches for XS Blue Glove, I return all results that have those attributes, XS being a size attribute name, Blue being a different attribute name, and Glove coming from either the product name, category, or somewhere else.
I would like the attributes to hold the most importance alongside some other fields, which I am already getting decent results with.
I have put fuzzy searching on the terms I search and boosted the fields I search, so I can play with those a bit to get closer.
I do search the variants field, but it looks like that field only holds the aliases, not the names of the attributes.
In my instance, products hold the most weight and I would like to search the variants with higher priority than any other field on a product.
Sure, so the thing with the variants editor is that this is just a property editor and we don't do anything special with its value right now for indexing, so all you will get in your Examine index is all your variants dumped into a single field. In addition, we only use the product attribute aliases for storing the selected product attributes for a variant, which is why you are only seeing these.
So it sounds like you want to search the individual variant combinations, which I don't think you can achieve with the current index, as all the variants are stored in that one field. What you might want to look at is creating your own Lucene index for variants: process the variants value, split it up into its individual variant entries, and store each of those in the Lucene index. At the same time, you can look up the product attribute values and store those in the index too.
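As a rough illustration of the "split the variants up into individual entries" idea, here is a minimal sketch. Every type and name in it (Variant, VariantEntry, VariantFlattener, and the two lookup dictionaries) is a hypothetical stand-in invented for this example, not Vendr's actual model. It assumes each variant stores attribute alias / value alias pairs and that the display names have already been fetched into dictionaries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one variant: attribute alias -> selected value alias,
// e.g. "size" -> "doubleextralarge". This is NOT Vendr's real variant model.
public class Variant
{
    public Dictionary<string, string> Attributes { get; set; } = new Dictionary<string, string>();
}

// Hypothetical flat record: one of these would become one entry in the custom index.
public class VariantEntry
{
    public string ProductName { get; set; }
    public string AttributeAlias { get; set; }
    public string AttributeName { get; set; }
    public string ValueAlias { get; set; }
    public string ValueName { get; set; }
}

public static class VariantFlattener
{
    // Emits one entry per attribute of each variant, resolving aliases to names.
    public static IEnumerable<VariantEntry> Flatten(
        string productName,
        IEnumerable<Variant> variants,
        IDictionary<string, string> attributeNames, // attribute alias -> display name
        IDictionary<string, string> valueNames)     // value alias -> display name
    {
        foreach (var variant in variants)
        {
            foreach (var pair in variant.Attributes)
            {
                yield return new VariantEntry
                {
                    ProductName = productName,
                    AttributeAlias = pair.Key,
                    AttributeName = Lookup(attributeNames, pair.Key),
                    ValueAlias = pair.Value,
                    ValueName = Lookup(valueNames, pair.Value)
                };
            }
        }
    }

    // Falls back to the alias itself when no display name is known.
    private static string Lookup(IDictionary<string, string> names, string alias) =>
        names.TryGetValue(alias, out var name) ? name : alias;
}
```

With records shaped like this, a search for XXL can match the stored value name even though the variants editor itself only stores doubleextralarge.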
I have a custom Lucene index set up and running; however, I have a few questions on actually getting the right data into the fields.
I am a bit confused on how to structure the data that gets put into the index.
For example, take this code, the IndexValueSetBuilder:
public class ProductAttributeIndexValueSetBuilder : IValueSetBuilder<Vendr.Core.Models.Attribute>
{
    public IEnumerable<ValueSet> GetValueSets(params Vendr.Core.Models.Attribute[] attrs)
    {
        foreach (var attr in attrs)
        {
            // Fields written to the index for this attribute
            var indexValues = new Dictionary<string, object>
            {
                ["alias"] = attr.Alias,
                ["name"] = attr.Name
            };
            var valueSet = new ValueSet(attr.ToString(), "attribute", indexValues);
            yield return valueSet;
        }
    }
}
Basically, my understanding is that this method is how the data gets put into the index, and it is then searchable by the values you put in here.
So if I want the product name, all of the attribute names, and the attribute value names of a given product, how would you suggest I get to that? Create a list of custom objects, then iterate over the properties and add them to the dictionary?
Again, my end goal is to be able to search for something like xs diamond glove or XXXL Blue Diamond Glove etc.
Is my thinking fairly close on this? Took me a minute to write this AND understand what was going on, but I think I've done OK.
I have a GetAll() method in my service that is responsible for pulling info from the umbracoContextFactory and the VendrAPI
Any light shed on this would be awesome. Thanks!
P.S. Nice Podcast, enjoying it so far, very interesting
Yea, you'll want to do something like use the ProductAttributeService to fetch the attributes and get their actual names. You can basically put as much info into the index as you need and hopefully this is only happening when the index is being built / products are being published so you shouldn't be performing any expensive tasks on a regular basis.
You could also maybe just fetch the published product node and use the Variants value converter to get you the in use attributes.
var attrs = productPage.Variants.GetInUseProductAttributes();
This should then give you access to both the alias and names of the attributes that are in use for the given variants in the product so you shouldn't have to do the filtering yourself.
Matt, I'm populating a custom index called VendrVariantsIndex...perhaps that's my issue,
I don't have a config for the index itself, I only have the following:
IndexCreator, IndexValueSetBuilder, IndexModel, IndexPopulator, IndexComponent, IndexComposer, and a service that gets all the values I need
Everything is prefixed with my custom index name. So I generate the index fine and used Luke to look at the index itself, and it does in fact hold what I expect. I just can't search it.
Perhaps that is where the config comes in, as I don't have one...sigh...sometimes I feel like I'm just learning again. Could I model this off of the external index?
I did see an example where JonDJones had a config but I was trying to make sense of all the pieces I needed to make this work.
Now that I'm not on mobile I can elaborate a bit more. Here is my IndexCreator class that is responsible for creating the index.
public class VendrVariantIndexCreator : LuceneIndexCreator
{
    private readonly IProfilingLogger _logger;

    public VendrVariantIndexCreator(IProfilingLogger logger)
    {
        _logger = logger;
    }

    public override IEnumerable<IIndex> Create()
    {
        // Index name, storage directory, field definitions, analyzer, logger
        var index = new VendrVariantsIndex("VendrVariantIndex",
            CreateFileSystemLuceneDirectory("VendrVariantIndex"),
            new FieldDefinitionCollection(
                new FieldDefinition("productName", FieldDefinitionTypes.FullText),
                new FieldDefinition("variantName", FieldDefinitionTypes.FullText),
                new FieldDefinition("variantAlias", FieldDefinitionTypes.FullText),
                new FieldDefinition("variantValueName", FieldDefinitionTypes.FullText),
                new FieldDefinition("variantValueAlias", FieldDefinitionTypes.FullText)
            ),
            new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30),
            _logger);
        return new[] { index };
    }
}
Here is my ValueSetBuilder:
public class ProductAttributeIndexValueSetBuilder : IValueSetBuilder<VendrVariantIndexModel>
{
    public IEnumerable<ValueSet> GetValueSets(params VendrVariantIndexModel[] attrs)
    {
        foreach (var attr in attrs)
        {
            // Keys must match the field names declared in the index creator
            // (the original post spelled the second key "vairantName")
            var indexValues = new Dictionary<string, object>
            {
                ["variantAlias"] = attr.VariantAlias,
                ["variantName"] = attr.VariantName,
                ["variantValueAlias"] = attr.VariantValueAlias,
                ["variantValueName"] = attr.VariantValueName,
                ["productName"] = attr.ProductName
            };
            var valueSet = new ValueSet(attr.ToString(), "node", indexValues);
            yield return valueSet;
        }
    }
}
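One thing that may be worth checking in a builder like this is the first argument passed to ValueSet, which is the item's id. If the model class does not override ToString(), then attr.ToString() returns the type name for every item, so every value set would share one id, and as far as I know Examine replaces an existing item when another with the same id is indexed. The sketch below shows the collision and one possible way to build a distinct id. VariantModel and ValueSetIds are hypothetical names made up for this example, and it assumes the three properties used together identify an entry.

```csharp
using System;

// Hypothetical stand-in for the index model in the thread. It deliberately
// does NOT override ToString(), which is the assumption being illustrated.
public class VariantModel
{
    public string ProductName { get; set; }
    public string VariantAlias { get; set; }
    public string VariantValueAlias { get; set; }
}

public static class ValueSetIds
{
    // Builds an id from the parts assumed to identify one entry.
    // In a real index a product node id would be a safer first part
    // than the product name, which may not be unique.
    public static string For(VariantModel model) =>
        string.Join("_", new[] { model.ProductName, model.VariantAlias, model.VariantValueAlias })
              .ToLowerInvariant()
              .Replace(' ', '-');
}
```

Two different VariantModel instances produce the same ToString() result (the type name), whereas ValueSetIds.For gives each one its own id, for example diamond-glove_size_xs.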
Here is my populator:
public class VendrVariantIndexPopulator : IndexPopulator
{
    private readonly ProductAttributeIndexValueSetBuilder _productAttributeIndexValueSetBuilder;
    private readonly IVendrVariantsService _vendrVariantsService;

    public VendrVariantIndexPopulator(ProductAttributeIndexValueSetBuilder productAttributeIndexValueSetBuilder, IVendrVariantsService vendrVariantsService)
    {
        _productAttributeIndexValueSetBuilder = productAttributeIndexValueSetBuilder;
        _vendrVariantsService = vendrVariantsService;
        // Tell Examine which index this populator is responsible for
        RegisterIndex("VendrVariantIndex");
    }

    protected override void PopulateIndexes(IReadOnlyList<IIndex> indexes)
    {
        // Fetch every variant entry once, then push it into each registered index
        var variants = _vendrVariantsService.GetAll().ToArray();
        foreach (var index in indexes)
        {
            index.IndexItems(_productAttributeIndexValueSetBuilder.GetValueSets(variants));
        }
    }
}
The component:
public class VendrVariantsIndexComponent : IComponent
{
    private readonly IExamineManager _examineManager;
    private readonly VendrVariantIndexCreator _vendrVariantIndexCreator;

    public VendrVariantsIndexComponent(IExamineManager examineManager, VendrVariantIndexCreator vendrVariantIndexCreator)
    {
        _examineManager = examineManager;
        _vendrVariantIndexCreator = vendrVariantIndexCreator;
    }

    public void Initialize()
    {
        // Register each created index with Examine on startup
        foreach (var index in _vendrVariantIndexCreator.Create())
            _examineManager.AddIndex(index);
    }

    public void Terminate() { }
}
Finally the composer:
[RuntimeLevel(MinLevel = RuntimeLevel.Run)]
public class VendrVariantsIndexComposer : IUserComposer
{
    public void Compose(Composition composition)
    {
        // Wire up the component and the services it depends on
        composition.Components().Append<VendrVariantsIndexComponent>();
        composition.RegisterUnique<ProductAttributeIndexValueSetBuilder>();
        composition.Register<VendrVariantIndexPopulator>(Lifetime.Singleton);
        composition.RegisterUnique<VendrVariantIndexCreator>();
        composition.RegisterUnique<IVendrVariantsService, VendrVariantsService>();
    }
}
Hmm, from what I understand this all looks correct to me. I've just reviewed this article from Skrift by Paul Marden (who I know knows what he's doing :D) https://skrift.io/issues/examine-in-umbraco-8/ (scroll down to "Indexing and searching external content") and what you have looks pretty identical.
The ONLY thing I can maybe think of is whether the search terms are being blocked by the search analyser for being too short, ie "xl". If you search for longer values, or variant names, do you get some actual results then?
Really good work on getting this far Kyle. I know how tricky learning this stuff is the first time as it's a lot of pulling different resources together but it is a super valuable skillset you are developing with this as it does come in handy quite often.
I had thought maybe the terms were too short to start with, but what kicked off the "I'm not getting any results" idea was the fact that I searched for the variant alias extralarge, which maps to the xl/XL name for said variant, and I get the following:
Let me read over Paul's article again, as I have read it in my trials and efforts while implementing this. What has me confused is that when I analyze the index with Luke I can see the values in the index for each field, but even within Luke I can't return a result using the search.
Maybe I'm mistaken, but doesn't the search in the back office take a lucene syntax search string (ie the same as what you are entering into LUKE for your search) so it needs to be
Hmm interesting, maybe it does, though when searching the external index I could search for “extralarge” and it would return nodes to me.
I’ll give it a shot though and see. If that is the case and I get results using Lucene search syntax, how does that translate to a regular keyword search using that index?
Sorry if that is confusing, I worded it the best I could for you.
Generally speaking, Examine is just a wrapper around Lucene, so when you use its API to perform a search, all it is doing is generating Lucene queries under the hood.
To be honest, I don't know much about Examine's API, as when I've done custom stuff before I've tended to find it easier to generate the exact Lucene queries and execute them on the index as a raw query.
So I'd usually split the search terms and just generate queries like
+variantValueAlias:term +variantValueName:term
etc, where you just perform the term search across the different fields you want to sample.
Maybe the Examine API can simplify this though so it might be worth looking at that first.
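The "split the search terms and generate a query" approach could look roughly like the sketch below. VariantQueryBuilder is a hypothetical helper written for this example; the field names are the ones from the index definition earlier in the thread. Note one deliberate difference from the query shown above: a query of the form +fieldA:term +fieldB:term requires the term in both fields, so this sketch instead requires each term to appear in at least one field by emitting one +( ... ) group per term.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of Examine or Lucene.
public static class VariantQueryBuilder
{
    // Field names from the custom index definition in this thread.
    private static readonly string[] Fields =
    {
        "productName", "variantName", "variantAlias", "variantValueName", "variantValueAlias"
    };

    // Characters that have special meaning in Lucene query syntax.
    private const string Special = "+-&|!(){}[]^\"~*?:\\/";

    // Prefixes every special character with a backslash.
    public static string Escape(string term) =>
        string.Concat(term.Select(c => Special.IndexOf(c) >= 0 ? "\\" + c : c.ToString()));

    // Builds one required group per term; each group ORs the term across all fields.
    public static string Build(string input)
    {
        var terms = (input ?? string.Empty)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            // Lowercased to line up with StandardAnalyzer, which lowercases at index time.
            .Select(t => Escape(t.ToLowerInvariant()));

        var groups = terms.Select(t =>
            "+(" + string.Join(" ", Fields.Select(f => f + ":" + t)) + ")");

        return string.Join(" ", groups);
    }
}
```

For the input XS Blue this yields two groups, one for xs and one for blue, each listing all five fields. The resulting string would then be handed to whatever raw-query entry point your Examine version offers; which method that is should be checked against the Examine documentation rather than taken from this sketch.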
I also wonder if I can just add these additional fields to the external index and use it that way instead of writing a completely custom index like I've done.
I would need to look up how to do that and make sure I understand it prior to implementation, but maybe, just maybe, it could resolve some of the issues I am having with the custom index...
If I didn't have complex variants this wouldn't be an issue and I could just use the external index to search, but with the way variants are stored I don't have access to the names of the variants, only the aliases.
I think the custom index is a better option here because you are creating multiple items from a single node. I also prefer not to mess too much with the default indexes.
I'm not sure if Callum uses discord but he's generally my go to Lucene person so maybe you could try and get his input somehow 🤔
OK, sounds like a custom Lucene index is needed for this use case. I've never made one before and am relatively new to the searching bits with Examine/Lucene.
Use Examine/Lucene to search variant info
Matt,
I've begun writing site search using Examine and Lucene to quickly query nodes within my ecommerce store.
I already provide my query a GroupedOr like this:
When I go into the back office and use the Lucene search on the external index and search for doubleextralarge, I get results. However, when I search for the name of that same attribute, XXL, which is a more typical search term, I get nothing back. Is there a better way than trying to extract the name instead of searching the alias of the variant info?
Thanks
Hey Kyle,
Have you done anything to the default index? or are you just querying the default data that gets put into the index?
If the latter, you might want to intercept the Lucene index being written and maybe break up some of the things you want to search on (such as the attributes) into a more search-friendly form.
Matt
Matt,
I have not done anything to the default Lucene index; I am just querying the data that gets put into the index at face value.
I think I am looking for the latter here, but I am fairly new to Lucene and Examine. I've followed Paul Seal's videos on searching thus far and taken my time to make sure it all makes sense, which it does. It's always a good thing lol.
I was looking at the Vendr Demo Store and found your ConfigureExamine(); method, but it looks like you're simply ensuring that a product page is included in the index?
Do you have a sort of starting point or recommended method to get this done?
Thanks!
Hi Kyle,
Sure, I guess the first questions though really are
Once I know a bit more, I can point you in the right direction.
Matt
Matt,
Vendr version 2.1.0 / Umbraco 8.16
Products are set up under a products node and each product uses the variants prop editor.
What I am actually trying to achieve is something like the following:
When a user searches for XS Blue Glove, I return all results that have those attributes, XS being a size attribute name, Blue being a different attribute name, and Glove coming from either the product name, category, or somewhere else.
I would like the attributes to hold the most importance alongside some other fields, which I am already getting decent results with.
I have put fuzzy searching on the terms I search and boosted the fields I search, so I can play with those a bit to get closer.
I do search the variants field, but it looks like that field only holds the aliases, not the names of the attributes.
In my instance, products hold the most weight and I would like to search the variants with higher priority than any other field on a product.
Hope that makes sense.
-Kyle
Hey Kyle,
Sure, so the thing with the variants editor is that this is just a property editor and we don't do anything special with its value right now for indexing, so all you will get in your Examine index is all your variants dumped into a single field. In addition, we only use the product attribute aliases for storing the selected product attributes for a variant, which is why you are only seeing these.
So it sounds like you want to search the individual variant combinations, which I don't think you can achieve with the current index, as all the variants are stored in that one field. What you might want to look at is creating your own Lucene index for variants: process the variants value, split it up into its individual variant entries, and store each of those in the Lucene index. At the same time, you can look up the product attribute values and store those in the index too.
Creating your own indexes is a bit of an advanced topic though, so maybe check out the docs here first: https://our.umbraco.com/Documentation/Reference/Searching/Examine/indexing/
Matt,
More specifically it seems like I need a custom IValueSetValidator or an IValueSet that holds the string values of all of my variation combinations.
Going to be a lot of learning for this, which is good.
Thanks for everything thus far
Matt,
I have a custom Lucene index set up and running; however, I have a few questions on actually getting the right data into the fields.
I am a bit confused on how to structure the data that gets put into the index.
For example, take this code, the IndexValueSetBuilder:
Basically, my understanding is that this method is how the data gets put into the index, and it is then searchable by the values you put in here.
So if I want the product name, all of the attribute names, and the attribute value names of a given product, how would you suggest I get to that? Create a list of custom objects, then iterate over the properties and add them to the dictionary?
Again, my end goal is to be able to search for something like xs diamond glove or XXXL Blue Diamond Glove etc.
Is my thinking fairly close on this? Took me a minute to write this AND understand what was going on, but I think I've done OK.
I have a GetAll() method in my service that is responsible for pulling info from the umbracoContextFactory and the VendrAPI
Any light shed on this would be awesome. Thanks!
P.S. Nice Podcast, enjoying it so far, very interesting
Hey Kyle,
Yeah, you'll want to do something like use the ProductAttributeService to fetch the attributes and get their actual names. You can basically put as much info into the index as you need, and hopefully this is only happening when the index is being built / products are being published, so you shouldn't be performing any expensive tasks on a regular basis.
You could also maybe just fetch the published product node and use the Variants value converter to get you the in-use attributes.
This should then give you access to both the alias and names of the attributes that are in use for the given variants in the product so you shouldn't have to do the filtering yourself.
Hope this helps
Matt
Matt,
I've gotten each product with each individual variant into an object that holds all three separately, for example
I then allow that to populate the index like this:
However, when searching this index in the back office I am expecting to search something like XS or L, as those are the attribute names.
I return no results when searching for anything like product name, attribute alias, or attribute name.
I know this is creeping outside of Vendr support, but your knowledge of Lucene most likely surpasses mine.
Any help is appreciated
Hey Kyle,
What index are you populating? An existing index? or a custom one? If custom, what's the config look like for defining your index?
To be honest, it's been a while since I've done any custom index work, but I'll see if I can spot anything 👍
Matt
Matt, I'm populating a custom index called VendrVariantsIndex...perhaps that's my issue,
I don't have a config for the index itself, I only have the following:
Everything is prefixed with my custom index name. So I generate the index fine and used Luke to look at the index itself, and it does in fact hold what I expect. I just can't search it.
Perhaps that is where the config comes in, as I don't have one...sigh...sometimes I feel like I'm just learning again. Could I model this off of the external index?
I did see an example where JonDJones had a config but I was trying to make sense of all the pieces I needed to make this work.
Thanks Matt.
Matt,
Now that I'm not on mobile I can elaborate a bit more. Here is my IndexCreator class that is responsible for creating the index.
Here is my ValueSetBuilder:
Here is my populator:
The component:
Finally the composer:
Hmm, from what I understand this all looks correct to me. I've just reviewed this article from Skrift by Paul Marden (who I know knows what he's doing :D) https://skrift.io/issues/examine-in-umbraco-8/ (scroll down to "Indexing and searching external content") and what you have looks pretty identical.
The ONLY thing I can maybe think of is whether the search terms are being blocked by the search analyser for being too short, ie "xl". If you search for longer values, or variant names, do you get some actual results then?
Really good work on getting this far Kyle. I know how tricky learning this stuff is the first time as it's a lot of pulling different resources together but it is a super valuable skillset you are developing with this as it does come in handy quite often.
Matt,
I had thought maybe the terms were too short to start with, but what kicked off the "I'm not getting any results" idea was the fact that I searched for the variant alias extralarge, which maps to the xl/XL name for said variant, and I get the following:
Let me read over Paul's article again, as I have read it in my trials and efforts while implementing this. What has me confused is that when I analyze the index with Luke I can see the values in the index for each field, but even within Luke I can't return a result using the search.
Here's what I am talking about using Luke:
A search:
Looking at the values within the index:
Really stumped on this one.
Any other Ideas?
Thanks for the kind words Matt. Appreciate it!
Maybe I'm mistaken, but doesn't the search in the back office take a lucene syntax search string (ie the same as what you are entering into LUKE for your search) so it needs to be
Hmm interesting, maybe it does, though when searching the external index I could search for “extralarge” and it would return nodes to me.
I’ll give it a shot though and see. If that is the case and I get results using Lucene search syntax, how does that translate to a regular keyword search using that index?
Sorry if that is confusing, I worded it the best I could for you.
Generally speaking, Examine is just a wrapper around Lucene, so when you use its API to perform a search, all it is doing is generating Lucene queries under the hood.
To be honest, I don't know much about Examine's API, as when I've done custom stuff before I've tended to find it easier to generate the exact Lucene queries and execute them on the index as a raw query.
So I'd usually split the search terms and just generate queries like
etc, where you just perform the term search across the different fields you want to sample.
Maybe the Examine API can simplify this though so it might be worth looking at that first.
Matt,
No luck searching using a Lucene search format.
I've also asked a similar question in the Umbraco Discord to see if anyone there can shed light on this.
But I do wonder whether I am still overlooking something in what I wrote.
Ah, I see,
I also wonder if I can just add these additional fields to the external index and use it that way instead of writing a completely custom index like I've done.
I would need to look up how to do that and make sure I understand it prior to implementation, but maybe, just maybe, it could resolve some of the issues I am having with the custom index...
If I didn't have complex variants this wouldn't be an issue and I could just use the external index to search, but with the way variants are stored I don't have access to the names of the variants, only the aliases.
I think the custom index is a better option here because you are creating multiple items from a single node. I also prefer not to mess too much with the default indexes.
I'm not sure if Callum uses discord but he's generally my go to Lucene person so maybe you could try and get his input somehow 🤔
Let me see if Callum is in the discord, not sure at the moment...
I agree, muddying the default index seems a bit counter intuitive honestly.
Let me see if I can get some others' input on this. I'll be in touch either way.
Matt,
OK, sounds like a custom Lucene index is needed for this use case. I've never made one before and am relatively new to the searching bits with Examine/Lucene.
Let me read the documentation etc.