But, when I try to search by letter category, 'A' returns ALL the authors, but 'B', 'C', 'D', etc all return the correct results. Is this a bug in my code?
public IEnumerable<ISearchResult> GetAuthorsFromLetterCategory(char letter, out long totalItemCount)
{
    totalItemCount = 0;
    if (_examineManager.TryGetIndex(AuthorsIndex.INDEX_NAME, out var index))
    {
        var results = index
            .Searcher
            .CreateQuery()
            .Field("__IndexType", "author")
            .And().Field("letterCategory", letter.ToString().Boost(10f))
            .Execute();
        totalItemCount = results.TotalItemCount;
        if (results.Any())
        {
            return results;
        }
    }
    return [];
}
I did find some suggestions online, but none has worked so far. Has anyone encountered this situation before?
Tried
options.Analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48, CharArraySet.EMPTY_SET);
options.Analyzer = new KeywordAnalyzer(); // made no difference either
I also tried creating a custom analyzer:
using System.IO; // TextReader
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Util;

namespace QuoteTab.Umbraco.Core.Analyzers
{
    public class NoStopWordsAnalyzer : Analyzer
    {
        protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
        {
            // Use a standard tokenizer
            Tokenizer tokenizer = new StandardTokenizer(LuceneVersion.LUCENE_48, reader);
            // Directly pass the tokenizer without adding a stop word filter
            // (note: no lowercase filter is applied either, so tokens keep their original case)
            TokenStream tokenStream = tokenizer;
            return new TokenStreamComponents(tokenizer, tokenStream);
        }
    }
}
I did register it with the custom index options:
if (!string.IsNullOrEmpty(name) && name.Equals(AuthorsIndex.INDEX_NAME))
{
    options.Analyzer = new NoStopWordsAnalyzer();
    options.FieldDefinitions = new FieldDefinitionCollection(
        new FieldDefinition("slug", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("name", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("letterCategory", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("birthPlace", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("city", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("state", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("professionId", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("nationalityId", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("nationalityProfessionId", FieldDefinitionTypes.FullTextSortable)
    );
    options.UnlockIndex = true;
    if (_settings.Value.LuceneDirectoryFactory == LuceneDirectoryFactory.SyncedTempFileSystemDirectoryFactory)
    {
        // if this directory factory is enabled then a snapshot deletion policy is required
        options.IndexDeletionPolicy = new SnapshotDeletionPolicy(new KeepOnlyLastCommitDeletionPolicy());
    }
}
var searcher = (BaseLuceneSearcher)index.Searcher;
var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48, CharArraySet.EMPTY_SET);
var query = searcher.CreateQuery(null, BooleanOperation.And, analyzer, new Examine.Lucene.Search.LuceneSearchOptions() { });
I tried with new StandardAnalyzer(LuceneVersion.LUCENE_48, CharArraySet.EMPTY_SET); and also tried like this:
public IEnumerable<ISearchResult> GetAuthorsFromLetterCategory(char letter, out long totalItemCount)
{
    totalItemCount = 0;
    if (_examineManager.TryGetIndex(AuthorsIndex.INDEX_NAME, out var index))
    {
        var searchTerm = letter.ToString();
        var analyzer = new NoStopWordsAnalyzer();
        var searcher = (BaseLuceneSearcher)index.Searcher;
        var query = searcher
            .CreateQuery(null, BooleanOperation.And, analyzer, new Examine.Lucene.Search.LuceneSearchOptions() { })
            .Field("__IndexType", "author")
            .And().Field("letterCategory", searchTerm)
            .Execute();
        totalItemCount = query.TotalItemCount;
        if (query.Any())
        {
            return query;
        }
    }
    return [];
}
The results are returning 0 now, instead of all of them, so at least something changed!
Is it possible that I'm setting up my index options incorrectly?
if (!string.IsNullOrEmpty(name) && name.Equals(AuthorsIndex.INDEX_NAME))
{
    options.Analyzer = new NoStopWordsAnalyzer();
    options.FieldDefinitions = new FieldDefinitionCollection(
        new FieldDefinition("slug", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("name", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("letterCategory", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("birthPlace", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("city", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("state", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("professionId", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("nationalityId", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("nationalityProfessionId", FieldDefinitionTypes.FullTextSortable)
    );
    options.UnlockIndex = true;
    if (_settings.Value.LuceneDirectoryFactory == LuceneDirectoryFactory.SyncedTempFileSystemDirectoryFactory)
    {
        // if this directory factory is enabled then a snapshot deletion policy is required
        options.IndexDeletionPolicy = new SnapshotDeletionPolicy(new KeepOnlyLastCommitDeletionPolicy());
    }
}
I have made several attempts, using different approaches, and it's always the same, it works for other letters (B,C,D...) but not for A.
for now I'm stuck :(
Even tried this (just for reference, not actual solution)
List<LetterCategorySearchResult> results = new();
using (var directory = FSDirectory.Open(new DirectoryInfo("umbraco\\Data\\TEMP\\ExamineIndexes\\AuthorsIndex")))
using (var analyzer = new NoStopWordsAnalyzer()) // Use your custom analyzer here
{
    // Create a query parser
    var queryParser = new QueryParser(LuceneVersion.LUCENE_48, "letterCategory", analyzer);
    // Parse the query
    Query query = queryParser.Parse(letter.ToString());
    // Set up the searcher
    using (var reader = DirectoryReader.Open(directory))
    {
        var searcher = new IndexSearcher(reader);
        // Execute the search
        TopDocs topDocs = searcher.Search(query, 9999999);
        foreach (ScoreDoc scoreDoc in topDocs.ScoreDocs)
        {
            Document doc = searcher.Doc(scoreDoc.Doc);
            LetterCategorySearchResult searchResult = new(doc, scoreDoc.Score);
            results.Add(searchResult);
        }
    }
    return results;
}
It also doesn't seem to matter which type of analyzer I configure for the index; Umbraco always displays it as StandardAnalyzer. Does that mean the backoffice has a bug and doesn't show the correct analyzer, OR that Umbraco has a bug and doesn't pick up the custom analyzer?
And also like this (with no success)
if (!string.IsNullOrEmpty(name) && name.Equals(AuthorsIndex.INDEX_NAME))
{
    options.Analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48, new CharArraySet(LuceneVersion.LUCENE_48, 0, true));
    options.FieldDefinitions = new FieldDefinitionCollection(
        new FieldDefinition("name", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("birthPlace", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("city", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("professionId", FieldDefinitionTypes.FullTextSortable),
        new FieldDefinition("nationalityId", FieldDefinitionTypes.FullTextSortable)
    );
    options.UnlockIndex = true;
    if (_settings.Value.LuceneDirectoryFactory == LuceneDirectoryFactory.SyncedTempFileSystemDirectoryFactory)
    {
        // if this directory factory is enabled then a snapshot deletion policy is required
        options.IndexDeletionPolicy = new SnapshotDeletionPolicy(new KeepOnlyLastCommitDeletionPolicy());
    }
}
And the search
public IEnumerable<ISearchResult> GetAuthorsFromLetterCategory(char letter, out long totalItemCount)
{
    totalItemCount = 0;
    //https://our.umbraco.com/forum/using-umbraco-and-getting-started/114576-examine-index-search-returns-all-results-for-lettercategory-a
    if (_examineManager.TryGetIndex(AuthorsIndex.INDEX_NAME, out var index))
    {
        var searchTerm = letter.ToString();
        var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48, new CharArraySet(LuceneVersion.LUCENE_48, 0, true));
        var searcher = index.Searcher as LuceneSearcher;
        var query = searcher!
            .CreateQuery(null, BooleanOperation.Or, analyzer, new Examine.Lucene.Search.LuceneSearchOptions())
            .Field("__IndexType", "author")
            .And().Field("letterCategory", searchTerm)
            .Execute();
        totalItemCount = query.TotalItemCount;
        if (query.Any())
        {
            return query;
        }
    }
    return [];
}
Examine Index Search returns all results for letterCategory == 'A'
Hi there,
I have created an author Index with the following fields
But, when I try to search by letter category, 'A' returns ALL the authors, but 'B', 'C', 'D', etc all return the correct results. Is this a bug in my code?
HELP!! :-(
I believe that "A" is a stop word and is removed from the query. You can change the analyzer or change the stop-word dictionary.
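One way to check the stop-word theory outside of Examine is to list the tokens each analyzer emits for the search term. The following is only a minimal sketch, assuming the Lucene.NET 4.8 packages are referenced; `AnalyzerProbe` and `Tokens` are hypothetical names made up for this illustration and are not part of Examine or Umbraco.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Hypothetical helper for illustration only; not an Examine or Umbraco API.
public static class AnalyzerProbe
{
    // Runs the text through the analyzer and collects every emitted token.
    public static List<string> Tokens(Analyzer analyzer, string text)
    {
        var tokens = new List<string>();
        using (TokenStream stream = analyzer.GetTokenStream("letterCategory", new StringReader(text)))
        {
            var term = stream.AddAttribute<ICharTermAttribute>();
            stream.Reset();
            while (stream.IncrementToken())
            {
                tokens.Add(term.ToString());
            }
            stream.End();
        }
        return tokens;
    }

    public static void Main()
    {
        // Default StandardAnalyzer: lowercases, then removes English stop words ("a" is one of them).
        using var withStopWords = new StandardAnalyzer(LuceneVersion.LUCENE_48);
        // Same analyzer with an empty stop-word set: "a" should survive, lowercased.
        using var withoutStopWords = new StandardAnalyzer(LuceneVersion.LUCENE_48, CharArraySet.EMPTY_SET);
        // KeywordAnalyzer: the whole input becomes one token, case preserved.
        using var keyword = new KeywordAnalyzer();

        Console.WriteLine($"default:  [{string.Join(",", Tokens(withStopWords, "A"))}]");
        Console.WriteLine($"no stops: [{string.Join(",", Tokens(withoutStopWords, "A"))}]");
        Console.WriteLine($"keyword:  [{string.Join(",", Tokens(keyword, "A"))}]");
        Console.WriteLine($"default B: [{string.Join(",", Tokens(withStopWords, "B"))}]");
    }
}
```

If the default analyzer yields no tokens for "A", the letterCategory clause presumably ends up empty and only the __IndexType clause remains, which would be consistent with every author coming back. Keep in mind that the analyzer also decides which terms are written at index time, so an analyzer change normally only takes effect after the index is rebuilt.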
Thank you Yakov Lebski,
I did find some suggestions online, but none has worked so far. Has anyone encountered this situation before?
Tried
I also tried creating a custom analyzer:
I did register it with the custom index options:
and configured the custom options in a composer
Nothing is working so far, and the BackOffice is showing StandardAnalyzer instead of the NoStopWordsAnalyzer I just created.
Any documentation I could look into? I have no idea what is wrong :(
can you try
I tried with
new StandardAnalyzer(LuceneVersion.LUCENE_48, CharArraySet.EMPTY_SET);
and also tried like this:
The results are returning 0 now, instead of all of them, so at least something changed!
Is it possible that I'm setting up my index options incorrectly?
I have made several attempts, using different approaches, and it's always the same, it works for other letters (B,C,D...) but not for A.
for now I'm stuck :(
Even tried this (just for reference, not actual solution)
It also doesn't seem to matter which type of analyzer I configure for the index; Umbraco always displays it as StandardAnalyzer. Does that mean the backoffice has a bug and doesn't show the correct analyzer, OR that Umbraco has a bug and doesn't pick up the custom analyzer?
And also like this (with no success)
And the search
Ended up using this FUGLY hack.... just in case anyone needs it, or finds a solution