Not sure why that's happening. Here are some notes:
Your code isn't visible. You have to click the curly braces while highlighting your code in the forum editor to ensure it renders properly as a code sample.
You might want to try rebuilding the Examine index, or even deleting all of them in App_Data/TEMP/ExamineIndexes and then rebuilding again (just to be sure the data you are seeing isn't from an old indexing operation).
In case a third party integrates your open source application into a closed source application, he/she will have to procure a commercial license of iText.
Rebuild the index in the developer section (the Examine Management dashboard).
Ensure you have configured both the ExamineIndex.config and the ExamineSettings.config files.
Ensure you are looking at the correct index name (based on the documentation, that'd be "MediaIndexSet"). If you have search functionality built, it will have to be against this index.
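If the index still returns images or content pages after a rebuild, one stopgap is to filter the search results by file extension before displaying them. The sketch below is language-neutral logic written in Python, not Examine's API: the `keep_only_pdfs` helper, the dictionary shape of each result, and the sample data are all hypothetical, and it assumes each media result exposes its file path in an `umbracoFile` field.

```python
from pathlib import PurePosixPath


def keep_only_pdfs(results, path_field="umbracoFile"):
    """Return only the results whose file path ends in .pdf (case-insensitive).

    `results` is assumed to be a list of dicts; `path_field` is the
    (assumed) field holding the media file path.
    """
    pdfs = []
    for item in results:
        # Results without a file path (e.g. content pages) get an empty suffix.
        path = item.get(path_field) or ""
        if PurePosixPath(path).suffix.lower() == ".pdf":
            pdfs.append(item)
    return pdfs


# Hypothetical search results: two PDFs, one image, one content page.
sample = [
    {"name": "Brochure", "umbracoFile": "/media/1001/brochure.pdf"},
    {"name": "Logo", "umbracoFile": "/media/1002/logo.jpg"},
    {"name": "Report", "umbracoFile": "/media/1003/REPORT.PDF"},
    {"name": "Home page"},
]

filtered = keep_only_pdfs(sample)
```

Filtering after the search only hides the unwanted hits; fixing the indexer configuration so that only PDFs are indexed remains the cleaner solution.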
PDF Search also bringing back jpgs
Hi,
I set up a search to search within PDFs, which is working:
However, it is also bringing back images and pages. How do I restrict it to PDFs? The extensions setting above is set to PDFs.
Thanks a lot!
Thanks, I will try.
What do you mean less costly in commercial usages?
The license section of the readme says this: https://github.com/umbraco/UmbracoExamine.PDF
An iText license costs several thousand dollars. See here: http://itextpdf.com/Pricing/unit-based
In contrast, ExamineFileIndexer uses Apache Tika, which appears to be free, as it's using the Apache license: https://tika.apache.org/license.html
And the Apache license seems to allow for commercial usage: https://tldrlegal.com/license/apache-license-2.0-(apache-2.0)
Hi,
I installed ExamineFileIndexer. However, it only seems to be indexing the PDF file names, not the actual contents of the PDFs.
I used the default configuration, described here: https://our.umbraco.org/projects/backoffice-extensions/examinefileindexer/
I also deleted the temp folder.
Do you know how to get it to index the contents of pdfs?
Thanks,
Damon,
Can you log an issue on GitHub? Also, can you take a look at the Umbraco log file to see whether any errors were logged? And how did you install it: as a package or via NuGet?
Regards
Ismail
I've never used it before, so I couldn't really tell you what's wrong. I'd start by checking for errors in the Umbraco error log.
Also, you may want to submit a bug report here: https://github.com/thecogworks/examinefileindexer/issues
Some things to try first: