I am on Umbraco 7.5.9.
I did not find a NuGet package for MS Office documents (Word & Excel). The only one I found was for PDF.
Does anyone have a C# example of how to index and search Office documents?
You can use https://our.umbraco.com/packages/backoffice-extensions/examinefileindexer/. It uses Apache Tika under the hood and will index Office documents.
With regard to search, are you looking to do a combined search across content and media, or just media? When you install the package, it creates an indexer and a searcher for you. You can use that searcher, but it will only search media.
If you want a combined search, you will have to look up how to do a multi-index search.
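For the combined search, one option in Examine is a multi-index searcher registered in ExamineSettings.config. Here is a minimal sketch, assuming the content index set is Umbraco's default `ExternalIndexSet`. The names `CombinedSearcher` and `MediaIndexSet` are placeholders, so check ExamineIndex.config for the index set the package actually created:

```xml
<!-- ExamineSettings.config, inside <ExamineSearchProviders><providers> -->
<!-- "CombinedSearcher" and "MediaIndexSet" are placeholder names, not values from the package -->
<add name="CombinedSearcher"
     type="Examine.LuceneEngine.Providers.MultiIndexSearcher, Examine"
     analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"
     indexSets="ExternalIndexSet,MediaIndexSet" />
```

The searcher should then be available by that name from `ExamineManager.Instance.SearchProviderCollection` in C#.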
OK that worked. YOU rock.
One more question.
Do you know if there is a way to extract PDF metadata into the search index?
I would like to add the PDF title, author, description, and creation date to the index.
It should extract it by default. Check the index with Luke; you should see all the associated metadata.
Thank you. It does. I appreciate your comments.
Got another question.
I uploaded a PDF to my anonymous site and a different PDF to my secure site's media folder.
On my secure site, where users log in using Active Directory and OWIN, I am not getting any search results for my new PDF document, but I am getting search results for the PDF on the anonymous site.
Would you have any suggestions on why this is so?
Note: I am using the following nuget packages.
How are you securing the PDFs in the media section? Are you using Media Protect? Also, try searching for the item in the index itself using Luke. I am also not sure why you are using both packages, as mine also indexes PDFs.
Thanks. I figured it out. In the ExamineIndex.config file I have IndexSets defined. There is an
So thanks for responding.
Did you know that Umbraco Examine does not index any content in Macros?
If you know of a way to do this, please let me know.
Yup, I did. I screen scrape in the GatheringNodeData event to get the rendered content.
"Screen scrape". Can you share the code?