UmbracoExamine for Office Documents

Tom 161 posts 322 karma points

Oct 09, 2018 @ 17:29

0

I am on Umbraco 7.5.9. I did not find a NuGet package for MS Office documents (Word & Excel). The only one I found was for PDF.

Does anyone have a C# example of how to index and search Office documents?

Thanks

Tom

Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib

Oct 11, 2018 @ 07:22

2

Tom,

You can use https://our.umbraco.com/packages/backoffice-extensions/examinefileindexer/ which uses Apache Tika under the hood and will index Office documents.

With regard to search, are you looking to do a combined search across content and media, or just media? When you install the package it creates an indexer and a searcher for you. You can use that searcher, but it only covers media.

If you want a combined search, you will have to look up how to do a multi-index search.

Regards

Ismail
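
For the combined search mentioned above, one option in Examine on Umbraco 7 is a `MultiIndexSearcher` registered in ExamineSettings.config. The fragment below is a sketch only: the searcher name and the `MediaIndexSet` name are assumptions, so check ExamineIndex.config for the index set names that actually exist on your site.

```xml
<!-- ExamineSettings.config, inside <ExamineSearchProviders><providers> -->
<!-- Assumed names: "MultiIndexSearcher" and "MediaIndexSet" are illustrative -->
<add name="MultiIndexSearcher"
     type="Examine.LuceneEngine.Providers.MultiIndexSearcher, Examine"
     analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"
     indexSets="ExternalIndexSet,MediaIndexSet" />
```

The searcher can then be fetched by name with `ExamineManager.Instance.SearchProviderCollection["MultiIndexSearcher"]` and queried like any other searcher.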

Tom 161 posts 322 karma points

Oct 12, 2018 @ 13:24

0

OK that worked. YOU rock.

One more question.

Do you know if there is a way to extract PDF metadata into the search index? I would like to add the PDF title, author, description, and creation date to the indexer.

Thanks

Tom

Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib

Oct 12, 2018 @ 22:20

1

Tom,

It should extract it by default. Check the index with Luke and you should see all the associated metadata.

Regards

Ismail

Tom 161 posts 322 karma points

Oct 25, 2018 @ 16:42

0

Thank you. It does. I appreciate your comments.

I have another question. I uploaded a PDF to my anonymous site and a different PDF to my secure site's media folder. On my secure site, where users log in using Active Directory and OWIN, I am not getting any search results for my new PDF document, but I am getting search results for the PDF on the anonymous site.

Would you have any suggestions on why this is so? Note: I am using the following NuGet packages: Cogworks.ExamineFileIndexer and UmbracoCms.UmbracoExamine.PDF.

Thanks

Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib

Oct 25, 2018 @ 16:45

0

Tom,

How are you securing the PDFs in the media section? Are you using Media Protect? Also, try searching for the item in the index itself using Luke. I am also not sure why you are using both packages, as mine also handles PDFs.

Regards

Ismail

Tom 161 posts 322 karma points

Oct 26, 2018 @ 11:51

0

Ismail:

Thanks. I figured it out. In the ExamineIndex.config file I have IndexSets defined. There is an

So thanks for responding.

Tom 161 posts 322 karma points

Oct 26, 2018 @ 11:52

0

Ismail:

Did you know that Umbraco Examine does not index any content in Macros?

If you know of a way to do this, please let me know.

Thanks

Tom

Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib

Oct 26, 2018 @ 12:09

2

Yup, I did. I screen scrape in the gathering node data event to get the rendered content.

Tom 161 posts 322 karma points

Oct 26, 2018 @ 12:21

0

"Screen Scrap". Can you share the code?
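
A minimal sketch of the screen-scraping approach described above, not Ismail's actual code. It assumes Umbraco 7 with the default `ExternalIndexer`, that the server can request its own pages at the hypothetical `SiteBaseUrl`, and that pages resolve by node id (for example `/1234`). The class name and the `renderedContent` field name are illustrative.

```csharp
using System.Net;
using System.Text.RegularExpressions;
using Examine;
using Umbraco.Core;
using UmbracoExamine;

// Hypothetical handler: class name, field name and base URL are assumptions.
public class RenderedContentIndexing : ApplicationEventHandler
{
    // Assumption: the site is reachable at this address from the server itself.
    private const string SiteBaseUrl = "http://localhost/";

    protected override void ApplicationStarted(
        UmbracoApplicationBase umbracoApplication,
        ApplicationContext applicationContext)
    {
        // "ExternalIndexer" is the default content indexer name in Umbraco 7.
        ExamineManager.Instance.IndexProviderCollection["ExternalIndexer"]
            .GatheringNodeData += OnGatheringNodeData;
    }

    private static void OnGatheringNodeData(object sender, IndexingNodeDataEventArgs e)
    {
        if (e.IndexType != IndexTypes.Content)
        {
            return; // only content nodes render to a page
        }

        try
        {
            using (var client = new WebClient())
            {
                // Assumption: the page can be requested by node id.
                var html = client.DownloadString(SiteBaseUrl + e.NodeId);
                // Store the rendered text, macro output included, in an extra field.
                e.Fields["renderedContent"] = StripHtml(html);
            }
        }
        catch (WebException)
        {
            // Page not reachable (unpublished, protected): skip the extra field.
        }
    }

    // Crude tag removal; an HTML parser would be more robust.
    private static string StripHtml(string html)
    {
        var withoutScripts = Regex.Replace(
            html, @"<(script|style)[^>]*>.*?</\1>", " ",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);
        var withoutTags = Regex.Replace(withoutScripts, @"<[^>]+>", " ");
        return Regex.Replace(WebUtility.HtmlDecode(withoutTags), @"\s+", " ").Trim();
    }
}
```

The new field only becomes searchable once it is included in the query, and the request is unauthenticated, so pages behind a login would need a different way of fetching the rendered output.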
