Search for links in content with Examine
I need to search all content (2000+ nodes?) for links to a specific node. The content field is a Rich Text Editor (TinyMCE).
I have found that the standard internal index strips out the HTML, leaving only the text content, so the "localLink" value is not there.
Looking at the index content, I see there are fields with a prefix of _RAW that do actually contain the real content (of the RTE). Do I need to include those in the index? Is there another, better way?
Alternatively, would I be better off using the API and doing something like
var nodes = HomePage.Descendants().Where(...)
considering that there are so many nodes to search through?
Btw, this is for an admin report, not content on the site itself.
Gordon,
The _RAW fields are stored but not tokenised, so they are not searchable. However, you could use the GatheringNodeData event to inject the encoded field content into a new field. The URL will not be present, but the localLink code will. I did something like this ages ago for an old version of Umbraco, see https://our.umbraco.org/projects/backoffice-extensions/cogworks-cogitemusage/. It found item usage in content, including links.
I will try to dig out the source code, but in the meantime you can download and extract the package and reflect into the DLLs (DO NOT INSTALL THE PACKAGE, IT'S VERY OLD, I THINK V4 COMPATIBLE ONLY).
Regards
Ismail
It appears as though Examine is not going to be helpful in performing this type of search (finding links), which seems odd!?
Can anyone show me a way that I can do it? Or should I abandon the idea of using Examine and go with "node.descendants()"?
Gordon,
You can have the source: https://github.com/ismailmayat/cogitemusage
;-}
Thanks Ismail, I will take a look.
I did think about creating my own field to concatenate all of the possible content fields (due to the project history, there are LOTS of fields).
However, maybe I could do the processing in the GatheringNodeData event and basically create a new field containing a list of outward links (node IDs) for the node. Then the Examine search would be nice and easy :-)
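As a rough sketch of that link-extraction step (not taken from the cogitemusage source), the helper below pulls the node IDs out of RTE markup and joins them into a single field value. It assumes internal links are stored in the numeric form {localLink:1234}. The class name, the method names, and the space-separated output format are hypothetical choices for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LocalLinkExtractor
{
    // Assumption: internal links appear as {localLink:1234} with a numeric node ID.
    // Other formats (for example URL-encoded braces) are not handled here.
    private static readonly Regex LocalLink =
        new Regex(@"\{localLink:(\d{1,9})\}", RegexOptions.IgnoreCase);

    // Returns the distinct node IDs linked from the markup, in order of first appearance.
    public static IList<int> ExtractNodeIds(string html)
    {
        var ids = new List<int>();
        if (string.IsNullOrEmpty(html))
        {
            return ids;
        }

        var seen = new HashSet<int>();
        foreach (Match match in LocalLink.Matches(html))
        {
            int id = int.Parse(match.Groups[1].Value);
            if (seen.Add(id))
            {
                ids.Add(id);
            }
        }
        return ids;
    }

    // Joins the IDs with spaces so that an analyzer that splits on
    // whitespace can index each ID as a separate term.
    public static string ToFieldValue(IEnumerable<int> ids)
    {
        return string.Join(" ", ids);
    }
}
```

Inside a GatheringNodeData handler, the result of ToFieldValue could be stored under a new field name of your choosing, using the raw RTE content of the node as input. How the handler is wired up and which source fields to read depend on the Umbraco and Examine versions in use, so that part is left out here.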
Gordon,
This is what I did when I created that data type, and it works, albeit for a very old Umbraco version. In theory, you are looking to do something very similar.
Examine out of the box will not give you everything. However, you have extensibility points like GatheringNodeData and DocumentWriting, and this is where the magic happens: you can extend and bend Examine to your will!
Regards
Ismail