Clarification of which pages are included when link checker runs
Great tool. This is working really well for me. The output is easier to read than that of many similar tools. I like that I can specify the starting node. It would be nice to be able to set the number of pages to check (pageSize) in the UI.
I'm trying to be sure I understand how the link checker works.
It looks like it does not "crawl" the site but actually checks all of the pages in the site. So if I have some pages that are not visible (umbracoNaviHide = true), and there are no other pages that link to them, they will still be checked by this link checker. Is that correct?
Also, if a page is an existing page that is currently unpublished, will the link checker check that page?
If a page is a new page that has never been published, will the link checker check that page?
Hi Janet,
You are correct, the checker doesn't "crawl" the site in the traditional sense. Basically, I use the Umbraco API to generate a list of all published nodes in the site. For each node, the code makes an HTTP request to the page's URL, parses out all the links in the page, and then checks each one.
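To make the fetch, parse, and check steps concrete, here is a minimal Python sketch. It is not the package's actual code (which runs inside Umbraco); every name in it (`extract_links`, `http_status`, `fetch_html`, `check_pages`) is hypothetical, and it assumes the list of published page URLs has already been produced elsewhere. Treating a status of 400 or above, or no response at all, as "broken" is also an assumption.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return the absolute http(s) URLs linked from the given HTML."""
    collector = LinkCollector()
    collector.feed(html)
    links = [urljoin(base_url, href) for href in collector.hrefs]
    # Skip schemes that cannot be checked over HTTP (mailto:, tel:, ...).
    return [link for link in links if link.startswith(("http://", "https://"))]


def http_status(url):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=10) as response:
            return response.status
    except HTTPError as error:
        return error.code
    except OSError:
        return None


def fetch_html(url):
    """Download a page and return its body as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def check_pages(page_urls, fetch=fetch_html, status_of=http_status):
    """Map each page URL to the (link, status) pairs of its broken links."""
    report = {}
    for page_url in page_urls:
        broken = []
        for link in extract_links(fetch(page_url), page_url):
            status = status_of(link)
            # Assumption: no response or a 4xx/5xx status counts as broken.
            if status is None or status >= 400:
                broken.append((link, status))
        report[page_url] = broken
    return report
```

Because the starting point is a list of pages rather than a link graph, a page is checked whether or not anything links to it. The `fetch` and `status_of` parameters exist only so the sketch can be tried without network access.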
So, it will check all published pages, even if they have umbracoNaviHide = true or aren't linked to from anywhere else. It won't, though, check pages that are not published (though if it has been previously published it will check the current version). If a page has never been published it won't be checked.
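As a rough illustration of that selection rule, the following Python sketch filters a list of pages. The `Page` class and `pages_to_check` function are hypothetical stand-ins, not Umbraco API types, and the sketch only models the clear-cut cases: published pages are included regardless of umbracoNaviHide, and pages without a published version are skipped.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    is_published: bool
    navi_hide: bool = False  # stands in for umbracoNaviHide


def pages_to_check(pages):
    """Return the URLs of pages the checker would visit."""
    # navi_hide is deliberately ignored: hidden pages are still checked.
    return [page.url for page in pages if page.is_published]


site = [
    Page("/home", is_published=True),
    Page("/hidden", is_published=True, navi_hide=True),
    Page("/draft", is_published=False),  # never published, so skipped
]
print(pages_to_check(site))
```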
Hope that answers your question?
That makes sense. Thank you for the explanation.