https: prefix and .aspx suffix corrupting and breaking otherwise valid links
System: Umbraco version 4.7.1, Windows Server 2008 (Service Pack 1, 64-bit OS), 8 GB memory, 4 Xeon 2.27 GHz processors, IIS version 7.5.7600.16385, SQL Server 2008 R2. Stack trace: N/A
We have a large number of websites under a single Umbraco instance. Some are https:, but most are simply http: sites. We have a nasty problem which we suspect is a bug in Umbraco.
(A similar issue w/https: prefixing was discussed at http://our.umbraco.org/forum/ourumb-dev-forum/bugs/26863-Changes-automatically-to-httpS-after-publish but our issue is different.)
Http: links on our sites are periodically resolved incorrectly by Umbraco, as shown below, which breaks parts of our website and frustrates users. We have had this problem ever since we moved to 4.7.1:
http://bariatric.surgery.ucsf.edu/ is translated as https://bariatric.surgery.ucsf.edu.aspx/ for no apparent reason. This of course causes the link to fail.
http://bariatric.surgery.ucsf.edu/about-us.aspx is translated as https://bariatric.surgery.ucsf.edu/about-us.aspx
In this case, the link works because it was intended to have .aspx, but many browsers show mixed-content and "content not secure" warnings because the underlying page has insecure content.
We are completely frustrated w/this problem, have probed error logs and have no idea why this is happening. We have created some brute force workarounds on a few popup menus laden w/a huge number of links, but the problem persists.
When we republish our global settings file, the problem often resolves, but we cannot be doing this all day and there is no way to trap for this problem.
When Doug Robar, an Umbraco MVP, was working on this problem for us as a consultant, he devised a workaround that addressed this issue in a single heavily used pop-up menu on the site. I am reproducing the email transcript of our discussion below in the hopes it may shed light on the problem. I call this a "brute force" solution because it treats the symptoms but not the underlying problem. The issue still wreaks havoc periodically on other parts of our site.
From: Douglas Robar
Sent: Tuesday, August 21, 2012 2:23 AM
To: Tam, Raymond; Mayfield, Rob (HyperArts)
Cc: Barg, Richard
Subject: Re: How to monitor for XSLT and https/aspx errors
If I understand correctly, your monitoring system requests a url and then you search for the xslt error's text string in the html source returned from the request. Is it that simple? It certainly could be and would be effective.
If that's how you do it we can add some checks for the https:// and .aspx links in the Find A Program popup window too. The html markup for the Find A Program window is already present on every page and not loaded via javascript per se. Rather, when you click the link on the top brown menu bar a bit of javascript displays the popup window. The window is actually always there; you just can't see it.
Which means... you can search the html markup for anything from this bit of markup (taken from the site's homepage but I don't think it matters)
id="findAProgramLabSystemWide" style="display:block;width:700px;" xmlns:user="urn:my-scripts">
id="findAProgramCloseButton">href="javascript:doFindaProgramClose()">
src="/img/FindAProgramPopoutCloseButton.png" border="0" />
class="findAProgramHeaderSystemWide" style="">
class="findAProgramColumnDiv" style="">Department of Surgery Websites
class='findAProgramPopoutUL'>
class='findAProgramPopoutUL'>
As you can see, there are https:// and .aspx links here, which is a problem and someone should be alerted to correct it. The non-subtle approach would be to search for
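A more subtle approach would be to get the innerHtml of the tag with class="findAProgramHeaderSystemWide" and then look for any link tags whose hrefs have https and .aspx at the end. But I suspect the simple item above is sufficient for this interim period.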
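To make the monitoring idea concrete, here is a minimal sketch of such a check in C#. It is not part of the email thread: the class and method names are hypothetical, and it assumes the page being checked is a plain http: site, so any https:// link in the popup markup is treated as suspect. Rather than flagging every .aspx link (some are legitimate, such as about-us.aspx), it flags only links whose host name ends in .aspx.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for a monitoring job; not part of Umbraco.
public static class FindAProgramLinkCheck
{
    // Matches href="..." or href='...' and captures the URL.
    private static readonly Regex HrefPattern = new Regex(
        "href\\s*=\\s*[\"']([^\"']+)[\"']", RegexOptions.IgnoreCase);

    // Returns the hrefs that look corrupted: an https scheme
    // (assumption: the site is http-only) or a host ending in ".aspx".
    public static List<string> FindSuspectLinks(string html)
    {
        var suspects = new List<string>();
        foreach (Match match in HrefPattern.Matches(html))
        {
            string href = match.Groups[1].Value;
            Uri uri;
            if (!Uri.TryCreate(href, UriKind.Absolute, out uri)) continue;

            // Ignore javascript:, file paths and anything else that is not a web link.
            bool isHttp = uri.Scheme == Uri.UriSchemeHttp;
            bool isHttps = uri.Scheme == Uri.UriSchemeHttps;
            if (!isHttp && !isHttps) continue;

            bool hostEndsInAspx = uri.Host.EndsWith(".aspx", StringComparison.OrdinalIgnoreCase);
            if (isHttps || hostEndsInAspx) suspects.Add(href);
        }
        return suspects;
    }
}
```

A monitoring job could pass in the html it already downloads and raise an alert whenever the returned list is not empty. A site that has legitimate https: links in the popup would need an allow-list on top of this.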
Another option would be to fix any https:// and .aspx links in the Find A Program markup before it ever gets sent to the visitor. This doesn't solve the underlying problem of the wrong URLs being generated by Umbraco but it should take care of the symptoms and give more breathing room for the site to have a higher quality of life for the interim period.
As best as I can tell from a quick look a simple addition to the code of the GetFindAProgramDropdown.xslt file can look for https:// urls and fix them before they are sent to the website visitor. In particular, I would suggest changing this code:
public void setLocalProgramLabSite(string str)
{
    string ret = "";
    if (str == null || str == "") return;
    string[] vals = str.Split(',');
    foreach (string val in vals)
    {
        int iVal = getIntFromString(val);
        if (iVal > 0)
        {
            string url = library.NiceUrlFullPathUCSF(iVal);
            XPathNodeIterator currentNode = library.GetXmlNodeById(val);
            currentNode.MoveNext();
            XPathNavigator clone = currentNode.Current;
            string nodeName = clone.GetAttribute("nodeName", "");
            addToMenu(nodeName, url, 1);
        }
    }
}
to:
public void setLocalProgramLabSite(string str)
{
    string ret = "";
    if (str == null || str == "") return;
    string[] vals = str.Split(',');
    foreach (string val in vals)
    {
        int iVal = getIntFromString(val);
        if (iVal > 0)
        {
            string url = umbraco.library.NiceUrlWithDomain(iVal);
            // DJR: 21-Aug-2012 :: forcefully remove any errant https:// and .aspx portions of the url
            if (url.Contains("https://"))
            {
                url = url.Replace("https://", "http://");
                url = url.Replace(".aspx", "");
            }
            XPathNodeIterator currentNode = library.GetXmlNodeById(val);
            currentNode.MoveNext();
            XPathNavigator clone = currentNode.Current;
            string nodeName = clone.GetAttribute("nodeName", "");
            addToMenu(nodeName, url, 1);
        }
    }
}
There may be other macros used to produce the same/similar output for the site that might need a similar modification. Or if this macro is used for more than the Find A Program popup window it may need some further nuancing of the condition logic in the "if()" statement to not remove .aspx when it should actually be there. You'll know better than I if this is an appropriate interim solution, if it needs to be more nuanced, and if similar changes might need to be put in other macros.
You'll also note that I changed out the custom library call with the built-in umbraco call to remove that dependency on unused custom code.
cheers,
doug.
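Doug's caveat about not removing .aspx when it should actually be there could be handled along the following lines. This is only a sketch under our own assumptions, not code from the site: the class name, method name and the secureHosts parameter are hypothetical. It strips .aspx only when it has been glued onto the host name, leaves page names such as about-us.aspx alone, and switches https back to http unless the host is in a caller-supplied list of sites that really are secure.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; not part of Umbraco or the UCSF code base.
public static class ErrantUrlFixer
{
    private const string AspxSuffix = ".aspx";

    // secureHosts: lower-case host names that are legitimately served over https (may be null).
    public static string Fix(string url, ICollection<string> secureHosts)
    {
        Uri uri;
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri)) return url;
        bool isHttps = uri.Scheme == Uri.UriSchemeHttps;
        if (!isHttps && uri.Scheme != Uri.UriSchemeHttp) return url;

        // Remove ".aspx" only when it was appended to the host name.
        string host = uri.Host;
        if (host.EndsWith(AspxSuffix, StringComparison.OrdinalIgnoreCase))
            host = host.Substring(0, host.Length - AspxSuffix.Length);

        bool keepHttps = isHttps && secureHosts != null && secureHosts.Contains(host);
        bool schemeIsFine = !isHttps || keepHttps;
        if (schemeIsFine && host == uri.Host) return url; // nothing to repair

        var builder = new UriBuilder(uri)
        {
            Scheme = keepHttps ? Uri.UriSchemeHttps : Uri.UriSchemeHttp,
            Host = host,
            // Drop the default port so an https default of 443 is not carried over to http.
            Port = uri.IsDefaultPort ? -1 : uri.Port
        };
        return builder.Uri.AbsoluteUri;
    }
}
```

In the macro this would replace the Contains/Replace block, for example url = ErrantUrlFixer.Fix(url, null); when every site in the menu is http-only.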
This is a reply to Doug Robar with thoughts on why this might be happening specifically in 4.7.1:
From: [email protected] [mailto:[email protected]]
Sent: Tuesday, August 21, 2012 9:25 AM
To: Douglas Robar
Cc: Tam, Raymond; Barg, Richard
Subject: Re: How to monitor for XSLT and https/aspx errors
Your code suggestion for the XSLT makes sense as a temp fix.
Actually I am beginning to wonder if niceURLWithDomain is a problem. This was added in 4.7 (or maybe 4.5) and works like my NiceURLFullPathUCSF, which is why I just call niceURLWithDomain from NiceURLFullPathUCSF. Outside of internal Umbraco code changes, that is the only change from UCSF's point of view.
There appears to be another problem, with Umbraco calling an XSLT with an incorrect path on occasion, which is usually the first symptom of the system going south.
This is an email exchange with another developer experiencing a similar problem. Again no solution - just an exchange of ideas on why this is occurring:
Hi Jared,
Rob Mayfield here, I am the developer for the UCSF Umbraco sites. What I did as a temporary fix for individual pages was to force pages that are supposed to be HTTP but show up as HTTPS back to HTTP with the following rewrite code in the ‘Page_Load’ event:
if (Request.IsSecureConnection)
{
    string redirectUrl = Request.Url.ToString().Replace("https:", "http:");
    Response.Redirect(redirectUrl);
}
I do the opposite for HTTPS pages.
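The two directions of that redirect can be expressed as one small helper. This is a hedged sketch rather than the code running on the site: the class and method names are hypothetical, and the pageShouldBeSecure flag is assumed to come from wherever the page records whether it is meant to be HTTPS.

```csharp
using System;

// Hypothetical helper; the decision logic is kept free of ASP.NET types so it is easy to test.
public static class SchemeRedirect
{
    // Returns the URL to redirect to, or null when the request already uses the intended scheme.
    public static string GetRedirectUrl(Uri requestUrl, bool pageShouldBeSecure)
    {
        bool isSecure = requestUrl.Scheme == Uri.UriSchemeHttps;
        if (isSecure == pageShouldBeSecure) return null; // no redirect, so no redirect loop

        var builder = new UriBuilder(requestUrl)
        {
            Scheme = pageShouldBeSecure ? Uri.UriSchemeHttps : Uri.UriSchemeHttp,
            // Drop the default port so 443 is not carried over to http (or 80 to https).
            Port = requestUrl.IsDefaultPort ? -1 : requestUrl.Port
        };
        return builder.Uri.AbsoluteUri;
    }
}
```

In Page_Load the result would be used as: string target = SchemeRedirect.GetRedirectUrl(Request.Url, false); if (target != null) Response.Redirect(target); One difference from the Replace call is that only the scheme of the request itself is changed, whereas Replace("https:", "http:") also rewrites any "https:" that happens to appear later in the URL, such as in a query string.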
Often, but not always, we see the problem (HTTP->HTTPS) just after someone has hit a secure HTTPS page and then someone comes along with an HTTP request - so caching may be the culprit.
I also put code in the menu XSLT which converts any menu item which has the incorrect ‘HTTPS’ signature back to HTTP and strips off any offending .aspx on the home pages. This is C# code inside the XSLT:
string s1 = umbraco.library.NiceUrlWithDomain(ival);
if (s1.Contains("https://"))
{
    s1 = s1.Replace("https://", "http://");
    s1 = s1.Replace(".aspx", "");
}
This is a temporary solution but seems to help - your recycling the Application Pool is interesting and worked the one time I caught the problem on our Dev platform.
We are running Windows Server 2008 R2 - version 6.1 (Build 7601: Service Pack 1) - with IIS 7.5.7600.16385 - the Application Pool is using the ‘Integrated Pipeline’ - .NET Framework v4.0
Umbraco 4.7.1
Thank you for any insights you can provide.
Rob
From: Jared Smith [<mailto:[email protected]>]
Sent: Sunday, November 18, 2012 2:09 PM
To: Barg, Richard
Cc: Mayfield, Rob (HyperArts)
Subject: RE: Jared Smith??
Hi Richard,
Yes, sounds similar.
Ours are not adding .aspx - we are using directory urls and these seem to be okay so far. However they are switching to https. Like you say I am not sure this is triggered by publishing. It seems to be a bit more random.
One section of our site seems to be affected most, which makes me think it is related to macro caching. I have taken all the caching off macros, which seems to have helped. I haven’t checked this morning yet… This is also backed up by our solution, which is to recycle the app pool - this reverts all https pages back to http without any republishing.
We also have some valid https pages however these do not seem related to the issue.
The macro sounds interesting - can you give me some details?
Thanks