Hi,
We have a site with the following setup across three servers:
edit.mysite.com (private access only)
www.mysite.com (public access, two web fronts)
We have an SSL cert installed in the load balancer and forward the traffic to the internal web servers on port 80.
When resolving canonical/absolute urls we have a problem.
On the start page of the site, the absolute URL contains the internal IP (10.10.10.10/) of the web front we are accessing, but when we access a subpage we get the correct absolute URL (www.mysite.com/asdf). If we set a hostname on the site in Umbraco, we only get absolute URLs and no relative ones.
As the traffic is https to the load balancer and port 80 internally, we also get all the absolute URLs as http. What's the best practice in this scenario for absolute URLs and media URLs? Do we use the umbracoUseSSL=true setting?
Regards Martin
Hi Martin,
We had much the same problem with a website we released not so long ago. Our load balancer receives the traffic via https but passes it on to the website via http, so the website did not know which scheme to use. Our solution was a custom default url provider: we removed the default Umbraco url provider (https://github.com/umbraco/Umbraco-CMS/blob/75c2b07ad3a093b5b65b6ebd45697687c062f62a/src/Umbraco.Web/Routing/DefaultUrlProvider.cs) and added a custom one with almost exactly the same code, except that it determines the scheme from the specific headers load balancers usually send.
The way we currently determine if a connection is secure:
bool isHttps = false;
if (!String.IsNullOrEmpty(request.ServerVariables["HTTP_X_FORWARDED_PROTO"])) // De facto standard way for the load balancer to let us know this is https
    isHttps = request.ServerVariables["HTTP_X_FORWARDED_PROTO"].ToLower() == "https"; // this says "https"
else if (!String.IsNullOrEmpty(request.Headers.Get("Front-End-Https"))) // Some other load balancers use this key
    isHttps = request.Headers.Get("Front-End-Https").ToLower() == "on"; // this one says "on" (casing varies, so compare case-insensitively)
else if (!String.IsNullOrEmpty(request.Headers.Get("X-Forwarded-Proto"))) // Same header as the server variable above, read directly
    isHttps = request.Headers.Get("X-Forwarded-Proto").ToLower() == "https"; // this one says "https"
else if (!String.IsNullOrEmpty(request.Headers.Get("x-https-session"))) // Some other load balancers use this key
    isHttps = request.Headers.Get("x-https-session").ToLower() == "yes"; // this one says "yes"
else // Default .NET logic
    isHttps = request.IsSecureConnection;
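In your global.asax, remove the default url provider and then add the custom url provider to the list, before another provider.
The complete code for our url provider is: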
// Namespaces below assume Umbraco 7.x
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using Umbraco.Core.Configuration;
using Umbraco.Core.Configuration.UmbracoSettings;
using Umbraco.Core.Logging;
using Umbraco.Web;
using Umbraco.Web.Routing;

/// <summary>
/// Provides urls.
/// </summary>
public class CustomDefaultUrlProvider : IUrlProvider
{
    private readonly IRequestHandlerSection _requestSettings;

    [Obsolete("Use the ctor that specifies the IRequestHandlerSection")]
    public CustomDefaultUrlProvider()
        : this(UmbracoConfig.For.UmbracoSettings().RequestHandler)
    {
    }

    public CustomDefaultUrlProvider(IRequestHandlerSection requestSettings)
    {
        _requestSettings = requestSettings;
    }

    #region GetUrl

    /// <summary>
    /// Gets the nice url of a published content.
    /// </summary>
    /// <param name="umbracoContext">The Umbraco context.</param>
    /// <param name="id">The published content id.</param>
    /// <param name="current">The current absolute url.</param>
    /// <param name="mode">The url mode.</param>
    /// <returns>The url for the published content.</returns>
    /// <remarks>
    /// <para>The url is absolute or relative depending on <c>mode</c> and on <c>current</c>.</para>
    /// <para>If the provider is unable to provide a url, it should return <c>null</c>.</para>
    /// </remarks>
    public virtual string GetUrl(UmbracoContext umbracoContext, int id, Uri current, UrlProviderMode mode)
    {
        if (!current.IsAbsoluteUri)
            throw new ArgumentException("Current url must be absolute.", "current");
        // will not use cache if previewing
        var route = umbracoContext.ContentCache.GetRouteById(id);
        if (string.IsNullOrWhiteSpace(route))
        {
            LogHelper.Debug<CustomDefaultUrlProvider>(
                "Couldn't find any page with nodeId={0}. This is most likely caused by the page not being published.",
                () => id);
            return null;
        }
        var domainHelper = new CustomDomainHelper(umbracoContext.Application.Services.DomainService);
        // extract domainUri and path
        // route is /<path> or <domainRootId>/<path>
        var pos = route.IndexOf('/');
        var path = pos == 0 ? route : route.Substring(pos);
        var domainUri = pos == 0
            ? null
            : domainHelper.DomainForNode(int.Parse(route.Substring(0, pos)), current, DetermineScheme(current, umbracoContext.HttpContext.Request));
        // assemble the url from domainUri (maybe null) and path
        return AssembleUrl(domainUri, path, current, mode).ToString();
    }

    #endregion

    #region GetOtherUrls

    /// <summary>
    /// Gets the other urls of a published content.
    /// </summary>
    /// <param name="umbracoContext">The Umbraco context.</param>
    /// <param name="id">The published content id.</param>
    /// <param name="current">The current absolute url.</param>
    /// <returns>The other urls for the published content.</returns>
    /// <remarks>
    /// <para>Other urls are those that <c>GetUrl</c> would not return in the current context, but would be valid
    /// urls for the node in other contexts (different domain for current request, umbracoUrlAlias...).</para>
    /// </remarks>
    public virtual IEnumerable<string> GetOtherUrls(UmbracoContext umbracoContext, int id, Uri current)
    {
        // will not use cache if previewing
        var route = umbracoContext.ContentCache.GetRouteById(id);
        if (string.IsNullOrWhiteSpace(route))
        {
            LogHelper.Debug<CustomDefaultUrlProvider>(
                "Couldn't find any page with nodeId={0}. This is most likely caused by the page not being published.",
                () => id);
            return null;
        }
        var domainHelper = new CustomDomainHelper(umbracoContext.Application.Services.DomainService);
        // extract domainUri and path
        // route is /<path> or <domainRootId>/<path>
        var pos = route.IndexOf('/');
        var path = pos == 0 ? route : route.Substring(pos);
        var domainUris = pos == 0
            ? null
            : domainHelper.DomainsForNode(int.Parse(route.Substring(0, pos)), current, true /* default */, DetermineScheme(current, umbracoContext.HttpContext.Request));
        // assemble the alternate urls from domainUris (maybe empty) and path
        return AssembleUrls(domainUris, path).Select(uri => uri.ToString());
    }

    #endregion

    #region Utilities

    Uri AssembleUrl(DomainAndUri domainUri, string path, Uri current, UrlProviderMode mode)
    {
        Uri uri;
        // ignore vdir at that point, UriFromUmbraco will do it
        if (mode == UrlProviderMode.AutoLegacy)
        {
            mode = _requestSettings.UseDomainPrefixes
                ? UrlProviderMode.Absolute
                : UrlProviderMode.Auto;
        }
        if (domainUri == null) // no domain was found
        {
            if (current == null)
                mode = UrlProviderMode.Relative; // best we can do
            switch (mode)
            {
                case UrlProviderMode.Absolute:
                    uri = new Uri(current.GetLeftPart(UriPartial.Authority) + path);
                    break;
                case UrlProviderMode.Relative:
                case UrlProviderMode.Auto:
                    uri = new Uri(path, UriKind.Relative);
                    break;
                default:
                    throw new ArgumentOutOfRangeException("mode");
            }
        }
        else // a domain was found
        {
            if (mode == UrlProviderMode.Auto)
            {
                if (current != null && domainUri.Uri.GetLeftPart(UriPartial.Authority) == current.GetLeftPart(UriPartial.Authority))
                    mode = UrlProviderMode.Relative;
                else
                    mode = UrlProviderMode.Absolute;
            }
            switch (mode)
            {
                case UrlProviderMode.Absolute:
                    uri = new Uri(CombinePaths(domainUri.Uri.GetLeftPart(UriPartial.Path), path));
                    break;
                case UrlProviderMode.Relative:
                    uri = new Uri(CombinePaths(domainUri.Uri.AbsolutePath, path), UriKind.Relative);
                    break;
                default:
                    throw new ArgumentOutOfRangeException("mode");
            }
        }
        // UriFromUmbraco will handle vdir
        // meaning it will add vdir into domain urls too!
        return UriUtility.UriFromUmbraco(uri);
    }

    string CombinePaths(string path1, string path2)
    {
        string path = path1.TrimEnd('/') + path2;
        return path == "/" ? path : path.TrimEnd('/');
    }

    // always build absolute urls unless we really cannot
    IEnumerable<Uri> AssembleUrls(IEnumerable<DomainAndUri> domainUris, string path)
    {
        // no domain == no "other" url
        if (domainUris == null)
            return Enumerable.Empty<Uri>();
        // if no domain was found and then we have no "other" url
        // else return absolute urls, ignoring vdir at that point
        var uris = domainUris.Select(domainUri => new Uri(CombinePaths(domainUri.Uri.GetLeftPart(UriPartial.Path), path)));
        // UriFromUmbraco will handle vdir
        // meaning it will add vdir into domain urls too!
        return uris.Select(UriUtility.UriFromUmbraco);
    }

    #endregion

    internal string DetermineScheme(Uri current, HttpRequestBase request)
    {
        bool isHttps = false;
        if (!String.IsNullOrEmpty(request.ServerVariables["HTTP_X_FORWARDED_PROTO"])) // De facto standard way for the load balancer to let us know this is https
            isHttps = request.ServerVariables["HTTP_X_FORWARDED_PROTO"].ToLower() == "https"; // this says "https"
        else if (!String.IsNullOrEmpty(request.Headers.Get("Front-End-Https"))) // Some other load balancers use this key
            isHttps = request.Headers.Get("Front-End-Https").ToLower() == "on"; // this one says "on" (casing varies, so compare case-insensitively)
        else if (!String.IsNullOrEmpty(request.Headers.Get("X-Forwarded-Proto"))) // Same header as the server variable above, read directly
            isHttps = request.Headers.Get("X-Forwarded-Proto").ToLower() == "https"; // this one says "https"
        else if (!String.IsNullOrEmpty(request.Headers.Get("x-https-session"))) // Some other load balancers use this key
            isHttps = request.Headers.Get("x-https-session").ToLower() == "yes"; // this one says "yes"
        else // Default .NET logic
            isHttps = request.IsSecureConnection;
        // A forwarded https request always wins, even when there is no current url
        if (isHttps)
            return Uri.UriSchemeHttps;
        // Otherwise use the scheme of the current url, or http as the default
        return current != null ? current.Scheme : Uri.UriSchemeHttp;
    }
}
And the domain helper (which also needs some updates):
// Namespaces below assume Umbraco 7.x
using System;
using System.Collections.Generic;
using System.Linq;
using Umbraco.Core;
using Umbraco.Core.Models;
using Umbraco.Core.Services;
using Umbraco.Web.Routing;

/// <summary>
/// Provides utilities to handle domains.
/// </summary>
public class CustomDomainHelper
{
    private readonly IDomainService _domainService;

    [Obsolete("Use the constructor specifying all dependencies instead")]
    public CustomDomainHelper()
        : this(ApplicationContext.Current.Services.DomainService)
    {
    }

    public CustomDomainHelper(IDomainService domainService)
    {
        _domainService = domainService;
    }

    #region Domain for Node

    /// <summary>
    /// Finds the domain for the specified node, if any, that best matches a specified uri.
    /// </summary>
    /// <param name="nodeId">The node identifier.</param>
    /// <param name="current">The uri, or null.</param>
    /// <param name="overrideSchema">A scheme that replaces the scheme of <paramref name="current"/>, or null.</param>
    /// <returns>The domain and its uri, if any, that best matches the specified uri, else null.</returns>
    /// <remarks>If at least a domain is set on the node then the method returns the domain that
    /// best matches the specified uri, else it returns null.</remarks>
    internal DomainAndUri DomainForNode(int nodeId, Uri current, string overrideSchema = null)
    {
        // be safe
        if (nodeId <= 0)
            return null;
        // get the domains on that node
        var domains = _domainService.GetAssignedDomains(nodeId, false).ToArray();
        // none?
        if (domains.Any() == false)
            return null;
        // else filter
        var helper = SiteDomainHelperResolver.Current.Helper;
        var domainAndUri = DomainForUri(domains, current, domainAndUris => helper.MapDomain(current, domainAndUris), overrideSchema);
        if (domainAndUri == null)
            throw new Exception("DomainForUri returned null.");
        return domainAndUri;
    }

    /// <summary>
    /// Gets a value indicating whether a specified node has domains.
    /// </summary>
    /// <param name="nodeId">The node identifier.</param>
    /// <returns>True if the node has domains, else false.</returns>
    internal bool NodeHasDomains(int nodeId)
    {
        return nodeId > 0 && _domainService.GetAssignedDomains(nodeId, false).Any();
    }

    /// <summary>
    /// Find the domains for the specified node, if any, that match a specified uri.
    /// </summary>
    /// <param name="nodeId">The node identifier.</param>
    /// <param name="current">The uri, or null.</param>
    /// <param name="excludeDefault">A value indicating whether to exclude the current/default domain. True by default.</param>
    /// <param name="overrideSchema">A scheme that replaces the scheme of <paramref name="current"/>, or null.</param>
    /// <returns>The domains and their uris, that match the specified uri, else null.</returns>
    /// <remarks>If at least a domain is set on the node then the method returns the domains that
    /// best match the specified uri, else it returns null.</remarks>
    internal IEnumerable<DomainAndUri> DomainsForNode(int nodeId, Uri current, bool excludeDefault = true, string overrideSchema = null)
    {
        // be safe
        if (nodeId <= 0)
            return null;
        // get the domains on that node
        var domains = _domainService.GetAssignedDomains(nodeId, false).ToArray();
        // none?
        if (domains.Any() == false)
            return null;
        // get the domains and their uris
        var domainAndUris = DomainsForUri(domains, current, overrideSchema).ToArray();
        // filter
        var helper = SiteDomainHelperResolver.Current.Helper;
        return helper.MapDomains(current, domainAndUris, excludeDefault).ToArray();
    }

    #endregion

    #region Domain for Uri

    /// <summary>
    /// Finds the domain that best matches a specified uri, into a group of domains.
    /// </summary>
    /// <param name="domains">The group of domains.</param>
    /// <param name="current">The uri, or null.</param>
    /// <param name="filter">A function to filter the list of domains, if more than one applies, or <c>null</c>.</param>
    /// <param name="overrideSchema">A scheme that replaces the scheme of <paramref name="current"/>, or null.</param>
    /// <returns>The domain and its normalized uri, that best matches the specified uri.</returns>
    /// <remarks>
    /// <para>If more than one domain matches, then the <paramref name="filter"/> function is used to pick
    /// the right one, unless it is <c>null</c>, in which case the method returns <c>null</c>.</para>
    /// <para>The filter, if any, will be called only with a non-empty argument, and _must_ return something.</para>
    /// </remarks>
    internal static DomainAndUri DomainForUri(IEnumerable<IDomain> domains, Uri current, Func<DomainAndUri[], DomainAndUri> filter = null, string overrideSchema = null)
    {
        // sanitize the list to have proper uris for comparison (scheme, path end with /)
        // we need to end with / because example.com/foo cannot match example.com/foobar
        // we need to order so example.com/foo matches before example.com/
        var scheme = current == null ? Uri.UriSchemeHttp : current.Scheme;
        // TH 18-02-2016: Override if set
        if (!String.IsNullOrEmpty(overrideSchema))
            scheme = overrideSchema;
        var domainsAndUris = domains
            .Where(d => d.IsWildcard == false)
            .Select(SanitizeForBackwardCompatibility)
            .Select(d => new DomainAndUri(d, scheme))
            .OrderByDescending(d => d.Uri.ToString())
            .ToArray();
        if (domainsAndUris.Any() == false)
            return null;
        DomainAndUri domainAndUri;
        if (current == null)
        {
            // take the first one by default (what else can we do?)
            domainAndUri = domainsAndUris.First(); // .First() protected by .Any() above
        }
        else
        {
            // look for the first domain that would be the base of the current url
            // ie current is www.example.com/foo/bar, look for domain www.example.com
            var currentWithSlash = current.EndPathWithSlash();
            domainAndUri = domainsAndUris
                .FirstOrDefault(d => d.Uri.EndPathWithSlash().IsBaseOf(currentWithSlash));
            if (domainAndUri != null) return domainAndUri;
            // if none matches, try again without the port
            // ie current is www.example.com:1234/foo/bar, look for domain www.example.com
            domainAndUri = domainsAndUris
                .FirstOrDefault(d => d.Uri.EndPathWithSlash().IsBaseOf(currentWithSlash.WithoutPort()));
            if (domainAndUri != null) return domainAndUri;
            // if none matches, then try to run the filter to pick a domain
            if (filter != null)
            {
                domainAndUri = filter(domainsAndUris);
                // if still nothing, pick the first one?
                // no: move that constraint to the filter, but check
                if (domainAndUri == null)
                    throw new InvalidOperationException("The filter returned null.");
            }
        }
        return domainAndUri;
    }

    /// <summary>
    /// Gets the domains that match a specified uri, into a group of domains.
    /// </summary>
    /// <param name="domains">The group of domains.</param>
    /// <param name="current">The uri, or null.</param>
    /// <param name="overrideSchema">A scheme that replaces the scheme of <paramref name="current"/>, or null.</param>
    /// <returns>The domains and their normalized uris, that match the specified uri.</returns>
    internal static IEnumerable<DomainAndUri> DomainsForUri(IEnumerable<IDomain> domains, Uri current, string overrideSchema = null)
    {
        var scheme = current == null ? Uri.UriSchemeHttp : current.Scheme;
        // TH 18-02-2016: Override if set
        if (!String.IsNullOrEmpty(overrideSchema))
            scheme = overrideSchema;
        return domains
            .Where(d => d.IsWildcard == false)
            .Select(SanitizeForBackwardCompatibility)
            .Select(d => new DomainAndUri(d, scheme))
            .OrderByDescending(d => d.Uri.ToString());
    }

    #endregion

    #region Utilities

    /// <summary>
    /// Sanitize a Domain.
    /// </summary>
    /// <param name="domain">The Domain to sanitize.</param>
    /// <returns>The sanitized domain.</returns>
    /// <remarks>This is a _really_ nasty one that should be removed at some point. Some people were
    /// using hostnames such as "/en" which happened to work pre-4.10 but really make no sense at
    /// all... and 4.10 throws on them, so here we just try to find a way so 4.11 does not throw.
    /// But really... no.</remarks>
    private static IDomain SanitizeForBackwardCompatibility(IDomain domain)
    {
        var context = System.Web.HttpContext.Current;
        if (context != null && domain.DomainName.StartsWith("/"))
        {
            // turn "/en" into "http://whatever.com/en" so it becomes a parseable uri
            var authority = context.Request.Url.GetLeftPart(UriPartial.Authority);
            domain.DomainName = authority + domain.DomainName;
        }
        return domain;
    }

    /// <summary>
    /// Gets a value indicating whether there is another domain defined down in the path to a node under the current domain's root node.
    /// </summary>
    /// <param name="domains">The domains.</param>
    /// <param name="path">The path to a node under the current domain's root node eg '-1,1234,5678'.</param>
    /// <param name="rootNodeId">The current domain root node identifier, or null.</param>
    /// <returns>A value indicating if there is another domain defined down in the path.</returns>
    /// <remarks>Looks _under_ rootNodeId but not _at_ rootNodeId.</remarks>
    internal static bool ExistsDomainInPath(IEnumerable<IDomain> domains, string path, int? rootNodeId)
    {
        return FindDomainInPath(domains, path, rootNodeId) != null;
    }

    /// <summary>
    /// Gets the deepest non-wildcard Domain, if any, from a group of Domains, in a node path.
    /// </summary>
    /// <param name="domains">The domains.</param>
    /// <param name="path">The node path eg '-1,1234,5678'.</param>
    /// <param name="rootNodeId">The current domain root node identifier, or null.</param>
    /// <returns>The deepest non-wildcard Domain in the path, or null.</returns>
    /// <remarks>Looks _under_ rootNodeId but not _at_ rootNodeId.</remarks>
    internal static IDomain FindDomainInPath(IEnumerable<IDomain> domains, string path, int? rootNodeId)
    {
        var stopNodeId = rootNodeId ?? -1;
        return path.Split(',')
            .Reverse()
            .Select(int.Parse)
            .TakeWhile(id => id != stopNodeId)
            .Select(id => domains.FirstOrDefault(d => d.RootContentId == id && d.IsWildcard == false))
            .SkipWhile(domain => domain == null)
            .FirstOrDefault();
    }

    /// <summary>
    /// Gets the deepest wildcard Domain, if any, from a group of Domains, in a node path.
    /// </summary>
    /// <param name="domains">The domains.</param>
    /// <param name="path">The node path eg '-1,1234,5678'.</param>
    /// <param name="rootNodeId">The current domain root node identifier, or null.</param>
    /// <returns>The deepest wildcard Domain in the path, or null.</returns>
    /// <remarks>Looks _under_ rootNodeId but not _at_ rootNodeId.</remarks>
    internal static IDomain FindWildcardDomainInPath(IEnumerable<IDomain> domains, string path, int? rootNodeId)
    {
        var stopNodeId = rootNodeId ?? -1;
        return path.Split(',')
            .Reverse()
            .Select(int.Parse)
            .TakeWhile(id => id != stopNodeId)
            .Select(id => domains.FirstOrDefault(d => d.RootContentId == id && d.IsWildcard))
            .FirstOrDefault(domain => domain != null);
    }

    /// <summary>
    /// Returns the part of a path relative to the uri of a domain.
    /// </summary>
    /// <param name="domainUri">The normalized uri of the domain.</param>
    /// <param name="path">The full path of the uri.</param>
    /// <returns>The path part relative to the uri of the domain.</returns>
    /// <remarks>Eg the relative part of <c>/foo/bar/nil</c> to domain <c>example.com/foo</c> is <c>/bar/nil</c>.</remarks>
    public static string PathRelativeToDomain(Uri domainUri, string path)
    {
        return path.Substring(domainUri.AbsolutePath.Length).EnsureStartsWith('/');
    }

    #endregion
}
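For the registration step mentioned earlier (removing the default url provider and adding the custom one), a minimal sketch could look like the following. It assumes Umbraco 7's UrlProviderResolver and uses an ApplicationEventHandler rather than global.asax; the class name UrlProviderRegistration is made up for illustration, and the order of the two calls is a choice of this sketch, not necessarily what we run in production.

```csharp
using Umbraco.Core;
using Umbraco.Web.Routing;

// Hypothetical class name. Umbraco 7 picks up ApplicationEventHandler subclasses at startup.
public class UrlProviderRegistration : ApplicationEventHandler
{
    // Resolvers can only be changed while the application is starting,
    // so the swap has to happen here and not later in the pipeline.
    protected override void ApplicationStarting(
        UmbracoApplicationBase umbracoApplication,
        ApplicationContext applicationContext)
    {
        // Put the custom provider in the position of the default one...
        UrlProviderResolver.Current.InsertTypeBefore<DefaultUrlProvider, CustomDefaultUrlProvider>();
        // ...then remove the default provider so it no longer produces http urls.
        UrlProviderResolver.Current.RemoveType<DefaultUrlProvider>();
    }
}
```

Inserting before the default provider first and removing it afterwards keeps the custom provider in the same position in the list, so any other registered providers keep their relative order.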
Hope this helps!
Ps. If you cache the page output and someone visits the website directly (skipping the load balancer), the URLs on the cached page will have "http" instead of "https" for all users, regardless of whether they came through the load balancer. Keep this in mind while testing :-).
Ps2. umbracoUseSSL only works for the backend.
Ps3. Don't put a scheme in the Umbraco hostname section of the node; just use www.domain.com.
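On the caching caveat in the first Ps: if the cached output comes from ASP.NET MVC output caching, one possible way to avoid mixing http and https copies is to vary the cache on the forwarded-scheme header. This is only a sketch under that assumption. The controller name is hypothetical, and it presumes the load balancer sends X-Forwarded-Proto.

```csharp
using System.Web.Mvc;

// Hypothetical controller; only the OutputCache settings matter here.
public class CachedPageController : Controller
{
    // Cache for 60 seconds, but keep a separate cached copy per X-Forwarded-Proto value,
    // so a direct http request and a forwarded https request do not share the same output.
    [OutputCache(Duration = 60, VaryByParam = "none", VaryByHeader = "X-Forwarded-Proto")]
    public ActionResult Index()
    {
        return View();
    }
}
```

Requests that skip the load balancer carry no X-Forwarded-Proto header, so they get their own cache entry instead of overwriting the https one.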
Load balanced environment problems
Hi Timo,
Wow, thanks. I will give this a try and report back when I've had the chance to deploy it :)
Thanks for all the input!
/Martin
Hey,
This seems to work really well for me. Is there a reason you didn't submit it to core?
Definitely worth doing.
Greg