Rick Strahl's Weblog  


Routes, Extensionless Paths and UrlEncoding in ASP.NET



UrlEncoding in Web applications can be a pain, and .NET, with its various utilities that all behave slightly differently for various edge cases, doesn't make it any easier. I wrote about the pain of UrlEncoding in .NET before. The resolution of that previous post was that Uri.EscapeDataString() is as close as it gets to a best solution out of the box.

But even with that knowledge I ran into trouble again with this topic, this time with URL paths. As part of an update to an old WebForms application I added routing in order to provide cleaner URLs for accessing the product pages and categories. Here I'm not actually encoding query string parameters or post values, but instead encoding path segments on URL routes.

Essentially, I created extensionless URLs for a few select URLs of the application. When encoding extensionless URLs, extra care has to be taken to properly encode path strings, as paths are subject to special rules that determine how they are parsed by the Web server and ASP.NET.

Routing 101 in Web Forms

The process of adding routing features to an old Web Forms application is pretty straightforward, and a few MapPageRoute() calls make short work of it.

In this case I'm routing URLs in my Web Store by mapping out products and categories like this (fired off global.asax's Application_Init()):

// Specific Route mapping
routes.MapPageRoute("ProductPage", "product/{sku}", "~/item.aspx");
routes.MapPageRoute("ProductPageWithQty", "product/{sku}/{qty}", "~/item.aspx");
            
// List Views Routings
routes.MapPageRoute("ProductCategory", "products/{category}", "~/itemlist_abstract.aspx");
routes.MapPageRoute("ProductWithoutCategory", "products", "~/itemlist_abstract.aspx");

This turns urls like:

item.aspx?sku=ProductID

into

product/ProductId

and

itemlist.aspx?Category=Books

into

products/Books

Note that unlike MVC, there is no direct support for optional parameters in MapPageRoute(), so each path configuration requires its own explicit route config.

So far so good. This is nice and easy to accomplish even in a WebForms application. This is an old app so I only updated a few URLs that are the most commonly externally accessed and crawled links, but it would be easy enough to do most of the application links using a similar approach.

Capturing the Route Data in WebForm

Capturing the RouteData in the routed pages is also very easy to do. Previously the code captured only the query string; now it checks both the query string and the RouteData collection for the URL parameters. Here's the item SKU and QTY mapping logic:

void GetSkuAndQty()
{
    Sku = Request.QueryString["Sku"];
    if (string.IsNullOrEmpty(Sku))
    {
        Sku = RouteData.Values["sku"] as string;
    }
    string Qty = Request.QueryString["qty"];
    if (string.IsNullOrEmpty(Qty))
    {
        Qty = RouteData.Values["qty"] as string;
    }

    // redirect permanently to new url
    if (Request.Url.AbsoluteUri.Contains(".aspx"))
    {                
        if (!string.IsNullOrEmpty(Sku))
        {
            string newUrl = "~/product/" + Sku + "/" + Qty;
            Response.RedirectPermanent(ResolveUrl(newUrl));
        }
    }
}

From here the code is identical to the original code, using the Sku and Qty to load the Item business object and displaying the inventory item purchase UI.

Creating Route Links Manually

In many places the application generates the URLs that link to the product and category pages in code. It seems easy enough, using code like this (for the category list):

ItemListForm = ResolveUrl("~/products");

foreach (LineItem item in LineItems)
{
    sb.AppendFormat(@"<div class='menurow'><a class='menulink' href='{0}/{1}'>{2}</a></div>",
        ItemListForm,
        Uri.EscapeDataString(item.Category),
        HttpUtility.HtmlEncode(item.Category));
}

Note that I URL encode the category for the URL and HtmlEncode the category for the display text.

This produces Urls like:

products/Books and products/Development%20Tools. We're golden, right?
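To make the difference between the two encodings concrete, here's a minimal sketch that runs both over the same value. The UrlVsHtml class and the "R&D Tools" category are made up for illustration, and WebUtility.HtmlEncode() stands in for HttpUtility.HtmlEncode() so the snippet doesn't need a System.Web reference:

```csharp
using System;
using System.Net;

// Hypothetical helper, for illustration only.
public static class UrlVsHtml
{
    // Value for the href: percent-encoded so it is safe as a single path segment.
    public static string ForPath(string category)
    {
        return Uri.EscapeDataString(category);
    }

    // Value for the link text: HTML-encoded so markup characters display literally.
    public static string ForDisplay(string category)
    {
        return WebUtility.HtmlEncode(category);
    }
}
```

With the hypothetical category "R&D Tools", ForPath() gives R%26D%20Tools while ForDisplay() gives R&amp;D Tools. Neither result is usable in the other's place.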

It works great - until it doesn't!

Yes it works great, until you use a few categories that use special formatting. This is not obvious, because the vast majority of categories work just fine - it's just a couple of specific ones that will fail.

The problem occurs with names that include certain special characters. Specifically a . (period) or a # can throw all this off. .NET and C# are good examples of where this can get hosed.

Dot me Not

So this URL is a problem:

products/.NET

Note that EscapeDataString() doesn't encode the period - as per spec that's actually correct, since . is an unreserved character that should not be URL encoded. Even if you DO force the period to be encoded as:

products/%2ENET

it still causes problems, as the value is still decoded back to .NET.

Why? IIS/ASP.NET doesn't parse this URL as an extensionless URL. The period makes the last segment look like a file name with an extension, so ASP.NET treats the request like a page that cannot be found. Luckily there's a simple workaround for this problem. With a trailing slash added:

products/.NET/

works just fine. For generated code that is. If you generate URLs in your app it's easy enough to slap on the trailing slash. But if somebody decides to navigate to your site via a manually typed URL without the trailing slash they'll get a failure.

Don't be a #ie

The other one that has caused me pain is a # in the url, for example: C#. If you try using:

products/c#

or

products/c#/

you find that RouteData.Values["category"] returns just 'c' rather than 'c#'. The problem is that the hash character (#) has meaning in a URL, namely it is meant for page level anchor jumps. The browser treats everything from the # onward as a fragment and never sends it to the server, so the route only ever sees 'c'. More recently # has also been hijacked for history management in AJAX/SPA applications, but regardless a # in a URL is not treated as content.

In this scenario UrlEncoding DOES solve the # encoding problem, as long as you use Uri.EscapeDataString(), rather than Uri.EscapeUriString(). The latter doesn't escape # because it is meant for encoding complete URLs, where # is a legitimate delimiter.

So this URL:

products/c%23

works just fine. Again this sucks for the user who happens to type this in manually, but again, that's an edge case.
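The behavior is easy to reproduce outside of a Web application. The sketch below uses System.Uri as a stand-in for how a URL gets split into path and fragment; the HashInPath class and the example.com host are assumptions for illustration only, not part of the store application:

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class HashInPath
{
    // Returns the decoded last path segment of a URL.
    // Anything from an unescaped '#' onward is the fragment, not the path,
    // so it never shows up in the segment returned here.
    public static string LastPathSegment(string url)
    {
        var uri = new Uri(url);
        string[] segments = uri.Segments;
        string last = segments[segments.Length - 1].TrimEnd('/');
        return Uri.UnescapeDataString(last);
    }
}
```

For http://example.com/products/c# the method returns c, because the # starts the fragment. For http://example.com/products/c%23 it returns c#, since the escaped hash stays part of the path until it is explicitly decoded.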

Summarizing Path Encoding

What I described here applies to any non-MVC routing scenarios, using the standard ASP.NET routing mechanisms. MVC adds a host of features on top of routing that solve these issues - mainly through ActionLink() functionality which is effectively a URL builder based on existing routes. The raw ASP.NET routing has no such construct so it's up to you to ensure that urls are properly encoded and terminated.

When you're encoding extensionless URLs extra care has to be taken to properly encode path strings as they are a bit more sensitive than query string values. Basically you're creating a path and so all the rules for URL path formatting apply, which are much stricter than what's legal in query strings. If you have many long, complex strings to pass, it's probably better to stick to query strings or POST data for that matter.

If you do use paths for route parameters remember to:

  • Always terminate your routed paths with a / to force an extensionless path
  • Don't use HttpUtility.UrlEncode() or Uri.EscapeUriString()
  • Always use Uri.EscapeDataString() to encode your paths
    or else strip out or replace problem characters before encoding
    and do the same when you try to match the routes.
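Pulled together, the checklist above might look something like this. It's only a sketch under assumptions: RoutePath and Build() are hypothetical names, not anything built into ASP.NET routing, and the resulting relative path would still need to go through ResolveUrl() or similar in a page:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of ASP.NET routing.
public static class RoutePath
{
    // Builds "root/segment1/segment2/" with every segment escaped
    // via Uri.EscapeDataString() and a trailing slash always appended.
    public static string Build(string root, params string[] segments)
    {
        var parts = new List<string> { root.Trim('/') };

        // Guard against an explicit null passed for the params array.
        foreach (string segment in segments ?? new string[0])
        {
            // Skip missing values (for example an omitted qty)
            // so the path never contains an empty segment.
            if (string.IsNullOrEmpty(segment))
                continue;

            parts.Add(Uri.EscapeDataString(segment));
        }

        // The trailing slash keeps values like ".NET" from
        // looking like a file extension.
        return string.Join("/", parts) + "/";
    }
}
```

RoutePath.Build("products", "c#") returns products/c%23/ and RoutePath.Build("products", ".NET") returns products/.NET/. Empty segments, like a missing qty, are skipped rather than leaving a double slash in the path.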

Posted in ASP.NET  C#  

The Voices of Reason


 

Matt
November 16, 2013

# re: Routes, Extensionless Paths and UrlEncoding in ASP.NET

Rick,

Regarding the ".NET" path problem, does the web.config setting below offer a solution?

http://msdn.microsoft.com/en-us/library/system.web.configuration.httpruntimesection.relaxedurltofilesystemmapping(v=vs.110).aspx

Matt

Graham Mendick
November 16, 2013

# re: Routes, Extensionless Paths and UrlEncoding in ASP.NET

Great article, URLs don't get the respect they deserve. I've written a Navigation framework for Web Forms that solves these problems, https://navigation.codeplex.com/.
For example,
StateController.GetNavigationLink("Products", new NavigationData { { "category", "c#" } });

makes the link products/c%23/ And
StateController.GetNavigationLink("Products", new NavigationData { { "category", ".NET" } });

makes the link products/.NET/ (as long as you set the RouteCollection.AppendTrailingSlash to true)

Rick Strahl
November 17, 2013

# re: Routes, Extensionless Paths and UrlEncoding in ASP.NET

@Graham - nice! Thought about doing something like that a while ago but never got around to it... This looks great and should have been in the box as part of the standard ASP.NET Routing features. There's no reason the MVC features should have to be MVC specific.

Jesper
November 30, 2013

# re: Routes, Extensionless Paths and UrlEncoding in ASP.NET

A widely-used practice to make this sort of thing work without having to worry about escaping is to create a "slug" - a deformed, normalized version of the name using only a basic character set like a-z, numbers and dashes. Make a slug for every category and use the slug in the route instead of the escaped full fidelity name and having to sort out encoding will not become a problem. /products/csharp is an easier path to type, remember and guess than /products/c%23.

Since this post already has a slug itself ("Routes-Extensionless-Paths-and-UrlEncoding-in-ASPNET"), I suspect you already know about this. It's a shame it wasn't mentioned in the post though since, even though it requires changes to the data, it's a simple pattern that works with the URL instead of against it and allows you to solve this problem once and move on.

Rick Strahl
September 07, 2014

# re: Routes, Extensionless Paths and UrlEncoding in ASP.NET

@Jesper - yes, slugs are a good idea for relatively long and unique content. But if you have things like categories or other groups of short named things, it's very easy to end up with duplicate slugs because you're either omitting characters or converting them to custom characters, so that can become problematic. Slugs work great for things like page titles, but for meaningful values that you essentially need to do a database lookup on, and where it doesn't make sense to store the slug in the db, it's not such a good idea.

Michael
April 28, 2022

# re: Routes, Extensionless Paths and UrlEncoding in ASP.NET

Hi Rick,

Thank you for your post!

The only special character I cannot find a solution for is plus (+). When I use the plus character in a route (or the encoded value %2B) the routing does not work at all - it opens the default page.

For example:

www.sitename.com/1 - ok

www.sitename.com/1/ - ok

www.sitename.com/1/2 - ok

www.sitename.com/1+2 - routing not working

www.sitename.com/1%2B2 - routing not working

Is there any way to encode plus character to use it in the route?

Thanks,

Michael


West Wind  © Rick Strahl, West Wind Technologies, 2005 - 2024