URLs, URIs, URNs and their Structure

While applying an emergency workaround for a zero-day on a Microsoft Exchange server, I came across a term: URI. It’s something I’ve met before, but then I realised I didn’t actually know what a URI is.

So I set out to find out and, like many things, went down a rabbit hole!

In this article I’ll try to describe what URLs, URIs and URNs are, as it’s not always immediately apparent. I’ll also give an overview of the key structure of a URL, so you have the names and terminology you need to describe its constituent parts.

But first, a bit of a warning: RFC 3986 can be considered a little ambiguous on these terms and their exact specification, so my conclusions below might not fully align with your understanding. That’s okay, just comment!

The Superset and the Subset

A URI (Uniform Resource Identifier) is a string that uniquely identifies a resource, typically one on the Internet. URI is the superset, and it has two subsets: the URL and the URN.

A URL (Uniform Resource Locator) is a kind of URI that specifies where the resource is located and also includes the protocol needed to reach it, for example https://, http:// or ftp://.

A URN (Uniform Resource Name) is a type of URI that uses the urn: scheme. The URN Wikipedia article gives some good examples, but the key thing to remember is that URNs are intended to be globally unique and persistent within a defined namespace. You’ll see URNs referenced as the “resource” or object you wish to access.
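
To make the superset/subset idea a bit more concrete, here is a minimal Python sketch that labels a string by looking only at its scheme. The function name classify_uri and the sample strings are my own placeholders, and treating every non-urn scheme as a URL is a simplification rather than anything RFC 3986 mandates.

```python
from urllib.parse import urlsplit


def classify_uri(uri: str) -> str:
    """Roughly label a string as a URN, a URL, or a scheme-less reference.

    Assumption: anything with a scheme other than "urn" is treated as a URL.
    """
    scheme = urlsplit(uri).scheme.lower()
    if not scheme:
        return "relative reference (no scheme)"
    if scheme == "urn":
        return "URN"
    return "URL"


# Hypothetical sample values, chosen only for illustration.
for sample in ("urn:isbn:0451450523",
               "https://www.example.com/index.html",
               "/pages/index.html"):
    print(f"{sample} -> {classify_uri(sample)}")
```

All three strings can be handled by the same parser, which is really the point: they share the general URI syntax, and the scheme is what tells you which kind you are looking at.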

URI/URL Structure

The structure of the URI/URL is as follows. You can see how the component parts fit together, and depending on which bit you are referring to, you can use the more specific name.

So which should you use?

There are no hard and fast rules and it also depends on the context.

Well, if you are telling someone how to get to your webpage/website, you should refer to the URL: that gives them everything they need to access the resource, including the protocol. Strictly speaking a URL is still a URI, because URI is the superset. I know it is confusing!

You might refer to a URI on a web server, for example when configuring how the server redirects or manipulates the user’s request string. The assumption is that the request has already reached your server (using the URL), so you don’t need to include the protocol; hence you’re working with the URI. In RFC 3986 terms a string without a scheme is strictly a relative reference, but web server software commonly calls it the URI.
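
As a rough illustration of that distinction, the sketch below takes a full URL and returns only the part a web server typically works with once the request has arrived: the path plus any query string. The helper name request_target and the example.com address are assumptions of mine, not something taken from a particular web server.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path and query of a URL, dropping scheme, authority and fragment."""
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path is requested as "/"
    return f"{path}?{parts.query}" if parts.query else path


# Hypothetical URL used only for illustration.
print(request_target("https://www.example.com/pages/index.html?lang=en#top"))
```

The fragment is dropped as well, because browsers do not send it to the server; it is only used on the client side.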

Anatomy of the URL

So what are all the parts of a URL called, and what are they for? Let’s examine the following example URL, which includes lots of parts to demonstrate all the bits you might find within different URLs.

Let’s examine each component part in turn:

  1. Scheme – The scheme is essentially the protocol for the URL, e.g. https. It is followed by a colon, and in web URLs by the // that introduces the authority.
  2. Authority – The host/server you are trying to reach. The authority can be made up of three parts, though you’ll normally only use one: the Host.
    1. Username – Used if you’re passing your username to the web server within the URL. It’s a bit old-fashioned now, but you can do it with the syntax username@.
    2. Host – The Host is usually the FQDN (Fully Qualified Domain Name), i.e. the server name of the website you are trying to reach; it can also be an IP address.
    3. Port – Normally, when you specify the scheme http:// you mean TCP port 80, and when you specify https:// you mean TCP port 443; your browser assumes this, so you don’t need to enter it. If your server listens on a non-standard port, you can add :<port number>, e.g. :8080, to the end of the authority string to specify which port you want to connect to.
  3. Path – The path to the resource you want to access. This might just be a file, like index.html, or a path and a filename, e.g. /pages/index.html; if you are referring to a query, it may just be a path.
  4. Query – If your website or web application allows queries to be passed within the URL, this is where they go. The query string starts with a question mark (?).
  5. Fragment – The fragment, which starts with a hash (#), refers to a location within the resource. For example if index.html is a very long page, referring to
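
If you want to see these parts pulled out of a real string, Python’s standard urllib.parse module can split a URL for you. The sketch below is only an illustration: the example URL, the describe function and the DEFAULT_PORTS table are all made up by me, and the port fallback covers just the two schemes mentioned in the list.

```python
from urllib.parse import urlsplit

# Assumption: only the two default ports mentioned in the list are covered.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe(url: str) -> dict:
    """Split a URL into the named parts from the list above."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "scheme": parts.scheme,
        "username": parts.username,
        "host": parts.hostname,
        "port": port,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


# Hypothetical example URL containing every part discussed.
example = "https://user@www.example.com:8080/pages/index.html?lang=en#section2"
for name, value in describe(example).items():
    print(f"{name:9} {value}")
```

Note that the question mark and the hash are delimiters, so they do not appear in the query and fragment values, and that the port falls back to 80 or 443 when the URL does not spell one out.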

Conclusion

So hopefully that has given you an overview of what URLs, URIs and URNs are, and how the named parts fit within the structure of a URL. It’s the secret life of something you use every day and take for granted, and a reminder of the effort that went into putting together something that has adapted to the changes over the past 30-odd years of the Internet.
