Building URIs

Constructing URLs often seems simple, but concatenating strings to build one causes some problems:

  • Each part of the URL allows a different set of characters, so some values must be percent-encoded
  • Formatting some parts of the URL is tricky, and doing it manually is tedious and error-prone
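
As a minimal sketch of the first problem, the snippet below uses only Python's standard library (urllib.parse), not rfc3986. The search term and the example.com URL are hypothetical. It shows how an unescaped "&" in a value changes what a parser reads back from the query string.

```python
from urllib.parse import parse_qsl, quote, urlsplit

# Hypothetical search term containing a character that is special in queries.
term = "cats & dogs"

# Naive concatenation: the raw "&" is read as a parameter separator.
naive = "https://example.com/search?q=" + term
naive_params = parse_qsl(urlsplit(naive).query, keep_blank_values=True)

# Percent-encoding the value first keeps it in one piece.
encoded = "https://example.com/search?q=" + quote(term, safe="")
encoded_params = parse_qsl(urlsplit(encoded).query, keep_blank_values=True)
```

Here naive_params ends up with two entries instead of one, while encoded_params round-trips the original term.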

To make this easier, rfc3986 provides the URIBuilder class, which generates valid URIReference instances. URIBuilder ensures that each component is normalized and safe for real-world use.

Example Usage

Note

All of the methods on a URIBuilder are chainable (except finalize() and geturl() as neither returns a URIBuilder).

Building From Scratch

Let’s build a basic URL with just a scheme and host. First we create an instance of URIBuilder. Then we call add_scheme() and add_host() with the scheme and host we want to include in the URL. Finally, we call finalize() to convert the builder into a URIReference and unsplit() to get the URL as a string.

>>> from rfc3986 import builder
>>> print(builder.URIBuilder().add_scheme(
...     'https'
... ).add_host(
...     'github.com'
... ).finalize().unsplit())
https://github.com

Replacing Components of a URI

It is possible to update an existing URI by constructing a builder with from_uri(), passing either an instance of URIReference or a textual representation:

>>> from rfc3986 import builder
>>> print(builder.URIBuilder.from_uri("http://github.com").add_scheme(
...     'https'
... ).finalize().unsplit())
https://github.com

The Builder is Immutable

Each time you invoke a method, you get a new instance of the URIBuilder class, so you can build several different URLs from one base instance.

>>> from rfc3986 import builder
>>> github_builder = builder.URIBuilder().add_scheme(
...     'https'
... ).add_host(
...     'api.github.com'
... )
>>> print(github_builder.add_path(
...     '/users/sigmavirus24'
... ).finalize().unsplit())
https://api.github.com/users/sigmavirus24
>>> print(github_builder.add_path(
...     '/repos/sigmavirus24/rfc3986'
... ).finalize().unsplit())
https://api.github.com/repos/sigmavirus24/rfc3986
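
To illustrate the general pattern of methods that return a new object rather than changing the original, here is a minimal sketch using a frozen dataclass. TinyBuilder is a hypothetical stand-in written for this example. It is not how rfc3986 implements URIBuilder, and it does no validation or normalization.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class TinyBuilder:
    """Hypothetical stand-in for illustration; not rfc3986's URIBuilder."""

    scheme: Optional[str] = None
    host: Optional[str] = None
    path: str = ""

    def add_scheme(self, scheme: str) -> "TinyBuilder":
        # replace() returns a copy with one field changed; self is untouched.
        return replace(self, scheme=scheme)

    def add_host(self, host: str) -> "TinyBuilder":
        return replace(self, host=host)

    def add_path(self, path: str) -> "TinyBuilder":
        return replace(self, path=path)

    def geturl(self) -> str:
        return f"{self.scheme}://{self.host}{self.path}"


base = TinyBuilder().add_scheme("https").add_host("api.github.com")
user_url = base.add_path("/users/sigmavirus24").geturl()
repo_url = base.add_path("/repos/sigmavirus24/rfc3986").geturl()
```

Because add_path() never modifies base, the second call starts from the same scheme and host as the first, and base.path is still empty afterwards.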

Convenient Path Management

Because the builder is immutable, a class that makes HTTP requests could hold one base URIBuilder and call extend_path() to append each provided path to the original one.

>>> from rfc3986 import builder
>>> github_builder = builder.URIBuilder().add_scheme(
...     'https'
... ).add_host(
...     'api.github.com'
... ).add_path(
...     '/users'
... )
>>> print(github_builder.extend_path("sigmavirus24").geturl())
https://api.github.com/users/sigmavirus24
>>> print(github_builder.extend_path("lukasa").geturl())
https://api.github.com/users/lukasa
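
The idea of a request-making class built around one base URL might look roughly like the sketch below. ApiClient and url_for() are hypothetical names, and the sketch uses only the standard library so it stays self-contained. A real client could hold a URIBuilder instead and delegate to extend_path().

```python
from urllib.parse import quote


class ApiClient:
    """Hypothetical client that derives request URLs from one base URL."""

    def __init__(self, base_url: str) -> None:
        # Drop any trailing slash so joining never produces "//".
        self.base_url = base_url.rstrip("/")

    def url_for(self, *segments: str) -> str:
        # Encode each segment so a "/" inside it cannot add a path level.
        encoded = "/".join(quote(segment, safe="") for segment in segments)
        return f"{self.base_url}/{encoded}"


client = ApiClient("https://api.github.com/users")
first = client.url_for("sigmavirus24")
second = client.url_for("lukasa")
```

Each call computes a fresh URL from the stored base, so earlier calls never affect later ones.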

Convenient Credential Handling

rfc3986 makes adding authentication credentials convenient and takes care of making them URL-safe. Some characters that someone might want to use in a username or password are not safe in the authority component of a URL.

>>> from rfc3986 import builder
>>> print(builder.URIBuilder().add_scheme(
...     'https'
... ).add_host(
...     'api.github.com'
... ).add_credentials(
...     username='us3r',
...     password='p@ssw0rd',
... ).finalize().unsplit())
https://us3r:p%40ssw0rd@api.github.com
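
To see why the encoding matters, here is a small standard-library sketch (not rfc3986 itself). The password is hypothetical and deliberately contains "@" and "/", both of which are delimiters in a URL.

```python
from urllib.parse import quote, urlsplit

username = "us3r"
password = "p@ss/w0rd"  # hypothetical password with URL delimiters in it

# Unencoded: the "/" ends the authority early, so the host is misread.
naive = urlsplit(f"https://{username}:{password}@api.github.com")

# Percent-encoding both values keeps the delimiters out of the authority.
safe_user = quote(username, safe="")
safe_password = quote(password, safe="")
encoded = urlsplit(f"https://{safe_user}:{safe_password}@api.github.com")
```

With the unencoded password, naive.hostname is no longer api.github.com. With the encoded one, encoded.hostname is parsed as intended.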

Managing Query String Parameters

Further, rfc3986 attempts to simplify the process of adding query parameters to a URL. For example, if we were using Elasticsearch, we might do something like:

>>> from rfc3986 import builder
>>> print(builder.URIBuilder().add_scheme(
...     'https'
... ).add_host(
...     'search.example.com'
... ).add_path(
...     '_search'
... ).add_query_from(
...     [('q', 'repo:sigmavirus24/rfc3986'), ('sort', 'created_at:asc')]
... ).finalize().unsplit())
https://search.example.com/_search?q=repo%3Asigmavirus24%2Frfc3986&sort=created_at%3Aasc

If we already have a URL with a query string and merely want to append to it, we can do that as well:

>>> from rfc3986 import builder
>>> print(builder.URIBuilder.from_uri(
...     'https://search.example.com/_search?q=repo%3Asigmavirus24%2Frfc3986'
... ).extend_query_with(
...     [('sort', 'created_at:asc')]
... ).finalize().unsplit())
https://search.example.com/_search?q=repo%3Asigmavirus24%2Frfc3986&sort=created_at%3Aasc
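
For comparison, the same append done by hand with the standard library takes several steps. This is only a sketch of the manual work the builder saves, not a description of what rfc3986 does internally.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

url = "https://search.example.com/_search?q=repo%3Asigmavirus24%2Frfc3986"
parts = urlsplit(url)

# Decode the existing pairs, append the new one, then re-encode them all.
pairs = parse_qsl(parts.query, keep_blank_values=True)
pairs.append(("sort", "created_at:asc"))
extended = urlunsplit(parts._replace(query=urlencode(pairs)))
```

For this input, extended matches the URL printed above.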

Adding Fragments

Finally, we provide a way to add a fragment to a URL. Let’s build up a URL to view the section of the RFC that refers to fragments:

>>> from rfc3986 import builder
>>> print(builder.URIBuilder().add_scheme(
...     'https'
... ).add_host(
...     'tools.ietf.org'
... ).add_path(
...     '/html/rfc3986'
... ).add_fragment(
...     'section-3.5'
... ).finalize().unsplit())
https://tools.ietf.org/html/rfc3986#section-3.5