Tuesday, October 2, 2018

Abusi^H^H^H^H^H :ahem: Customizing Netbox

What is Netbox?

Some of you might have already heard about it, but for those who haven't: there's an extremely useful, feature-complete and free (both as in "free beer" and as in "free speech") IPAM and DCIM solution, and it has been around for quite a while now. A network engineer named Jeremy Stretch - formerly of packetlife.net fame, now at DigitalOcean - created an IPAM and DCIM platform called Netbox, which has since been officially supported by his company.

Netbox is a fully-fledged open source IPAM tool, based on Python and the Django framework. It supports all sorts of features expected from such a tool, especially one developed by a network engineer: multi-tenancy, VRFs, geographical and DCIM entities (sites, regions, but also racks, VMs, etc.), as well as a nice REST API to interact with it and integration with the NAPALM library. Even better - it is still under active development, and new features are being added regularly. So, if you haven't yet - go check it out, chances are you'll find it extremely useful for most of your "Single Source of (network) Truth" uses.

If it's all that great... Why this blog post?

Unfortunately, I have found that some of the assumptions made when the Netbox data model was designed just did not fit the real world. Namely, my issue is with the Netbox entity called "VRF", and since the vast majority of the networks I'm trying to model in Netbox are MPLS networks, this is obviously kind of a deal breaker for me. Let's see why.

In Netbox, the "VRF" model has the following fields:
  • name - a string of maximum 50 characters
  • rd - a string of maximum 21 characters, with the constraint that it is globally unique (only one VRF with RD "X" is allowed)
  • tenant - a foreign key, referencing a "tenant" entity, meaning a one-to-many relationship between a "tenant" and "VRF": one "tenant" can have many "VRF"s, but one "VRF" belongs to exactly one "tenant".
  • enforce_unique - a Boolean field, describing whether to prevent duplicate IP addresses/prefixes in this VRF
  • description - a string of maximum 100 characters
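
To make the field list concrete, here is a minimal plain-Python sketch of the same shape. It is not Netbox's code (the real thing is a Django model); the class, the validate() helper and the known_rds set are made up for illustration, and the tenant is reduced to a plain string standing in for the foreign key.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class VRF:
    """Plain-Python stand-in for the VRF fields listed above (not Netbox code)."""
    name: str             # at most 50 characters
    rd: str               # at most 21 characters, globally unique
    tenant: str           # stand-in for the foreign key to a tenant
    enforce_unique: bool  # prevent duplicate prefixes/IPs in this VRF
    description: str      # at most 100 characters

    def validate(self, known_rds: Set[str]) -> None:
        """Raise ValueError if a length limit or the global RD rule is broken."""
        if len(self.name) > 50:
            raise ValueError("name longer than 50 characters")
        if len(self.rd) > 21:
            raise ValueError("rd longer than 21 characters")
        if len(self.description) > 100:
            raise ValueError("description longer than 100 characters")
        if self.rd in known_rds:
            raise ValueError(f"rd {self.rd!r} is already in use")
        known_rds.add(self.rd)  # remember the RD so a second use is rejected
```

With this sketch, a second VRF reusing the RD of an existing one fails validation no matter which tenant or device it is meant for, which is exactly the behaviour the rest of this post takes issue with.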

The first issue I have with this model is the "tenant" FK relationship. While VRFs do in fact "belong" to "tenants" (however you decide to define a "tenant"), they are first and foremost defined on a device: the VRF represents a distinct routing instance on a device, a routing table instance, if you will, and I expect the entity to have a FK relationship to "device", not "tenant". I understand that this design decision was made in order to make VRFs represent some sort of a L3VPN entity, spanning the whole network. But VRFs are necessarily device-scoped: we cannot construct a (meaningful) network-wide entity which would represent "all the routes in the L3VPN" simply because each individual device has its own parameters for what ends up in the routing table.

The other issue I faced is the global uniqueness requirement on the "rd" field: if I change the model so that VRFs belong to a "device", not to a "tenant", there is no need for the RD to be globally unique - it only needs to be unique on a particular device.

In all, the changes I wanted (well, needed) to have in Netbox are as follows:
  • remove the tenant foreign key relationship
  • add a new device foreign key relationship
  • remove the constraint that rd field is globally unique
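
As a rough illustration of where these three changes lead, here is a sketch of the resulting table shape using Python's built-in sqlite3 module. This is not the Netbox schema: the table and column names are invented, and the two per-device UNIQUE constraints are my assumption about how "unique on a particular device" could be enforced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE device (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE vrf (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        rd        TEXT NOT NULL,                            -- no global UNIQUE any more
        device_id INTEGER NOT NULL REFERENCES device(id),   -- replaces the tenant FK
        UNIQUE (device_id, name),                           -- assumption: per-device uniqueness
        UNIQUE (device_id, rd)
    );
""")
conn.executemany("INSERT INTO device (id, name) VALUES (?, ?)", [(1, "pe1"), (2, "pe2")])


def add_vrf(device_id: int, name: str, rd: str) -> None:
    """Insert one VRF row; sqlite raises IntegrityError on a per-device duplicate."""
    conn.execute(
        "INSERT INTO vrf (name, rd, device_id) VALUES (?, ?, ?)",
        (name, rd, device_id),
    )


add_vrf(1, "CUST-A", "65000:1")
add_vrf(2, "CUST-A", "65000:1")      # same name and RD on another device: accepted

try:
    add_vrf(1, "CUST-A", "65000:2")  # same name again on pe1: rejected
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

vrf_count = conn.execute("SELECT COUNT(*) FROM vrf").fetchone()[0]
```

The same VRF can now exist on both pe1 and pe2, while a second "CUST-A" on pe1 is refused by the database itself.
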
The first place I looked for a solution was - you guessed it - the Netbox GitHub issues page, and I was not very surprised to find people with the same needs: make VRFs device-local, and possibly implement a mechanism for tracking route-targets and other BGP entities (most notably ASNs) in Netbox. Unfortunately, this has been marked as a "Major feature", and has seen very little activity.

Could I do it myself? And help the open source community?

Now here's a minor problem: I am not a professional programmer, even though I have written a few programs in different programming languages and dabbled in software long ago. I know very little about Python, practically nothing about the Django framework and even less about GitHub. I'm actually a network engineer. But then again, I decided - to hell with it, if I can make networks work, then tweaking a program should be at least doable. So here's my attempt to implement the necessary changes myself and make my work available on GitHub: my short journey into modifying an existing open-source Django application.

Django crash course

How does it work now?

First thing I needed to sort out was: how does Netbox actually work and where do I make the changes I need? I started digging for answers.

Without going into (too) many details, Django is a framework written in Python, which follows the MVT (model-view-template) pattern. Therein lies the answer to how it works. Let's break it down real quick:
  • This particular application (Netbox) is separated into several so-called Django apps (ipam, dcim, tenancy, circuits, etc). Django uses apps to segregate the code into logical units and to manage namespaces
  • The "Model" part in the MVT acronym is actually all the pieces of Python code which describe the entities in the application and their relationship to other entities. It also handles CRUD (Create, Read, Update and Delete) operations in the database, so you don't have to write SQL queries and manage database connections, but can operate on Python objects instead.
  • The "View" part is responsible for handling HTTP requests and the "business logic" of the application - it is the code which gets executed when you navigate to a specific page of the app
  • The "Template" part is code which generates the HTML which is to be served to the user when they navigate to a page. The "View" component will take the business logic, extract/add/update/delete data from the database using the "Model" component, select the appropriate "Template" and render the final HTML to the user.
Furthermore, since the "View" component encompasses all of the app's business logic, in Netbox it also relies on several additional components which needed to be updated as well: "Forms", "Tables" and "Filters". Forms map the HTML forms which are rendered via the View/Template combination to the underlying data - they map the fields on the form to the entities they describe and from which they should be populated. Tables are exactly what it says on the box: description and "glue" logic for tabular display of entities, as well as mapping the table columns to the underlying model. Filters allow for easy searching/filtering, both when accessing via browser and when using the REST API.
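
To show how the three parts hand work to each other, here is a toy, framework-free sketch in plain Python. It is not Django and not Netbox code: VRF_STORE, vrf_list_view and the two templates are made-up names, the "database" is a list, and the "request" is just a dict of query parameters.

```python
from dataclasses import dataclass
from html import escape
from string import Template
from typing import Dict, List


# "Model": describes an entity and gives access to its data (here, a plain list).
@dataclass
class VRF:
    name: str
    rd: str


VRF_STORE: List[VRF] = [VRF("CUST-A", "65000:1"), VRF("CUST-B", "65000:2")]

# "Template": turns the data it is given into HTML.
ROW = Template("<tr><td>$name</td><td>$rd</td></tr>")
PAGE = Template("<table>$rows</table>")


# "View": receives the request, asks the model for data, renders a template.
def vrf_list_view(query: Dict[str, str]) -> str:
    wanted = query.get("name")  # a very small "filter" taken from the query string
    vrfs = [v for v in VRF_STORE if wanted is None or v.name == wanted]
    rows = "".join(ROW.substitute(name=escape(v.name), rd=escape(v.rd)) for v in vrfs)
    return PAGE.substitute(rows=rows)


page = vrf_list_view({"name": "CUST-A"})
```

Even in this toy version, renaming or removing a model field breaks the view, the filter and the template at the same time.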

So... Where and how do I change stuff?

Umm, short answer - everywhere. But, the first obvious place to start was the "Model" section, so I started with that.

From StackOverflow and Django documentation, it was obvious that the general steps I should take are:
  1. Edit the appropriate "models.py" file (in my particular case it was ./netbox/ipam/models.py) so that it corresponds to the VRF object I am trying to model - change the foreign key it came with (to table Tenants) to a foreign key referring to table "Devices", and also remove the uniqueness constraint on "rd" field
  2. Run the command "python3 manage.py makemigrations" which will generate so-called "migrations" files, specifying how the model changes.
  3. If all goes well - run "python3 manage.py migrate" to apply the model changes to the backend database.
EZPZ lemon squeezy, right? Well, not quite.

What I ended up changing

Changing the "models.py" file was easy enough - the syntax is extremely descriptive and it was obvious what needed to be changed: replace the "tenant" foreign key with a "device" foreign key, with a practically identical implementation. But running makemigrations resulted in a bevy of Python errors. It was obvious that makemigrations checks not only the models, but also other parts of the application.

Following the stack traces and error messages from multiple makemigrations runs and additional testing, I've found several other things that need to be changed:
  • ./netbox/ipam/filters.py - change the VRFFilter class so that it does not do "tenant" search, since we've removed the "tenant" entity as a FK
  • ./netbox/ipam/forms.py - all VRF-related classes, so that they reference "device" instead of the removed "tenant" FK, and also make adding/editing VRFs easier by offering device filtering.
  • ./netbox/ipam/views.py - same as before: we're eliminating all traces of the "tenant" FK and replacing it with either "device" or a chained lookup through "device" (since "device" does have a "tenant" FK)
  • ./netbox/ipam/api/views.py and ./netbox/ipam/api/serializers.py - in order for API calls to work with the new model. The view needed to have its "tenant" references removed, while the serializer needed to be updated to use the (already existing) NestedDeviceSerializer instead of referencing the "tenant".
  • The templates needed changing:
    • ./netbox/templates/ipam/vrf.html and vrf_edit.html - to reflect the new object fields
    • ./netbox/templates/tenancy/tenant.html - because even though nothing about "Tenant" entity was changed - the current HTML template included counting the number of VRFs belonging to each tenant, so that stopped working when the FK was removed.
After all them changes, we had a successful run of "makemigrations" followed by a successful "migrate". Yay, success!
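
The "chained" part deserves a quick illustration. Once a VRF points at a device instead of a tenant, anything that used to ask "which VRFs does this tenant have?" (like the counter in tenant.html) has to go through the device. Below is a plain-Python sketch of that traversal; the classes and vrfs_for_tenant() are made up for illustration and are not the code in my fork. In Django's ORM the same hop is normally written as a double-underscore lookup across the relation (something along the lines of device__tenant).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Tenant:
    name: str


@dataclass(frozen=True)
class Device:
    name: str
    tenant: Tenant  # the device still has a tenant FK


@dataclass(frozen=True)
class VRF:
    name: str
    rd: str
    device: Device  # the VRF now hangs off a device, not a tenant


def vrfs_for_tenant(vrfs: List[VRF], tenant: Tenant) -> List[VRF]:
    """Return the VRFs whose device belongs to the given tenant (VRF -> device -> tenant)."""
    return [v for v in vrfs if v.device.tenant == tenant]


acme, globex = Tenant("acme"), Tenant("globex")
pe1, pe2 = Device("pe1", acme), Device("pe2", globex)
all_vrfs = [
    VRF("CUST-A", "65000:1", pe1),
    VRF("CUST-B", "65000:2", pe1),
    VRF("CUST-A", "65000:1", pe2),
]
acme_vrf_count = len(vrfs_for_tenant(all_vrfs, acme))
```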

So, what's next?

Well, for starters - I need to continue testing the current state of the app. Travis CI on GitHub shows good builds, and the app has passed sanity checks, but I'm still paranoid and testing it just to be sure I haven't missed anything that (still) needs changing. (You can also help if you'd like - my Netbox fork is available on my GitHub, and I'd greatly appreciate any testing feedback.)

The next thing, as discussed on the Netbox issues page, will be the route-targets. These need to be in Netbox if it is to be a single source of (network) truth, and a potential source of data for all sorts of network automation, so as a next step in honing my Python and Django skills I'll be working on adding "import" and "export" route targets as additional fields in Netbox.

Apart from that, I would like to implement a way to store information about BGP ASNs in Netbox - some networks have more than one BGP domain and having that information in a central repository is necessary.

So - stay tuned, new posts will cover my adventures in adding new functionalities to an existing Django application.


