[Infrastructures] using IA methodologies to build network
element configuration
Daniel Hagerty <hag@linnaean.org>
Sun, 3 Apr 2005 06:40:28 -0400
> Exactly. In many cases, you aren't actually "duplicating" info onto
> different network devices, but rather deriving specific info for each
> device from some common source.
The difference in our wording is simply a semantic one; we appear
to speak of the same things, but word them differently. Such is
communication...
I often think of it as something akin to copy & paste coding
styles: rather than factor out the common sub-problem in your two
nominally independent problems, you slash up the two problems in an ad-hoc
fashion. This "works", but is very frail over time -- you suddenly
have to apply fixes to both versions, and errors frequently creep in.
<sarcasm>I'm sure you've never seen inconsistency drift into a system
managed mostly by humans managing discrete elements of this
system.</sarcasm>
However, in this case, we have no choice: between "Simple Matter
of Implementation" and issues like the need for the "real"
configuration information to be close at hand (like in an nvram), this
"copy & paste" is what you must do.
> I've been toying with the idea of creating a system (database,
> language, etc.) for describing networks in enough detail that you
"Perfectly doable", as we've seen.
Let me express a much simpler version of the same problem, in a
different context (no pun intended). I'm going to express it in a very,
very particular way, on purpose.
Consider the two unix programs inetd and xinetd. They are
different programs. They have different configuration files. And yet
they do roughly the same thing; from far enough outside a system, you
can't tell if it's one or the other implementing the functionality of
the platonic "Internet Super Server" program.
It would obviously be useful to be able to specify our
"Internet Super Server" configuration in some uber
language, and/or convert between the two configuration files.
Complicating the matter slightly: if I compare, say, NetBSD's
inetd with the xinetd of some random redhat
release, they are not "strictly comparable", if you will -- each is
able to express concepts that the other cannot, but there is a great
deal of overlap.
An approach is to start by writing parsers and unparsers for each
of the configuration files involved. In reality, the parser/unparser
has some time rot issues, but I'll mostly ignore those for now.
Handling the evolution of software over time is substantially harder
than handling a point in time, depending on how "non-commodity" the program's
"domain" or "area of application" is. The more specialized the
software, and the higher the delta rate of the software over time, the
more likely you need to deal with higher and higher order techniques
to represent it and all its deltas. So I'll Keep It Simple Stupid by
saying that time is a simple non-problem (hah!).
There's no need to squish the parsers' output forms into the same
output syntax. I keep them separate on purpose, if only because it makes the
unparsers' job easier. If the parse and unparse operations are nothing
more than string -> abstract syntax tree and the inverse, with an
abstract syntax tree that closely tracks the syntax and grammar, we're
one step closer to handling time rot correctly.
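As a rough illustration of the string -> syntax tree step and its inverse, here is a minimal Python sketch for the classic inetd.conf column layout. The names InetdEntry, parse_inetd and unparse_inetd are made up for this example, and the sketch assumes the common seven-column form only; real inetd variants accept more than this.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InetdEntry:
    """One inetd.conf line; field names mirror the file's columns."""
    service: str
    socket_type: str
    protocol: str
    wait: str
    user: str
    program: str
    args: List[str]


def parse_inetd(text: str) -> List[InetdEntry]:
    """String -> syntax tree (here simply a list of entries)."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: comments and blank lines are dropped
        fields = line.split()
        if len(fields) < 6:
            raise ValueError(f"too few fields: {raw!r}")
        entries.append(InetdEntry(*fields[:6], args=fields[6:]))
    return entries


def unparse_inetd(entries: List[InetdEntry]) -> str:
    """Syntax tree -> string; the inverse of parse_inetd up to spacing."""
    lines = []
    for e in entries:
        cols = [e.service, e.socket_type, e.protocol, e.wait, e.user, e.program]
        lines.append(" ".join(cols + e.args))
    return "\n".join(lines) + "\n"


sample = "# ftp service\nftp stream tcp nowait root /usr/libexec/ftpd ftpd -l\n"
tree = parse_inetd(sample)
# Round trip: re-parsing the unparsed tree gives the same tree back.
assert parse_inetd(unparse_inetd(tree)) == tree
```

Because the tree mirrors the file's columns one for one, a change in the file format only touches this pair of functions.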
Next, write two more parsers/unparsers: these map between each
abstract syntax tree and the platonic "Internet Super Server"
domain. This domain is a strict superset of the two subdomains of
inetd and xinetd and has the concepts of both of them.
A detail to pay attention to in the unparsers is the attempt to
reify an unsupported concept into the language we're unparsing into.
Concretely, xinetd has many features that inetd does not. Given an
instance of our platonic internet super server language that uses
concepts exclusive to xinetd, we can't unparse these into the language
of inetd.
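One way to handle this, sketched in Python under stated assumptions: the unparser refuses outright rather than silently dropping the concept. The Service class and UnsupportedConcept exception are hypothetical names, and only_from (an xinetd access-control attribute) is treated here as something inetd cannot express.

```python
from dataclasses import dataclass, field
from typing import List, Optional


class UnsupportedConcept(Exception):
    """Raised when the target language cannot express a concept."""


@dataclass
class Service:
    """Hypothetical 'platonic' super-server entry."""
    name: str
    socket_type: str
    protocol: str
    wait: bool
    user: str
    program: str
    args: List[str] = field(default_factory=list)
    # Access list; assumed for this sketch to be expressible in xinetd only.
    only_from: Optional[List[str]] = None


def service_to_inetd_line(svc: Service) -> str:
    """Unparse one platonic entry into an inetd.conf line, or refuse."""
    if svc.only_from is not None:
        raise UnsupportedConcept(
            f"{svc.name}: inetd has no equivalent of only_from")
    wait = "wait" if svc.wait else "nowait"
    cols = [svc.name, svc.socket_type, svc.protocol, wait, svc.user, svc.program]
    return " ".join(cols + svc.args)
```

Whether to fail, warn, or degrade gracefully is a policy choice; failing loudly is just the simplest to show.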
Finally, we need one last parser/unparser. We've made it possible
to create the abstraction of our platonic language from inetd.conf and
xinetd.conf, but we don't have any way to create it directly. Enter
the last parser/unparser. The parser takes our string representation,
and creates the internal representation. The unparser does the
opposite, creating a string representation from the internal.
An important detail to note here is lost information: for
example, consider comments. If our various parsers toss them away as
semantically irrelevant, then our unparsers obviously can't recreate
them. There is a more general property at work here -- what we're really
trying to do is compute "denotation" when all is said and done. If
two statements mean the same thing, then they *are* the same thing: we
can substitute one for the other and never notice the difference.
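To make "same denotation" concrete, here is a small Python sketch. It assumes, purely for illustration, that comments, blank lines, spacing and line order carry no meaning in an inetd.conf-style file; the function name denotation is made up.

```python
def denotation(text: str) -> frozenset:
    """Meaning of an inetd.conf-style text: the set of its entries.

    Assumptions for this sketch: comment lines, blank lines, spacing
    and line order are all semantically irrelevant.
    """
    entries = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        entries.add(tuple(line.split()))
    return frozenset(entries)


a = "# ftp for the lab\nftp   stream tcp nowait root /usr/libexec/ftpd ftpd\n"
b = "ftp stream tcp nowait root /usr/libexec/ftpd ftpd\n\n"
# Different strings, same meaning: either can be substituted for the other.
assert a != b
assert denotation(a) == denotation(b)
```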
These parts can be wired together in a number of obvious ways to
produce the tools we want. There's a particular wiring that's more
magical than others, but I'm not quite there yet for explaining it.
I've got a wiring that's very close to the "magically correct", but
not quite. Still thinking.
Your problem is of larger scope than this toy example. However,
it covers some of the issues. Depending on how much you want to do,
there is more that you must deal with.
What degree of semantic validation do you want to be able to
express? Having all of the information you want in one place obviously
makes answering consistency questions easier. I'm having trouble off
the top of my head thinking of good validation examples,
but I'm sure there are good ones.
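One candidate check, offered only as an illustration and not something from the discussion so far: the two ends of a point-to-point link should sit in the same subnet and must not share an address. A Python sketch using the standard ipaddress module; check_link is a hypothetical name and the addresses are documentation examples.

```python
import ipaddress
from typing import List


def check_link(end_a: str, end_b: str) -> List[str]:
    """Return a list of consistency problems for one point-to-point link."""
    a = ipaddress.ip_interface(end_a)
    b = ipaddress.ip_interface(end_b)
    problems = []
    if a.network != b.network:
        problems.append(f"{end_a} and {end_b} are in different subnets")
    elif a.ip == b.ip:
        problems.append(f"both ends use {a.ip}")
    return problems


assert check_link("192.0.2.1/30", "192.0.2.2/30") == []
assert check_link("192.0.2.1/30", "192.0.2.5/30") != []
```

A check like this is only possible when both routers' configurations are derived from one description, which is the point being made above.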
There's the "easy" problem of generating all the configurations
for a point in time. This probably doesn't cover your problem.
You probably have the harder problem where you want to take a
current state, a new state, and compute "the delta statement". The
delta statement is an expression in some language that actually
mutates the real world system from its current state to the new
state. A non-toy version of a tool that does this takes on complex
issues quickly.
Consider a simple network consisting of two cisco 2600s and two
hosts. Each host is connected to one of the 2600s by way of ethernet.
The 2600s are connected to one another by way of a t1 routing ipv4,
numbered out of some random /30. The hosts can only communicate with
one another by way of the 2600s.
We can write a description in some hypothetical language of this
system. We can also write a description in this same language in
which we've renumbered the t1 shared by the 2600s into some other
address space.
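As a sketch of what the two descriptions and their difference might look like, here they are as plain Python dictionaries. The device names r1 and r2, the interface names, and the addresses (taken from the documentation ranges) are all invented for illustration.

```python
# Hypothetical description: (device, interface) -> IPv4 address with prefix.
current = {
    ("r1", "ethernet0"): "198.51.100.1/24",
    ("r1", "serial0"): "192.0.2.1/30",
    ("r2", "ethernet0"): "203.0.113.1/24",
    ("r2", "serial0"): "192.0.2.2/30",
}

# The same network with the t1 renumbered into another /30.
new = dict(current)
new[("r1", "serial0")] = "192.0.2.5/30"
new[("r2", "serial0")] = "192.0.2.6/30"


def changed_settings(old: dict, new: dict) -> list:
    """List (key, old value, new value) for every setting that differs."""
    keys = sorted(old.keys() | new.keys())
    return [(k, old.get(k), new.get(k))
            for k in keys if old.get(k) != new.get(k)]


for key, before, after in changed_settings(current, new):
    print(key, before, "->", after)
```

This only says what differs between the two descriptions. It says nothing about the order in which the changes can safely be applied, or from where.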
In order to approach the problem of the delta statement, we need
to provide further qualification. How are we performing the
reconfiguration? Suppose we're doing it by being on one of the hosts
in some fashion, and logging into the 2600s by way of ssh. We emit
the statements we need to mutate from one state to another, one at a
time, very carefully.
As Steve will tell us, order really does matter in doing something
like this. If we're on a particular host, we have to log in to the
topologically far router first, reconfigure its network in a very
careful order, and usually type one last key statement that destroys
our ability to communicate with the router until we reconfigure the
topologically near router to agree with the state of the far router.
No amount of reconfiguring the near end router first will ever allow
us to converge on the desired new state.
So, as demonstrated, the delta statement is context sensitive.
Where you are in the world influences what the delta statement is.
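The far-router-first rule from the example can be sketched in Python as a breadth-first search from wherever we are sitting. This is only a heuristic lifted from the two-router case, not a general algorithm for computing delta statements, and the node names are invented.

```python
from collections import deque
from typing import Dict, List


def hop_distances(graph: Dict[str, List[str]], start: str) -> Dict[str, int]:
    """Breadth-first search: hop count from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def reconfig_order(graph: Dict[str, List[str]], vantage: str,
                   routers: List[str]) -> List[str]:
    """Order routers farthest-first as seen from the vantage point."""
    dist = hop_distances(graph, vantage)
    return sorted(routers, key=lambda r: dist[r], reverse=True)


# hostA -- r1 -- r2 -- hostB
graph = {
    "hostA": ["r1"],
    "r1": ["hostA", "r2"],
    "r2": ["r1", "hostB"],
    "hostB": ["r2"],
}
assert reconfig_order(graph, "hostA", ["r1", "r2"]) == ["r2", "r1"]
assert reconfig_order(graph, "hostB", ["r1", "r2"]) == ["r1", "r2"]
```

The same pair of descriptions yields a different order depending on the vantage host, which is the context sensitivity described above.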
As the model you wish to work in gets bigger, the delta statement
obviously takes on more and more hair. "Go here. Twiddle these
knobs. Ship a human plus pre-configured M5 to flubbox, have him argue
with the telco to deliver a circuit from flubbox to 60 hudson, plug
M5, electricity, circuit, customers together. Twiddle config on
router new circuit lands on. Twiddle monitoring, etc, etc, etc".
The tool we really want has a suspicious resemblance to what a
philosopher may call a "Top Level Ontology". "Ontology" is not my
favorite word, especially when we get to "Top Level Ontology", but
there it is. We want to be able to have a symbolic reasoning program
that knows nothing of our languages; we explain "what our languages
mean" to it, and it helps us reason with our languages without
"knowing what the languages mean", because it's doing all the
reasoning at a meta level where everything is just symbol
substitution. Mathematical logic works with the form of statements,
not their matter. We want the same, but in a very high order realm.