A discussion at Cisco Live this year got me thinking about network emulation and digital twins.
About six months ago, I started building Skyforge, an internal network emulation platform at Forward. The project started as an attempt to solve a practical problem: how do you quickly create realistic multi-vendor environments for demos, testing, and experimentation without maintaining racks of physical hardware?
The name comes from The Elder Scrolls V: Skyrim, which is probably what happens when you spend too many hours playing video games and then find yourself naming infrastructure projects.
Skyforge uses Kubernetes to orchestrate and manage emulated network environments, along with netlab to generate topologies, configurations, and deployment artifacts across multiple vendors. Neither technology is particularly new, and each is excellent at what it does.
What surprised me wasn't getting the individual pieces working. What surprised me was how quickly the problem changed from "how do I deploy a topology?" to "how closely does this topology represent reality?"
Before going further, it's worth defining a few terms because the networking industry tends to use them inconsistently.
When I refer to network emulation, I'm talking about environments that run virtual network operating systems and attempt to reproduce the behavior of physical networks. Examples include Cisco Modeling Labs (CML), EVE-NG, GNS3, and similar platforms.
When I refer to a digital twin, I'm referring to a representation of a production environment that is derived from the actual network and remains synchronized as that network changes over time.
Those definitions matter because while building Skyforge, I found myself repeatedly running into the boundary between those two concepts.
Going into the project, I assumed the difficult parts would be infrastructure problems. Running large numbers of devices, scheduling resources, handling multiple users, and keeping everything operational seemed like the obvious challenges. Some of those challenges certainly exist, but they weren't what consumed most of the engineering effort.
Most of the hard problems came from the network operating systems themselves and the reality that emulating a network is very different from accurately representing one.
Building Skyforge
One of the first things I learned was that topology generation itself is largely a solved problem. Given a topology definition, netlab can automatically generate configurations, addressing plans, and deployment artifacts for a wide range of vendors and platforms. Once the workflow is established, generating a 50-node topology isn't dramatically different from generating a 5-node topology.
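As a rough illustration of why node count barely matters at this stage, here is a minimal sketch that builds a ring topology as plain data. It is not Skyforge's code and it does not use netlab's actual schema: the function name ring_topology and the keys device, nodes, and links are hypothetical, chosen only to show that the same few lines produce 5 nodes or 50.

```python
import json


def ring_topology(n, device="frr"):
    """Build a ring of n routers as a plain dict (hypothetical schema)."""
    if n < 3:
        raise ValueError("a ring needs at least 3 nodes")
    nodes = [f"r{i}" for i in range(1, n + 1)]
    # Connect each node to the next one, wrapping around at the end.
    links = [(nodes[i], nodes[(i + 1) % n]) for i in range(n)]
    return {"device": device, "nodes": nodes, "links": links}


# The only thing that changes between a small and a large lab is n.
small = ring_topology(5)
large = ring_topology(50)

print(json.dumps(small, indent=2))
print(len(large["nodes"]), "nodes,", len(large["links"]), "links")
```

A real generator also has to produce addressing and per-vendor configuration, which is the part netlab handles. The point of the sketch is only that the definition scales with a parameter, not with effort.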
The work starts after deployment. More specifically, the work starts when you need to determine whether the environment is actually usable. Did every device boot correctly? Did the management plane come up? Did routing protocols converge? Can you determine those things consistently across multiple vendors and operating systems? Can you expose that information in a way that another engineer can consume without having to manually troubleshoot the environment every time something goes wrong?
Those questions ended up consuming significantly more engineering effort than topology generation itself.
One thing I underestimated was how quickly the deployment pipeline became more complicated than the topology. Generating YAML is easy. Generating configurations is easy. Getting ten different network operating systems to consistently reach a usable operational state is where the real work starts.
Different platforms expose readiness in different ways. Some provide useful APIs. Some require operational checks. Some are technically booted long before they're actually ready to establish routing adjacencies or forward traffic. Some platforms behave consistently across versions, while others require changes to deployment logic as images evolve.
Supporting a new NOS was almost never a one-time effort. Getting the image deployed was usually the easy part. The harder work involved figuring out how to determine whether it was healthy, whether those checks were reliable, and whether they would continue to work after the next image update.
That gap between booted and ready meant the deployment pipeline needed platform-specific readiness checks simply to make environment creation predictable. As the number of supported platforms increased, so did the amount of platform-specific logic required to keep deployments reliable.
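To make the idea of platform-specific readiness checks concrete, here is a minimal sketch of one way to structure them. It is not how Skyforge implements them. Every name in it (wait_until_ready, PLATFORM_STAGES, vendor_a, vendor_b, and the stage names) is hypothetical, and the probes are simulated so the example runs without any devices. The assumption is that readiness is an ordered list of stages per platform, each polled until it passes or a shared deadline expires.

```python
import time
from dataclasses import dataclass


@dataclass
class ReadinessResult:
    node: str
    ready: bool
    failed_stage: str = ""  # empty when the node is ready


def wait_until_ready(node, stages, timeout=5.0, interval=0.1):
    """Poll each stage in order; a stage must pass before the next is tried."""
    deadline = time.monotonic() + timeout
    for name, check in stages:
        while not check(node):
            if time.monotonic() >= deadline:
                # Report which stage was still failing when time ran out.
                return ReadinessResult(node, False, name)
            time.sleep(interval)
    return ReadinessResult(node, True)


def make_flaky_check(passes_after):
    """Simulated probe that fails passes_after times, then succeeds."""
    calls = {"n": 0}

    def check(node):
        calls["n"] += 1
        return calls["n"] > passes_after

    return check


def never_ready(node):
    """Simulated probe for a stage that never becomes healthy."""
    return False


# Hypothetical per-platform stage lists; real checks would query the device.
PLATFORM_STAGES = {
    "vendor_a": [
        ("booted", make_flaky_check(1)),
        ("mgmt_plane", make_flaky_check(2)),
        ("routing_adjacencies", make_flaky_check(3)),
    ],
    "vendor_b": [
        ("booted", make_flaky_check(0)),
        ("routing_adjacencies", never_ready),
    ],
}

ok = wait_until_ready("r1", PLATFORM_STAGES["vendor_a"], timeout=2.0, interval=0.01)
bad = wait_until_ready("r2", PLATFORM_STAGES["vendor_b"], timeout=0.2, interval=0.01)
print(ok)
print(bad)
```

Returning the name of the failing stage, rather than a bare true or false, is what lets another engineer see that a node booted but never formed adjacencies without logging in to troubleshoot it.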
What started as a topology problem gradually became a software engineering problem.
The Reality of Multi-Vendor Emulation
Another assumption I had going into the project was that every major network platform would have a reasonably usable virtual equivalent.
That assumption didn't survive very long.
Virtual images exist across much of the industry, but availability, licensing, feature support, stability, and operational maturity vary considerably between vendors and even between product families from the same vendor. Some platforms have excellent virtual implementations that behave predictably and integrate cleanly into automation workflows. Others require considerably more effort. In some cases obtaining the image is the difficult part. In others, the image exists but has limitations that become obvious once you try to use it outside of a certification lab or proof-of-concept environment.
This isn't really a criticism of vendors. Many network operating systems were designed to run on specific hardware platforms and were never intended to become portable software artifacts. The challenge is that once you're trying to build realistic multi-vendor environments, image availability becomes a design constraint. Before you can emulate a network, you need something to emulate.
I also underestimated how often image availability would influence design decisions. Not every platform has a virtual equivalent, and not every virtual image supports the same features as its physical counterpart. In a few cases, the limiting factor wasn't compute resources or topology complexity; it was simply whether a usable image existed at all.
Beyond image availability, supporting a new NOS almost always created more work than expected. Every platform brings its own startup behavior, management interfaces, operational assumptions, and edge cases. The topology may be identical, but the amount of engineering required to make deployments predictable can vary dramatically depending on the platform involved.
Where Emulation Starts Breaking Down
In general, I found that traditional Layer 3 technologies translated reasonably well into virtual environments. OSPF, BGP, route redistribution, policy validation, and most common routing workflows rarely became major sources of friction. If the goal was validating routing behavior, testing automation, or reproducing a control-plane issue, emulation worked extremely well.
The challenges started appearing as I moved deeper into Layer 2 and modern data center networking.
VXLAN, EVPN, MLAG, vPC, multicast forwarding, MAC learning behavior, and various flood-and-learn mechanisms all introduced additional complexity. This isn't a criticism of any particular vendor or platform. Many of these technologies were designed around assumptions involving hardware forwarding behavior, timing characteristics, ASIC implementations, and scale considerations that are inherently difficult to reproduce in software.
The most interesting challenges tended to appear around those technologies. Layer 3 routing generally behaved as expected, but features such as EVPN, VXLAN, MLAG, and multicast often required much more careful validation. The question wasn't whether a feature could be configured; it was whether the resulting behavior was representative enough to support the conclusion being drawn.
For some use cases, the answer is absolutely yes. For others, the answer becomes "it depends."
That ended up being one of the recurring themes throughout the project. The question wasn't whether something could be emulated. The question was how much confidence you could place in the result.
Whenever network emulation platforms are discussed, somebody eventually asks how many nodes they can support. It's a reasonable question, but after spending six months working on Skyforge, I don't think it's the most important one.
The harder question is whether the environment behaves consistently. Can the same topology deploy successfully every time? Can multiple users run workloads simultaneously without affecting each other? Can failures be detected automatically? Can the platform explain why something failed? Can it recover cleanly when something goes wrong?
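One way to put a number on "can the same topology deploy successfully every time?" is to deploy it repeatedly and tally the outcomes by failure reason. The sketch below is a hypothetical illustration, not Skyforge's tooling: deploy_once stands in for a real deployment and simply draws a random outcome, and the failure reasons and probabilities are invented for the example.

```python
import random
from collections import Counter


def deploy_once(topology, rng):
    """Stand-in for a real deployment; returns (success, reason)."""
    roll = rng.random()
    if roll < 0.8:
        return True, ""
    if roll < 0.9:
        return False, "node never became ready"
    return False, "image pull failed"


def repeatability_report(topology, runs, seed=0):
    """Deploy the same topology several times and summarize the outcomes."""
    rng = random.Random(seed)  # seeded so the simulation is reproducible
    failures = Counter()
    successes = 0
    for _ in range(runs):
        ok, reason = deploy_once(topology, rng)
        if ok:
            successes += 1
        else:
            failures[reason] += 1
    return {
        "topology": topology,
        "runs": runs,
        "success_rate": successes / runs,
        "failures": dict(failures),
    }


report = repeatability_report("ring-5", runs=20)
print(report)
```

Grouping failures by reason matters more than the success rate itself, because it is the difference between a platform that fails and a platform that can explain why it failed.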
Those are the questions that determine whether an environment is useful outside of a personal lab. The challenge stops being "how many devices can I run?" and becomes "how much engineering effort is required to keep the environment trustworthy?" Those are very different problems.
The other realization was that the environment is never really finished. Network operating systems evolve. Images change. Licensing changes. Features appear and disappear. Vendors release new platforms. Existing platforms gain capabilities or introduce new limitations.
The engineering effort doesn't stop once the environment works. In many ways, that's when the long-term effort begins.
One thing I didn't appreciate at the beginning was that every new NOS effectively becomes something that needs to be maintained. New images need testing. Existing deployment logic needs validation. Readiness checks need updating. Supporting ten platforms isn't ten times harder than supporting one, but it's closer to that than most people expect.
A personal lab can tolerate a surprising amount of drift. A shared environment cannot. Once other people begin depending on the platform, every upgrade, image refresh, and topology change becomes part of an ongoing maintenance cycle.
This was probably the point where I stopped thinking about Skyforge as a lab and started thinking about it as a product.
What This Taught Me About Digital Twins
One of the reasons the digital twin discussions at Cisco Live caught my attention is that building Skyforge forced me to think carefully about what emulation is actually good at.
Network emulation is an incredibly valuable tool. It's one of the best ways to learn a platform, validate automation, test a design, or reproduce a problem in a controlled environment. I use it constantly and would have a hard time imagining modern network engineering without it.
At the same time, building an emulation platform reinforced something I hadn't fully appreciated before I started.
An emulated environment is ultimately a constructed representation of reality. Somebody has to decide which devices exist, how they're connected, what configurations get deployed, and what details are important enough to model.
For testing a design, validating automation, or answering a specific question, that's often exactly what you want. The environment only needs to be detailed enough to answer the question being asked.
Representing a large production environment that changes continuously is a different challenge entirely.
After spending six months building Skyforge, I've come away with a greater appreciation for both the power and the limitations of emulation. The biggest challenge wasn't getting virtual routers to boot. The biggest challenge was understanding where the virtual environment accurately represented reality, where it simplified reality, and where it diverged from reality altogether.
That's the part that doesn't show up on product datasheets, and it's the part I found myself thinking about most often.


