Network testbed

The primary goal of the network testbed is to inspect and verify the correctness of the network protocol and of the features implemented in the networking library. To achieve this goal, the testbed tooling should be capable of:

  • deploying and tearing down a network of nodes
  • inspecting a node's state
  • inspecting a node's connection map
  • observing connection / state changes of each node
  • calling network library API methods on individual nodes

With this foundation, a dynamic connection graph can be built. Each connection and vertex state change can be saved and displayed in real time (e.g. transfer rate histograms).

Each atomic change in the network is persisted, so the network state can be re-created and inspected later, at any point in time.
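
As a rough illustration, the persistence could be organized as an append-only event log that is folded to reconstruct the connection graph at a given instant (a sketch only; all type and field names below are made up for illustration):

```rust
use std::collections::HashSet;

/// Hypothetical atomic network event; variants and fields are illustrative.
#[derive(Clone)]
enum NetEvent {
    NodeStateChanged { node_id: String, state: String },
    ConnEstablished { from: String, to: String },
    ConnClosed { from: String, to: String },
}

/// Append-only, timestamp-ordered log of atomic changes. In the real
/// testbed the entries would be persisted; a Vec stands in here.
#[derive(Default)]
struct EventLog {
    entries: Vec<(u64, NetEvent)>, // (unix millis, event)
}

impl EventLog {
    fn append(&mut self, ts: u64, ev: NetEvent) {
        self.entries.push((ts, ev));
    }

    /// Re-create the set of live connections as of `ts` by folding over
    /// every event recorded up to that instant.
    fn connections_at(&self, ts: u64) -> HashSet<(String, String)> {
        let mut conns = HashSet::new();
        for (t, ev) in &self.entries {
            if *t > ts {
                break; // entries are appended in time order
            }
            match ev {
                NetEvent::ConnEstablished { from, to } => {
                    conns.insert((from.clone(), to.clone()));
                }
                NetEvent::ConnClosed { from, to } => {
                    conns.remove(&(from.clone(), to.clone()));
                }
                NetEvent::NodeStateChanged { .. } => {}
            }
        }
        conns
    }
}
```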

Overview

The testbed is composed of the Orchestrator application, orchestrator drivers for deploying and communicating with individual nodes, and extensions to the newly-developed network library.

localhost
┌───────────────────────────────────────────────────────────────┐
│ Orchestrator                          qemu                    │
│ ┌───────────────────────────────┐     ┌─────────────────────┐ │
│ │ API             driver        │     │                     │ │
│ │ ┌────────────┐ ┌────────────┐ │     │ ┌──┐ ┌──┐ ┌──┐ ┌──┐ │ │
│ │ │            │ │            │ │     │ │VM│ │VM│ │VM│ │..│ │ │
│ │ │ ┌────────┐ │ │ ┌────────┐ │ │     │ └──┘ └──┘ └──┘ └──┘ │ │
│ │ │ │   WS   │ │ │ │  qemu  │◄├─┼────►│                     │ │
│ │ │ └────────┘ │ │ └────────┘ │ │     └─────────────────────┘ │
│ │ │            │ │            │ │                             │
│ │ │            │ │            │ │     docker-compose          │
│ │ │ ┌────────┐ │ │ ┌────────┐ │ │     ┌─────────────────────┐ │
│ │ │ │  REST  │ │ │ │compose │◄├─┼────►│         ECS         │◄├──► ECS
│ │ │ └────────┘ │ │ └────────┘ │ │     └─────────────────────┘ │
│ │ │            │ │            │ │                             │
│ │ └────────────┘ └────────────┘ │                             │
│ │       ▲                       │     ┌─────────────────────┐ │
│ │       └───────────────────────┼────►│    web dashboard    │ │
│ │                               │     └─────────────────────┘ │
│ └───────────────────────────────┘                             │
│                                                               │
└───────────────────────────────────────────────────────────────┘

Orchestrator

The Orchestrator should support both large(-ish) scale testing (multiple machines) and feature testing on a single machine via a single interface, backed with interchangeable drivers.
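
A sketch of what such an interchangeable driver interface could look like (the trait and type names are assumptions for illustration, not the actual Orchestrator code):

```rust
/// Hypothetical driver abstraction: each backend (docker-compose + ECS,
/// qemu + iptables) implements the same deployment contract, so the
/// Orchestrator API stays identical across large-scale and local runs.
trait Driver {
    /// Deploy a network of nodes described by `cfg`; returns node handles.
    fn deploy(&self, cfg: &NetworkConfig) -> std::io::Result<Vec<NodeHandle>>;
    /// Tear the whole deployment down.
    fn teardown(&self) -> std::io::Result<()>;
}

/// Illustrative topology description.
struct NetworkConfig {
    server_nodes: usize,
    client_nodes: usize,
    nat_enabled: bool,
}

/// Address of a deployed node's WebSocket control endpoint, which the
/// Orchestrator uses to collect state and call library API methods.
struct NodeHandle {
    node_id: String,
    control_ws: String, // e.g. "ws://10.0.0.2:9000"
}
```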

The interface of the application is a hybrid RESTful JSON / WebSocket API (TBD: WS API only), intended for:

  • deploying nodes

    Via this endpoint, users choose the underlying driver and provide the network configuration and driver settings. The driver-specific deployment process is then executed.

    The Orchestrator starts the collection of node and connection state via WebSocket endpoints exposed by each individual node.

  • tearing down the network

  • inspecting node state

    Serves the stored node state data to the user.

  • inspecting connection state

    Serves the stored connection state data to the user.

  • calling exposed API endpoints on nodes

    Can be executed individually or in bulk. Used for establishing connections and sending network messages.

  • re-publishing node events

    Multiplexes events published by nodes. Supports filtering by event type.
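
By way of illustration, the deploy request and the re-published event envelope could be modeled along these lines (a serde-based sketch; field names and the wire format are assumptions, not a specified API):

```rust
use serde::{Deserialize, Serialize};

/// Body of a hypothetical deploy request: picks the underlying driver and
/// carries both the network topology and driver-specific settings.
#[derive(Serialize, Deserialize)]
struct DeployRequest {
    driver: String,              // e.g. "docker-compose-ecs" | "qemu"
    network: serde_json::Value,  // topology / NAT configuration
    driver_settings: serde_json::Value,
}

/// Envelope for node events multiplexed over the WebSocket endpoint;
/// subscribers can filter on the tagged event type.
#[derive(Serialize, Deserialize)]
#[serde(tag = "type")]
enum OrchestratorEvent {
    ServerConnStateChanged { node_id: String, state: String },
    NodeConnStateChanged { node_id: String, peer_id: String, state: String },
}
```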

Furthermore, a dashboard application can be built upon the API, supporting the following features:

  • initial network configuration, deployment and tear down
  • network graph view, updated in real-time
  • node list view, incl. their state, updated in real-time
  • graph edge and vertex state inspection, updated in real-time
  • establishing connections between nodes manually or using an input configuration file (JSON)
  • executing the testing features of the network library (e.g. initiating gftp transfers) individually or in bulk
  • log tailing and collection

Drivers

docker-compose + Amazon ECS

Suitable for large(-ish) scale testing.

Utilizes a pre-published Alpine Linux Docker image w/ SSH daemon and common network and system utilities.

Deployment:

  • create a docker-compose YAML file
    • parent network
    • multiple child networks if NAT is enabled
    • server node
    • client nodes
  • deploy w/ docker-compose using the Amazon Elastic Container Service driver
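
A minimal sketch of this driver, assuming the compose file is generated programmatically and deployed through an already-configured ECS-backed Docker context (image name, file layout and node counts are placeholders; child networks for NAT are omitted):

```rust
use std::io::Write;
use std::process::Command;

/// Generate a minimal docker-compose file: one parent network, a server
/// node and N client nodes. `testbed-alpine:latest` is a placeholder for
/// the pre-published Alpine image.
fn write_compose(path: &str, clients: usize) -> std::io::Result<()> {
    let mut yml = String::from(
        "version: \"3\"\n\
         networks:\n  parent: {}\n\
         services:\n  server:\n    image: testbed-alpine:latest\n    networks: [parent]\n",
    );
    for i in 0..clients {
        yml.push_str(&format!(
            "  client{i}:\n    image: testbed-alpine:latest\n    networks: [parent]\n"
        ));
    }
    std::fs::File::create(path)?.write_all(yml.as_bytes())
}

fn deploy(path: &str) -> std::io::Result<()> {
    write_compose(path, 4)?;
    // Assumes an ECS-backed Docker context was created and selected
    // beforehand (e.g. via `docker context create ecs` + `docker context use`).
    let status = Command::new("docker")
        .args(["compose", "-f", path, "up", "-d"])
        .status()?;
    assert!(status.success(), "compose deployment failed");
    Ok(())
}
```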

qemu + iptables

Linux only. Suitable for single machine feature testing of:

  • various NAT traversal techniques
  • correctness and performance of developed protocols

Utilizes a pre-built Alpine Linux VM image w/ SSH daemon and common network and system utilities.

Deployment:

  • set up an extra network iface for each node
  • set up a network bridge between the extra ifaces
  • set up each iface's iptables configuration, depending on the NAT type
  • spawn qemu:
    • configure 2 network interfaces:
      • extra iface as parent, fixed IP
      • host net iface as parent, fixed local net IP (for API communication)
    • mount working directories on local drive for log / state storage
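
A sketch of the spawn step, assuming the tap devices and the bridge were prepared beforehand (paths, image names and the exact qemu options are illustrative choices, not the testbed's actual configuration):

```rust
use std::process::{Child, Command};

/// Spawn one node VM with two NICs: a testbed iface attached to the
/// prepared tap/bridge, and a host-facing iface for API communication.
fn spawn_node(idx: usize) -> std::io::Result<Child> {
    Command::new("qemu-system-x86_64")
        .arg("-m").arg("512")
        .arg("-nographic")
        .arg("-drive").arg(format!("file=alpine-node{idx}.qcow2,format=qcow2"))
        // testbed iface, bridged with the other nodes' extra ifaces
        .arg("-netdev").arg(format!("tap,id=net0,ifname=tap{idx},script=no,downscript=no"))
        .arg("-device").arg("virtio-net-pci,netdev=net0")
        // host-facing iface for Orchestrator API traffic (user-mode
        // networking stands in here; the real setup parents the host iface)
        .arg("-netdev").arg("user,id=net1")
        .arg("-device").arg("virtio-net-pci,netdev=net1")
        // share a host working directory for log / state storage
        .arg("-virtfs").arg(format!("local,path=work/node{idx},mount_tag=work,security_model=none"))
        .spawn()
}

/// A basic source-NAT (MASQUERADE) rule for a node subnet; the actual
/// iptables setup varies with the NAT type being emulated.
fn masquerade(subnet: &str, out_iface: &str) -> std::io::Result<()> {
    let status = Command::new("iptables")
        .args(["-t", "nat", "-A", "POSTROUTING", "-s", subnet, "-o", out_iface, "-j", "MASQUERADE"])
        .status()?;
    assert!(status.success(), "iptables invocation failed");
    Ok(())
}
```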

Network library extensions

The network library should support building Client and Server binaries, which expose a (RESTful JSON +) WebSocket API for communication, external control and inspection.

Client

  1. features/bin (requires features/api-rest)

    Executable build toggle. Starts the HTTP API server.

  2. features/api-rest

    JSON API and WebSocket endpoints for the library API, including configuration. The message handler engine is shared between both kinds of endpoints.

    Inspection endpoints are enabled with features/inspect.

    Testing endpoints are enabled with features/testing.

  3. features/inspect (optional)

    If enabled, the library exposes an introspection API and gathers the following statistics and summaries (in-memory only; a rough data-model sketch follows this feature list):

    • state
      • time running
      • environment details
      • configuration (if any)
    • server connection details
      • connection state (+ time elapsed since)
        • connected
        • connecting
        • disconnected
        • disconnecting
      • last failure time
      • peak, avg, cur transfer rate
      • number of re-connections
      • number of protocol errors (inbound)
      • number of protocol errors (outbound)
    • established connections
      • protocol
      • mode
        • direct
        • relay + "server" address
        • nat + traversal method
      • source and destination addresses
      • connection time
    • established connection details
      • protocol
      • mode
      • source and destination addresses
      • connection time
      • last failure time
      • peak, avg, cur transfer rate
      • number of messages sent
      • number of re-connections (via node id)
      • number of protocol errors (inbound)
      • number of protocol errors (outbound)
    • log collection (opt. from a certain point in time)
  4. features/testing

    The testing module is responsible for "running" existing protocols (e.g. gftp) and / or generating deterministic network traffic between nodes.
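
As a rough sketch of the data model behind the features/inspect statistics listed above (all names are assumptions, not the library's actual types):

```rust
/// Transfer-rate summary kept for the server link and for each
/// established connection (e.g. bytes per second).
struct TransferRate { peak: f64, avg: f64, cur: f64 }

enum ConnState { Connected, Connecting, Disconnected, Disconnecting }

enum ConnMode {
    Direct,
    Relay { server_addr: String },
    Nat { traversal_method: String },
}

/// Per-connection details as enumerated above; counters are in-memory only.
struct ConnectionDetails {
    protocol: String,
    mode: ConnMode,
    src: std::net::SocketAddr,
    dst: std::net::SocketAddr,
    connected_at: std::time::SystemTime,
    last_failure: Option<std::time::SystemTime>,
    rate: TransferRate,
    messages_sent: u64,
    reconnections: u64, // tracked via node id
    protocol_errors_in: u64,
    protocol_errors_out: u64,
}
```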

Event propagation:

  • server connection state change
  • node connection state change (a single entry of "established connections")

Server

  1. features/bin

    CLI-configurable address and port.

  2. features/api-rest (requires features/inspect)

    JSON API and WebSocket endpoints for the library API (incl. configuration). Inspection endpoints are enabled along with features/inspect.

    Event propagation:

    • node connection state change (a single entry of "established connections")
  3. features/inspect (optional)

    In a similar fashion to the client:

    • state
      • peak, avg, cur transfer rate
    • established connections
    • established connection details
    • log collection