The Open Source IRATI Prototype: design

The Open Source IRATI Prototype
design, implementation and future plans
28th January 2015 - Ghent
Francesco Salvestrini
Nextworks s.r.l.
Implementing RINA, previous prototypes …
• Before 2013, a few RINA prototypes had been implemented:
– ProtoRINA (https://github.com/ProtoRINA/users/wiki)
– Alba (closed source)
• (Design and implementation vary depending on the goals to accomplish)
• Pre-IRATI prototypes:
1. Focus on the validation of the architecture
2. Written in Java → residing in user-space
… previous RINA prototypes …
• Focus on concepts, not performance
• Constrained to the limitations imposed by the OS:
– e.g. inherit limitations of both the TCP/IP stack and the (POSIX) sockets API
[Figure: two hosts and a router, each with application processes, management agents, IPC Processes and shim IPC Processes, connected through a DIF on top of a shim DIF over TCP/UDP and a shim DIF over Ethernet. IPC Process components shown: IPC API; Data Transfer (SDU Delimiting, Relaying and Multiplexing, SDU Protection); Data Transfer Control (State Vector, Retransmission Control, Flow Control); Layer Management (RIB Daemon, RIB, CDAP Parser/Generator, CACEP, Authentication, Enrollment, Flow Allocation, Resource Allocation, Routing, Namespace Management, Security Management). User/kernel boundary with the legacy network stack and NICs in the kernel.]
… what was needed next?
[Figure: components of an IPC Process on a host, ordered by increasing timescale (functions performed less often): IPC API; Data Transfer (SDU Delimiting, Relaying and Multiplexing, SDU Protection); Data Transfer Control (State Vector, Retransmission Control, Flow Control); Layer Management (RIB Daemon, RIB, CDAP Parser/Generator, CACEP, Authentication, Enrollment, Flow Allocation, Resource Allocation, Routing, Namespace Management, Security Management).]
• Start thinking about performance
• Allow RINA to run on all the devices that current OSes support
• Move to a more mature prototype
Where did we start …
• We decided to implement (part of) the IPC Process functionalities in kernel-space …
[Figure: the same hosts, router, DIF and shim DIFs (over TCP/UDP and over Ethernet) as before, with the IPC Process components (IPC API, Data Transfer, Data Transfer Control, Layer Management) still to be split between user and kernel space.]
• Which ones? How do we split them? How do they communicate? How can we increase performance?
• … various possibilities …
What goes where?
• We placed SW components in different “paths”, depending on their timing requirements …
– Data transfer → stringent timings → kernel-space
– Layer Management → loose timings → user-space
[Figure: the same hosts, router, DIF and shim DIFs as before, with each IPC Process split across the user/kernel boundary. Components shown: IPC API, Data Transfer (SDU Delimiting, Relaying and Multiplexing, SDU Protection), Data Transfer Control (State Vector, Retransmission Control, Flow Control), Layer Management (RIB Daemon, RIB, CDAP Parser/Generator, CACEP, Authentication, Enrollment, Flow Allocation, Resource Allocation, Routing, Namespace Management, Security Management).]
• The data-transfer parts were going to reside in kernel-space …
Layer management & OS processes
• We decided to keep the layer management functionalities of each IPC Process Daemon in a separate OS process
– 1 OS process ↔ 1 IPC Process Daemon instance
• That approach aims at:
– A more “reliable” (SW) solution
• IPC Processes can have problems without interfering with each other (too much)
– Working tightly with the OS
• Let the OS do what it is for: manage the resources among its processes
• However, another entity was needed …
• IPC Manager:
– Manages the IPC Processes lifecycle
– Broker between applications and IPC Processes
– Local management agent
– …
[Figure: N IPC Process Daemons and 1 IPC Manager Daemon in user space, above the kernel.]
Inter-communications …
• OS Processes request services to the kernel via syscalls
– User originated (user → kernel)
– “Unicast”
• Modern *NIX systems extend the user/kernel communication mechanisms
– Netlink, uevent, devfs, procfs, sysfs etc.
• We needed a “bus-like” mechanism
– User OR kernel originated
– Unicast/Multicast/broadcast
• We adopted syscalls + Netlink
– Syscalls (fast-path):
• Bootstrapping the IPCP and then SDUs R/W (fast-path)
– Netlink (mostly slow-path):
• Management, configuration, notifications …
[Figure: applications, N IPC Process Daemons (layer mgmt.) and 1 IPC Manager Daemon in user space, connected to the kernel (IPCP Data Transfer) through syscalls and Netlink.]
Avoid (major) problems & abstract comms
• Syscalls are “wrapped” by libc (glibc on Linux)
– i.e. syscall(SYS_write, …) → write(…)
• Libraries are normally used to “hide” Netlink mechanisms (libnl family)
– However, they retain Netlink details
• (quite often) A change in the kernel/user API implies changes in user-space
• All applications in the OS are linked to glibc
– Changes to the syscalls → changes to glibc
• Breaking glibc could break the whole host
– Sandboxed environments are necessary
• Dependency invalidation → time-consuming compilations
• Changes of that sort are really hard to get approved upstream
• …
• We introduced librina as the initial way to overcome these problems …
[Figure: before, Application → libc → kernel; after, Application → libc and librina → kernel with RINA functions.]
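The libc wrapping mentioned above can be observed from user space. The sketch below is only an illustration, assuming a Linux (or other POSIX) host where `ctypes.CDLL(None)` exposes the C library already linked into the interpreter; the helper name `libc_write` is hypothetical and is not part of librina.

```python
import ctypes
import os

# Load the C library already linked into the interpreter (POSIX hosts only).
libc = ctypes.CDLL(None, use_errno=True)
libc.write.argtypes = [ctypes.c_int, ctypes.c_char_p, ctypes.c_size_t]
libc.write.restype = ctypes.c_ssize_t


def libc_write(fd: int, data: bytes) -> int:
    """Call libc's write() wrapper, which issues the write syscall for us."""
    written = libc.write(fd, data, len(data))
    if written < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return written
```

An application never needs to know the syscall number: it only sees `write()`. A new RINA syscall would need an equivalent wrapper somewhere, which is the role the slides assign to librina instead of glibc.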
Librina (HL) SW architecture
• It started as the placeholder for the common functionalities shared among IPC Process Daemon, IPC Manager Daemon and applications …
• … and became (on purpose) a framework
– An event-based, multi-threaded framework with bindings for interpreted languages (SWIG)
[Figure: librina architecture. API modules: common, cdap, faux-sockets, sdu-protection, ipc-process, ipc-manager, application. Core components: the framework with its Event Queue (eventPoll(), eventWait(), eventPost()), the NetlinkManager with its NetlinkSessions (nl_send() / nl_recv() over libnl / libnl_genl) and the syscall wrappers (syscall(SYS_*)), reaching RINA Netlink and RINA syscalls in the kernel. Annotated capabilities: configure the PDU Forwarding Table; create / delete EFCP instances; allocate resources to support a flow; allocate / deallocate flows; read / write SDUs to flows; register / unregister to 1+ DIF(s); creation, deletion, configuration.]
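The three event calls named on this slide (eventPost(), eventPoll(), eventWait()) suggest a producer/consumer queue. The sketch below is a minimal illustration of that pattern, not librina's code: the class name, the snake_case method names and the exact blocking semantics are assumptions.

```python
import queue
from typing import Any, Optional


class EventQueue:
    """Thread-safe FIFO of events, loosely modelled on the slide's Event Queue."""

    def __init__(self) -> None:
        self._events: "queue.Queue[Any]" = queue.Queue()

    def event_post(self, event: Any) -> None:
        """Producer side: enqueue an event (e.g. a parsed Netlink message)."""
        self._events.put(event)

    def event_poll(self) -> Optional[Any]:
        """Non-blocking: return the next event, or None if the queue is empty."""
        try:
            return self._events.get_nowait()
        except queue.Empty:
            return None

    def event_wait(self, timeout: Optional[float] = None) -> Any:
        """Blocking: wait for the next event (raises queue.Empty on timeout)."""
        return self._events.get(timeout=timeout)
```

A daemon's main loop would typically block on `event_wait()` while a separate reader thread calls `event_post()` for every message it receives.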
High level software architecture (1st take)
[Figure: on a host, the user-space daemons ipcmd and ipcpd (rinad, C++) and third-party SW packages (applications, possibly written in a language X through SWIG HL/LL wrappers and the language's native interface) sit on top of librina (C++; API (C), API (C++), Core (C++)), which uses libnl / libnl-gen, Netlink and syscalls to reach Linux with RINA extensions. IPC Process components shown: IPC API; Layer Management (RIB Daemon, RIB, CDAP Parser/Generator, CACEP, Authentication, Enrollment, Flow Allocation, Resource Allocation, Routing, Namespace Management, Security Management, Management Agent); Data Transfer Control (State Vector, Retransmission Control, Flow Control); Data Transfer (SDU Delimiting, Relaying and Multiplexing, SDU Protection); Shim IPC Process.]
Details on the user space framework
[Figure: user-space framework. IPC Manager Daemon (main logic, DIF allocator, local management agent, RIB & RIB Daemon), Normal IPC Process Daemon (layer management: RIB & RIB Daemon, resource allocation, flow allocation, enrollment, PDU Forwarding Table generation) and applications (application logic), each on top of librina and reaching the kernel through system calls, Netlink sockets and sysfs.]
• IPC Manager Daemon
– Manages the IPC Processes lifecycle
– Broker between applications and IPC Processes
– Local management agent
– DIF Allocator client (to search for applications not available through local DIFs)
• IPC Process Daemon
– Layer Management components of the IPC Process (RIB Daemon, RIB, CDAP parsers/generators, CACEP, Enrollment, Flow Allocation, Resource Allocation, PDU Forwarding Table Generation, Security Management)
IPC Manager Daemon
[Figure: IPC Manager Daemon (C++) internals. Elements shown: Bootstrapper (reads the configuration file, config classes); console classes (command line interface server thread, CLI sessions over a local TCP connection); main event loop (EventProducer.eventWait()); IPC Manager core classes (IPC Process Manager, Flow Manager, Application Registration Manager). Console and event loop call operations on the core classes and get operation results; the core classes call the IPC Process Factory, IPC Process or Application Manager in librina (message, model and event classes, Event Producer), which uses system calls and Netlink messages.]
IPC Process Daemon
[Figure: IPC Process Daemon (Java) internals. Elements shown: supporting classes (CDAP parser, Delimiter, Encoder); Layer Management function classes (Enrollment Task, Flow Allocator, Resource Allocator, Forwarding Table Generator, Registration Manager); RIB Daemon and Resource Information Base (RIB), with RIBDaemon.sendCDAPMessage() and RIBDaemon.cdapMessageReceived(); CDAP message reader thread (KernelIPCProcess.readMgmtSDU(), KernelIPCProcess.writeMgmtSDU()); main event loop (EventProducer.eventWait()); calls to IPCManager or KernelIPCProcess in librina (C++; message, model and event classes, Event Producer), which uses system calls and Netlink messages.]
High level software architecture (2nd take)
[Figure: user space: ipcpd, ipcmd (rinad) and third-party SW packages on top of librina (SWIG HL wrappers (language X), SWIG LL wrappers (C++, for language X), API (C), API (C++), Core (C++)) over libnl / libnl-gen. Kernel space: syscalls and Netlink (RNL) entry points, personality mux/demux, KIPCM core, KFA, IPCP factories, Normal IPC Process (EFCP, RMT, PFT), shim-eth-vlan, shim-dummy and RINA-ARP. The IPC Process components (IPC API, Data Transfer, Data Transfer Control, Layer Management) are mapped onto the kernel-space framework and the user-space framework.]
User/kernel interface: KIPCM + RNL
• Interface = syscalls + Netlink messages
• Kernel IPC Manager (KIPCM):
– Manages the syscalls
• Syscalls: a small, well-defined set of 8 calls:
– IPCs: ipc_create and ipc_destroy
– Flows: allocate_port and deallocate_port
– SDUs: sdu_read, sdu_write, mgmt_sdu_read and mgmt_sdu_write
• RINA Netlink Layer (RNL):
– Manages the Netlink part
• Abstracts message reception, sending, parsing and crafting
• Netlink: 36 message types (with dynamic attributes):
– assign_to_dif_req, assign_to_dif_resp, dif_reg_notif, dif_unreg_notif …
• Partitioning:
– Syscalls → KIPCM → “fast-path” (read and write SDUs)
– Netlink → RNL → “slow-path” (configuration and management)
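The partitioning above can be summarised as a lookup from operation name to the kernel component that handles it. The sketch below is illustrative only: the syscall and message names come from the slide, while the dictionary layout and the function `handler_for` are hypothetical, and only the four Netlink message types named on the slide (out of 36) are listed.

```python
# Syscall names as listed on the slide, grouped the same way.
SYSCALLS = {
    "ipcs": ("ipc_create", "ipc_destroy"),
    "flows": ("allocate_port", "deallocate_port"),
    "sdus": ("sdu_read", "sdu_write", "mgmt_sdu_read", "mgmt_sdu_write"),
}

# Only the Netlink message types named on the slide; the full set is larger.
NETLINK_MESSAGES = (
    "assign_to_dif_req",
    "assign_to_dif_resp",
    "dif_reg_notif",
    "dif_unreg_notif",
)


def handler_for(operation: str) -> str:
    """Return the kernel component that would handle the operation."""
    if any(operation in names for names in SYSCALLS.values()):
        return "KIPCM"  # syscall path
    if operation in NETLINK_MESSAGES:
        return "RNL"  # Netlink path
    raise ValueError(f"unknown operation: {operation}")
```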
From recursion to iteration: KIPCM & KFA
• The Kernel Flow Allocator (KFA)
– Manages ports and flows
– Ports
• Flow handler
• Port ID Manager
– Flows
• maps: port-id → ipc-process-instance
• The KIPCM:
– Manages the lifecycle of the IPC Processes
– Abstracts IPC Process instances
• Same API for all the IPC Processes regardless of the type
• maps: ipc-process-id → ipc-process-instance
• Recursion in kernel-space considered harmful
• They are the point where “recursion” is transformed into “iteration”
[Figure: user space reaches the KIPCM and the KFA through syscalls and Netlink; below them, the Normal IPCP interface (EFCP, RMT, PFT) and the Shim IPCP, with IN and OUT paths.]
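One way to picture the recursion-to-iteration step is a loop that walks the port-id → IPC Process map instead of letting each IPC Process call the one below it. The sketch below is an assumption about the general idea, not the kernel implementation: the classes, the `lower_port` field and the header format are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class IPCProcess:
    """Hypothetical stand-in for a kernel IPC Process instance."""

    name: str
    lower_port: Optional[int]  # port-id of the N-1 flow; None for a shim

    def sdu_write(self, sdu: bytes) -> bytes:
        # Stand-in for EFCP/RMT processing: prepend a per-layer header.
        return self.name.encode() + b"|" + sdu


class KFA:
    """Maps port-id -> IPC Process instance and drives the stack iteratively."""

    def __init__(self) -> None:
        self.flows: Dict[int, IPCProcess] = {}

    def bind(self, port_id: int, ipcp: IPCProcess) -> None:
        self.flows[port_id] = ipcp

    def sdu_write(self, port_id: int, sdu: bytes) -> bytes:
        next_port: Optional[int] = port_id
        # A loop replaces "IPCP N calls IPCP N-1 calls IPCP N-2 ...".
        while next_port is not None:
            ipcp = self.flows[next_port]
            sdu = ipcp.sdu_write(sdu)
            next_port = ipcp.lower_port
        return sdu  # what the shim would hand to the underlying technology
```

With a loop, the call depth stays constant however many DIFs are stacked, which matters when the kernel stack is small.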
Recursion and IPC Processes i/f
• The architecture describes
– The (Normal) IPC Processes
– The Shim IPC Processes
• W.r.t. “DIF stacking”
– Normal IPC Processes:
• Have “compatible” NB/SB interfaces
• Have “full-fledged” functionalities
– Shim IPC Processes:
• Have a “compatible” NB interface
• They wrap the technology they are laid over
– Minimum veneer over legacy technologies!
• They don’t have a “SB” interface
[Figure: DIF stacking. A Shim IPC Process (level 0) sits on the hardware, with (Normal) IPC Processes at levels 1 and 2 above it.]
Normal & Shim IPC Processes
• The stack provides the implementation of the “normal” IPC Process
– DTP, DTCP, RMT, PDU Forwarding Table functionalities
• There are currently 4 shims implemented:
– shim-dummy:
• Confined to a single host (“loopback”)
• Used for debugging & testing the stack
– shim-eth-vlan:
• Runs over 802.1Q
• Uses our own ARP implementation
• Offers 1 unreliable QoS cube
• VLAN-id = DIF name
– shim-tcp-udp:
• Allows RINA to run “over” TCP/UDP
• Offers 2 QoS cubes:
– Reliable: mapped over a TCP socket (each flow, a different socket)
– Unreliable: mapped over a UDP socket (1 socket for all the flows)
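The two QoS cubes of shim-tcp-udp imply a simple flow-to-socket mapping: one TCP socket per reliable flow, one shared UDP socket for every unreliable flow. The sketch below only models that bookkeeping with string handles instead of real sockets; the class, method and handle names are hypothetical.

```python
import itertools
from typing import Dict


class ShimTcpUdpFlows:
    """Tracks which (hypothetical) socket handle carries each flow."""

    UDP_SOCKET = "udp-0"  # single socket shared by all unreliable flows

    def __init__(self) -> None:
        self._tcp_ids = itertools.count(1)
        self.socket_for_port: Dict[int, str] = {}

    def allocate(self, port_id: int, reliable: bool) -> str:
        if reliable:
            # Reliable cube: each flow gets its own TCP socket.
            handle = f"tcp-{next(self._tcp_ids)}"
        else:
            # Unreliable cube: every flow reuses the same UDP socket.
            handle = self.UDP_SOCKET
        self.socket_for_port[port_id] = handle
        return handle
```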
Shim IPC Processes (cont.)
• shim-hv:
– Allows the stack to run in virtualised environments
• QEMU/KVM and Xen
– Works only with “shared memory” buffers (VMPI/VirtualQueues)
– Offers 1 QoS cube
– This shim is enough to allow RINA to take advantage of HV environments
• Get rid of software bridges and the TCP/IP stack!
The Open Source initiative
• After almost 2 years of continuous development the code-base was made available as Open Source material on GitHub:
– http://github.com/IRATI
• It provides the implementation of the following (major) functional blocks:
– IPC Manager daemon
• Manages IPC Processes lifecycle
• Broker between applications and IPC Process
• DIF allocator client (to search for applications not available through local DIFs)
– IPC Process daemon
• Transport and management layers
• Has routing functionalities (link-state based routing)
• Provides unreliable and reliable flows functionalities
– A set of shims:
• shim-eth-vlan
• shim-hv (KVM/QEMU & Xen flavours)
• shim-tcp-udp
• shim-dummy (testing)
– A library for building native-RINA applications
– A testing/debugging framework
• Regression (runs at build-time, installation-time …)
• A testing application: rina-echo-time
Ongoing works …
• The stack:
– Implements the core functionalities of the RINA architecture
– Its policies are hardwired ...
• ... we needed to enable the customization capabilities provided by the architecture
• Build on the stack to mature a RINA SDK
– Define the API for each SW component that has a policy
– Allow extension modules to be plugged in and out of the prototype
– Allow the prototype to dynamically load and accept changes to its behaviour at runtime
Pluggin’ policies, places
[Figure: policy hooks attached to the IPC Process components (IPC API, Data Transfer, Data Transfer Control, Relaying and Multiplexing, SDU Delimiting, SDU Protection, Layer Management), across user and kernel space. Policy names shown: RcvrInactivityTimer, SndrInactivityTimer, InitialSequenceNumber, TransmissionControl, Authentication, RTTExtimation, SenderACK, MonitorNMinus1Flow, NMinus1FlowDown, Checksum, Compression, Encryption, TTL, MaxQ, RMTQMonitor, RMTScheduling, NewFlowRequest, AllocateRetry, RoutingAlgorithm, NewMemberAccessControl, NewFlowAccessControl, RIBAccessControl.]
• ... policies are in both spaces ...
RINA Plugins Infrastructure
• The RINA Plugins Infrastructure (RPI)
• Plugin = policy code + framework
• The “framework” is ... all the functionalities required to use custom policies in the stack:
– Workflow: Load, plug, select, unselect, unplug and unload
• Since the stack is split in two halves ...
– RPI must comply with both kernel and user spaces characteristics ...
• ... RPI must be split in two as well:
– Kernel RPI (kRPI) → leverages on LKM
– User RPI (uRPI) → leverages on SO
• Policy set = A set of policies (in the same SW component) that can share state
– This way: different policies - in the same component - can share state in a plugin-specific way
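The plugin workflow listed above (load, plug, select, unselect, unplug, unload) reads naturally as a small state machine. The sketch below is one possible reading: the state names and the strict ordering of the transitions are assumptions, not something the slide specifies.

```python
# (current state, action) -> next state; the ordering is assumed from the
# sequence "load, plug, select, unselect, unplug, unload" on the slide.
TRANSITIONS = {
    ("unloaded", "load"): "loaded",
    ("loaded", "plug"): "plugged",
    ("plugged", "select"): "selected",
    ("selected", "unselect"): "plugged",
    ("plugged", "unplug"): "loaded",
    ("loaded", "unload"): "unloaded",
}


class PluginLifecycle:
    """Tracks where a (hypothetical) policy-set plugin is in its workflow."""

    def __init__(self) -> None:
        self.state = "unloaded"

    def apply(self, action: str) -> str:
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {action!r} while {self.state!r}")
        self.state = TRANSITIONS[key]
        return self.state
```

Rejecting out-of-order actions (for example, unloading a plugin whose policy set is still selected) is the kind of check the framework half of a plugin would need to make.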
Components addressing
• Address of an IPC Process component in a processing system:
• IPC Process ID (uint)
• Path in the IPC Process component tree
• Example:
• The custom passwd policy-set for the Security Manager is addressed by
• Security-manager.passwd
... and routing (between spaces) ...
• Commands (e.g. select a behaviour or set a value) are turned into Netlink request messages
• Requests are routed to user-space or kernel-space
– depending on the addressed component
• Response messages are received back
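Component addressing and routing between spaces can be sketched together: parse the (IPC Process ID, dotted path) pair, then pick a destination from the first path element. Everything below is illustrative: the function names, the lowercase path spelling and in particular the set of kernel-side component names are assumptions.

```python
from typing import NamedTuple, Tuple

# Assumed names for components living in kernel space; all others are
# treated as user-space components in this sketch.
KERNEL_COMPONENTS = frozenset({"efcp", "rmt"})


class ComponentAddress(NamedTuple):
    ipcp_id: int
    path: Tuple[str, ...]


def parse_address(ipcp_id: int, path: str) -> ComponentAddress:
    """Split a dotted path such as 'security-manager.passwd'."""
    if ipcp_id < 0:
        raise ValueError("IPC Process ID must be unsigned")
    parts = tuple(path.split("."))
    if not all(parts):
        raise ValueError(f"empty element in path: {path!r}")
    return ComponentAddress(ipcp_id, parts)


def route(address: ComponentAddress) -> str:
    """Decide where a request for this component would be delivered."""
    return "kernel" if address.path[0] in KERNEL_COMPONENTS else "user"
```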
… Next steps
• Improvements and new functionalities:
– Short term:
• Export a subset of the policies
– Medium term:
• Consolidate the RPI framework & export a larger set of policies
• Librina
– Bindings for interpreted languages (Java, Python)
– Subsetting: librina-rib, librina-application, …
• A RINA traffic generator
• A multi-node configuration building tool
– Medium/long term:
• Implement a Management Agent
• Minimise user-/kernel-space differences w.r.t. writing policies
• (in parallel)
– Keep the implementation in sync with the specs
– Hardening, cleanup, better performance, lower memory consumption, …
Have a look && join us!
http://irati.github.io/stack
https://github.com/IRATI
http://www.freelists.org/list/irati
Thanks!