LIO and the TCMU Userspace Passthrough: The Best of Both Worlds
Andy Grover <[email protected]> @iamagrover
March 11, 2015

What is LIO?
Multi-protocol in-kernel SCSI target

Multi-protocol in-kernel SCSI target
Unlike other targets like IET, tgt, and SCST, LIO is entirely kernel code.

[Diagram: LIO architecture. User: targetcli and rtslib (configuration), reaching the kernel through /sys/kernel/config/target. Kernel: fabrics (iscsi, tcm_fc) receiving iSCSI commands; LIO core; backstores (block, file, pscsi) over /dev/vg0/vol0, disk.img, /dev/sdc.]

Why add userspace command handling?
● Enable wider variety of backstores without kernel code
● Clustered network storage filesystems like Ceph, GlusterFS, & other things that have shared libraries available
● File formats beyond .img, such as qcow2 & vmdk for more interesting features & compatibility
● SCSI devices beyond mass storage
● Enable experimentation just like FUSE did for filesystems

Userspace handling challenges: perf
● I/O latency
● I/O throughput
● Parallelism within a dev (OoO cmd completion)
● Parallelism across devs, we're good.

Userspace handling challenges: usability
● Configuration as simple as existing backstores
● Userspace daemon failure
● Userspace daemon activation/restart
● Balance ultimate flexibility with common use
● Avoid “multiple personalities”
● Reasonable resource usage

SCSI Command processing
[Diagram: User: targetcli and rtslib (configuration), plus a cmd handling daemon attached to uio0 and uio1. Kernel: /sys/kernel/config/target; fabrics (iscsi, tcm_fc) receiving iSCSI commands; LIO core; backstores block, file, pscsi, and tcm-user; storage /dev/vg0/vol0, disk.img, /dev/sdc. The diagram carries a "NEW!!" callout.]

User/Kernel Communication
Shared Memory Region Layout (not to scale)
● dev add/remove (netlink)
● read (wait for cmd)
● write (cmds done)
● mmap (get SMR)
● read to get configuration
[Diagram labels: cmd handling daemon; Mailbox (cmdr_off, cmdr_size, cmd_head, cmd_tail); Command Ring (cmd_entry: opcode, ...); /sys/class/uio/uio0; uio0]
[Diagram labels, continued: /dev/uio0; iovec[]; Data Area (... in/out data); tcm-user, fed from the LIO core]

tcm-user merged in 3.18
● Initial patch as simple as possible
● Performance tuning not done, BUT an interface flexible enough to not block likely perf optimization strategies
● Acceptance enabled phase 2: userspace usability pieces

Performance Opportunities for Later
● Larger vmalloc()ed shared mem region
● -> Demand-allocate pages to avoid bloat
● Complete commands out of order
● Must handle data area fragmentation
● Fabrics copy into already-mapped pages
● Block size == PAGE_SIZE preferred
● Just moves the cache misses?
● Fabrics don't have per-lun device queues anyway. Just IB reimplemented poorly?
● Userspace busywait on ring cmd_head

Our user-kernel API ended up flexible, but fraught with danger!
● Ring operations easy to mess up
● Make every handler write daemon boilerplate?
● And support Netlink?
● And maybe D-Bus???

tcmu-runner: A standard handler daemon
● Handle the messy ring bits
● Expose a C plugin API
● Implement library routines for common handler code, e.g. mandatory SCSI commands
● Permissively licensed: Apache 2.0
● Doubles as sample code for do-it-yourself-ers
● Needed a prototype daemon in any case

[Diagram: same stack as before, with tcmu-runner as the userspace daemon on uio0 and uio1 and handlers tape, vmdk, mmc, smc, and glfs plugged into it. User: targetcli, rtslib (configuration). Kernel: /sys/kernel/config/target; fabrics (iscsi, tcm_fc) receiving iSCSI commands; LIO core; backstores block, file, pscsi, tcm-user; storage /dev/vg0/vol0, disk.img, /dev/sdc. The diagram carries a "NEW!!" callout.]

Handlers Need Config Info
● All configuration should still be through standard LIO mechanisms (i.e. targetcli)
● User backstore create includes a URL-like “configstring” that gives the handler and per-device handler-specific stuff
● This is published in uio sysfs, and in the netlink add_device message
● Also has info to allow going from uio dev back to matching LIO backstore
● Handler needs attribs, block size, etc.
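To make the configstring idea concrete, here is a minimal Python sketch of how a handler daemon might split one into a handler name and a handler-specific remainder. The `handler/options` shape, the `parse_configstring` name, and the sample value are assumptions made for illustration; the slides only say the string is "URL-like", and this is not tcmu-runner's actual API.

```python
from typing import NamedTuple


class ConfigString(NamedTuple):
    handler: str  # which handler should serve the device (assumed first field)
    options: str  # opaque remainder, interpreted only by that handler


def parse_configstring(configstring: str) -> ConfigString:
    """Split a hypothetical 'handler/options' configstring at the first '/'."""
    handler, sep, options = configstring.partition("/")
    if not handler or not sep:
        raise ValueError(f"expected 'handler/options', got {configstring!r}")
    return ConfigString(handler, options)


if __name__ == "__main__":
    cfg = parse_configstring("file//tmp/disk.img")
    print(cfg.handler, cfg.options)
```

Splitting only at the first separator keeps the remainder opaque, so each handler stays free to define its own option syntax without the daemon needing to understand it.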
Config tools need Info on Handlers Too!
● Users should call “backstores/foo create x y z” not “backstores/user <configstring>”
● targetcli needs param and help strings for xyz
● Verify backstore params are correct before creating the device
● Must be loosely coupled – no hard dependencies on targetcli or tcmu-handler
● Solution: D-Bus!

[Diagram: same stack as before, with a DBus link between targetcli/rtslib and tcmu-runner. Handlers: tape, vmdk, mmc, smc, glfs. tcmu-runner attaches to uio0 and uio1. Kernel: /sys/kernel/config/target; fabrics (iscsi, tcm_fc) receiving iSCSI commands; LIO core; backstores block, file, pscsi, tcm-user; storage /dev/vg0/vol0, disk.img, /dev/sdc. The diagram carries a "NEW!!" callout.]

The QEMU Question
● QEMU has great support for many image formats and other backstores
● Can we reuse or integrate somehow?
● How?
● Build qemu handler code separately and integrate as a tcmu-runner handler?
● Extend qemu to implement TCMU directly, and possibly also enable it to configure LIO exports?
● Deferred for now

Getting Involved
● Give feedback!
● [email protected], [email protected]
● Check out tcmu-runner: https://github.com/agrover/tcmu-runner and its included sample handlers.
● Use github PRs and issue tracking.
● Much help needed, esp. QEMU hackers!
● Start doing some TCMU performance benchmarking
● Start thinking of interesting device types, userspace libraries to use, or weird things to do with a SCSI command sandbox

Thanks! Questions?

[Diagram labels: lsmcli (libstoragemgmt); targetcli; Remote; Local; targetd; rtslib; liblvm; tcmu-runner; User; Kernel; /sys/kernel/config/target; Fabrics (iscsi, tcm_fc); ISCSI commands; LIO core; backstores block, file, pscsi, tcm-user; /dev/vg0/vol0, disk.img, /dev/sdc; a "NEW!!" callout.]

User/Kernel Communication
Shared Memory Region Layout (not to scale)
● read (wait for cmd)
● write (cmds done)
● mmap (get SMR)
● read to get configuration
[Diagram labels: tcmu-runner; Mailbox (cmdr_off, cmdr_size, cmd_head, cmd_tail); Command Ring (cmd_entry: opcode, ...); /sys/class/uio/uio0; uio0]
[Diagram labels, continued: /dev/uio0; iovec[]; Data Area (... in/out data); tcm-user, fed from the LIO core]

SCSI
A set of standards for physically connecting and transferring data between computers and peripheral devices*
* http://en.wikipedia.org/wiki/SCSI

SCSI target
Initiator sends commands, Target handles them

Multi-protocol SCSI target
SCSI commands & data can be sent over many types of physical links and protocols. e.g:
● SCSI Parallel Interface (original)
● iSCSI (over TCP/IP)
● SAS (over SATA cables)
● Fibre Channel (over FCP)
● FCoE (over Ethernet)
● SRP (over Infiniband)
● SBP-2 (over Firewire)

[Diagram labels: tgtadm (Configuration); tgtd; libglfs; librbd; User; Kernel; iSCSI commands from initiator; Backed by local files or block devices; disk.img; /dev/vg0/vol0; a "NEW!!" callout.]

[Diagram labels: "Coming Soon!" callout; qemu-lio-tcmu with others, qcow2, vmdk, vdi, rbd*, glfs*; librbd; libglfs; tcmu-runner; User; Kernel; /sys/kernel/config/target; Fabrics (iscsi, tcm_fc); ISCSI commands; LIO core; backstores block, file, pscsi, tcm-user; /dev/vg0/vol0, disk.img, /dev/sdc.]
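The mailbox and command ring from the User/Kernel Communication slides can be sketched as a small simulation. This is Python over a plain bytearray, not the kernel ABI: the packed field layout, the three-byte entry header, and the names `drain_ring` and `build_demo_region` are all hypothetical, the opcode values are arbitrary, and entries are assumed never to straddle the end of the ring. It only illustrates the idea that the kernel advances cmd_head while userspace consumes entries and advances cmd_tail.

```python
import struct
from typing import List

# Hypothetical layouts for illustration only; the real kernel structures differ.
MAILBOX = struct.Struct("<IIII")  # cmdr_off, cmdr_size, cmd_head, cmd_tail
ENTRY_HDR = struct.Struct("<HB")  # entry length in bytes, opcode
TAIL_OFFSET = 12                  # byte offset of cmd_tail inside MAILBOX


def drain_ring(region: bytearray) -> List[int]:
    """Collect opcodes between cmd_tail and cmd_head, then publish the new tail."""
    cmdr_off, cmdr_size, head, tail = MAILBOX.unpack_from(region, 0)
    opcodes = []
    while tail != head:
        length, opcode = ENTRY_HDR.unpack_from(region, cmdr_off + tail)
        if length == 0:
            raise ValueError("corrupt ring entry with zero length")
        opcodes.append(opcode)
        tail = (tail + length) % cmdr_size  # wrap around the end of the ring
    struct.pack_into("<I", region, TAIL_OFFSET, tail)
    return opcodes


def build_demo_region() -> bytearray:
    """Build a fake shared region holding two 8-byte entries (opcodes 1 and 2)."""
    cmdr_off, cmdr_size = MAILBOX.size, 32
    region = bytearray(cmdr_off + cmdr_size)
    for index, opcode in enumerate((1, 2)):
        ENTRY_HDR.pack_into(region, cmdr_off + index * 8, 8, opcode)
    MAILBOX.pack_into(region, 0, cmdr_off, cmdr_size, 16, 0)  # head=16, tail=0
    return region


if __name__ == "__main__":
    print(drain_ring(build_demo_region()))
```

A real handler would map the region from the uio device instead of building it in memory, and would need memory barriers around the head and tail accesses, which is part of why the deck calls ring operations easy to mess up.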