Key Competence in Computer Science
Schlüsselqualifikationen für Informatiker
Stefan Klinger
Databases and Information Systems, University of Konstanz
Winter 2015

0 · Prelude

0.1 What this is all about

• Having fun with cool software!
• Show you the UNIX toolbox:
  - Unix-like environments & the shell.
  - The usual command-line suspects (e.g., GNU coreutils).
  - Editors, and text encoding.
  - Writing papers with LaTeX.
  - Secure Shell, and cryptography.
  - Shell scripting.
  - Source code management with Subversion.
  - ...
• Not in this order!
  1. Short term: Use LaTeX and Subversion to hand in your exercises.
  2. Long term: Become a proficient (Unix) user. (cf. page 13)

This course is ...
• ... “the same” as last semester.
• ... not an official “Schlüsselqualifikation” course.

Stefan Klinger · DBIS · Key Competence in Computer Science · Winter 2015

0.2 Personnel

Prof. Marc Scholl
  Chair for Databases and Information Systems (DBIS)
  web: http://dbis.uni-konstanz.de/
  office: PZ811

Stefan Klinger (I give this lecture)
  mail: [email protected]
  office: PZ804

Claudia Bartholt (tutor)
  mail: [email protected]
  pool: V304, Thu 15:15–17:00

Benjamin Stauss (tutor)
  mail: [email protected]
  pool: V304, Fri 15:15–17:00

0.3 Coordinates

material: https://svn.uni-konstanz.de/dbis/sq_15w/pub/
  This will be updated on a regular basis.

lecture: Monday, 10:00–11:30, M629

tutorials: There are no tutorials.
  • The tutors attend the pools regularly (cf. previous slide).
  • Seeing the tutors is not mandatory; go only if you need help.

credits: One very simple assignment every week.
  • Released every Monday, due on the next Monday, 9:45 a.m.
  • To pass, you need to achieve at least 50% of all exercise points, and not more than 3 assignments may be graded < 10%.
  • You will work on the exercises in teams, cf. next slide.
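The passing rule from the credits item can be written down as a small check. This is only a sketch under stated assumptions, not an official grading tool: the function name `passes_course` and the `(achieved, maximum)` input format are hypothetical, "50% of all exercise points" is read as total achieved points over total available points, and "graded < 10%" is read as less than 10% of that assignment's own maximum.

```python
from typing import Sequence, Tuple


def passes_course(results: Sequence[Tuple[int, int]]) -> bool:
    """Check the passing rule for a list of (achieved, maximum) points.

    Assumed reading of the rule (not confirmed by the slides):
      * at least 50% of all available exercise points overall, and
      * at most 3 assignments graded below 10% of their own maximum.
    """
    total_achieved = sum(achieved for achieved, _ in results)
    total_possible = sum(maximum for _, maximum in results)
    if total_possible <= 0:
        # No assignments graded yet: nothing has been passed.
        return False

    # Integer comparison avoids floating-point rounding at the 10% boundary.
    weak_assignments = sum(
        1 for achieved, maximum in results if 10 * achieved < maximum
    )

    return 2 * total_achieved >= total_possible and weak_assignments <= 3
```

For example, skipping three assignments entirely and getting full marks on the rest still passes under this reading, while skipping a fourth does not, no matter how high the overall total is.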
How much work is it?

• This course: 3c (unit c for credits).
• ECTS¹ says: 1c ≡ 30h (unit h for hours).
• This semester: 15w (unit w for weeks).

\[
  \frac{3\,\mathrm{c} \cdot 30\,\frac{\mathrm{h}}{\mathrm{c}}}{15\,\mathrm{w}} = 6\,\frac{\mathrm{h}}{\mathrm{w}}
\]

• Rule of thumb: Double the lecture times for your homework.
  - 2 h/w for the lecture (actually, it's only 1.5h), plus
  - 4 h/w to post-process the lecture, get help from the tutors, and solve the exercise.

¹ http://en.wikipedia.org/wiki/European_Credit_Transfer_and_Accumulation_System

0.4 Registration (Important)

This week
  Form groups of two as described in the 1st assignment². Deadline: next Monday.

“Prüfungsanmeldung” (exam registration)
  You have to sign up (binding!) for this course via StudIS³ during the registration period⁴. Cf. the information provided by the faculty.

² https://svn.uni-konstanz.de/dbis/sq_15w/pub/assignment01.pdf
³ https://studis.uni-konstanz.de/
⁴ http://www.informatik.uni-konstanz.de/studieren/studium/pos-pruefungsinformationen/pruefungsanmeldung/

1 Toolbox basis

https://en.wikipedia.org/wiki/Live_Free_or_Die

1.1 Why use Unix?

• Unix [pl. Unices]: rather a generic term today; dates back to 1969 at the AT&T Bell Labs.
• The trademark UNIX is owned by The Open Group. It must be used solely for systems certified according to the Single UNIX Specification.
• Similar systems make up the family of unixoid or unix-like operating systems, e.g.,
  - GNU/Linux
  - Android
  - the BSD family
  - Mac OS X
  - ...
• Standardising documents: POSIX, Linux Standard Base, etc.

⇒ The “Unix Idea” is wide-spread and well-established.
[Figure: excerpt of the Unix history diagram, July 2013, from http://www.levenez.com/unix/]

Unix history

[Figure: timeline of Unix and Unix-like system releases, with a year axis starting at 1969. Systems shown include the UNIX Time-Sharing System editions, the BSD family (FreeBSD, NetBSD, OpenBSD, DragonFly BSD), SunOS/Solaris, HP-UX, AIX, IRIX, Minix, Linux, GNU/Hurd, Darwin/Mac OS X/iOS, and Android.]

Note 1: an arrow indicates an inheritance like a compatibility; it is not only a matter of source code.
Note 2: this diagram shows complete systems and [micro]kernels like Mach, Linux, the Hurd... This is because sometimes kernel versions are more appropriate to see the evolution of the system.
2004 HP-UX 11.23/11iv2/0403 march 2004 PC-BSD 0.7.8 july 18, 2005 PC-BSD 0.7 may 18, 2005 GNU-Darwin 1.1 rc2 september 29, 2004 Darwin 8.0b1 september 2004 Solaris 9 OE 9/04 august 16, 2004 IRIX 6.5.25 august 4, 2004 SCO UnixWare 7.1.4 Maintenance Pack 1 july 2004 Linux 2.6.8 august 13, 2004 Diamond SVR6 (announced) august 3, 2004 Linux 2.4.27 august 7, 2004 2006 NetBSD 2.0.2 april 15, 2005 NetBSD 2.0 december 9, 2004 FreeBSD 4.11 january 25, 2005 FreeBSD 5.3-BETA1 august 22, 2004 Darwin 7.5 august 10, 2004 Mac OS X 10.3.5 august 9, 2004 Mac OS X 10.3.5 Server august 9, 2004 OpenServer 5.0.7 Update Pack 3 july 9, 2004 Linux 2.6.7 june 15, 2004 Linux 2.4.26 april 14, 2004 z/OS, z/OS.e Unix V1R5 march 26, 2004 NetBSD 2.0 RC5 november 12, 2004 FireFly BSD 1.0 september 2004 Triance OS 1.0-BETA august 23, 2004 GNU-Darwin 1.1 rc1 august 17, 2004 OpenDarwin 7.2.1 july 16, 2004 Mac OS X 10.4 Server (Tiger beta) june 28, 2004 SCO UnixWare 7.1.4 june 15, 2004 Linux 2.6.0-test11 november 26, 2003 Linux 2.4.22 august 25, 2003 DragonFly BSD 1.0A july 15, 2004 MirBSD #7quater june 14, 2004 Darwin 7.4 may 26, 2004 Mac OS X 10.4 (Tiger beta) june 28, 2004 Solaris 9 OE 4/04 april 1, 2004 Tru64 Unix V5.1B-2 may 2004 Unicos/mp 2.4 march 2004 IRIX 6.5.23 IRIX 6.5.24 february 4, 2004 may 5, 2004 NonStop-UX C63 february 6, 2004 OpenServer 5.0.7 Update Pack 2 february 18, 2004 Debian GNU/Hurd K6 may 9, 2004 Debian GNU/Hurd K5 november 24, 2003 Linux 2.6.0-test1 july 13, 2003 HP-UX 11.11/11iv1 0306 june 2003 Linux 2.5.75 july 10, 2003 2005 ekkoBSD 1.0 BETA 2 july 7, 2004 DragonFly DragonFly BSD 1.0-RC1 BSD 1.0 june 28, 2004 july 12, 2004 FreeBSD 4.10 may 27, 2004 FreeBSD 5.2.1 february 25, 2004 OpenBSD 3.5 may 1, 2004 Darwin 7.1 Mac OS X 10.3.1 november 10, 2003 Mac OS X 10.3.1 Server november 10, 2003 IRIX 6.5.22 november 5, 2003 SCO UnixWare 7.1.3 /OKP july 31, 2003 Linux 2.5.70 may 26, 2003 NetBSD 2.0 RC1 september 27, 2004 Silver OS july 10, 2004 NetBSD 1.6.2 february 29, 2004 
ekkoBSD BETA 2 february 18, 2004 DragonFly BSD (beta) march 5, 2004 4.3BSD-Quasijarus0c february 15, 2004 FreeBSD 5.2-RC1 december 10, 2003 QNX 6.3 june 3, 2004 Darwin 7.0.1 november 14, 2003 Solaris 9 OE 12/03 december 2003 Tru64 Unix V5.1B-1 october 20, 2003 Unicos/mp 2.3 october 2003 OpenServer 5.0.7 Update Pack 1 july 31, 2003 Debian GNU/Hurd K4 july 29, 2003 Linux 2.5.68 april 19, 2003 Linux 2.2.25 march 17, 2003 Minix 2.0.3 may 22, 2001 Monterey beta AIX 4.3.3 september 17, 1999 FreeBSD 5.2-BETA november 26, 2003 MirBSD #7ter november 22, 2003 MirBSD #7bis october 4, 2003 OpenBSD 3.4 november 1, 2003 GNU-Darwin 1.1 october 8, 2003 MirBSD #7semel september 28, 2003 OpenBSD 3.4 beta august 11, 2003 Solaris 9 OE 8/03 july 29, 2003 Solaris 10 Preview july 29, 2003 Unicos/mp 2.2 july 2003 Linux 2.4.21 june 13, 2003 HP-UX 11.11/11iv1/0303 march 2003 HP-UX 11.11/11iv1/0212 december 2002 Linux 2.5.48 november 18, 2002 Mac OS X 10.2.7 august 18, 2003 Mac OS X Server 10.3 beta (Panther) june 23, 2003 SCO UnixWare 7.1.3 Update Pack 1 may 8, 2003 Linux 2.4.20 november 28, 2002 z/OS, z/OS.e Unix V1R4 september 27, 2002 HP-UX 11.11/11iv1/0209 september 2002 Linux 2.2.22 sept. 
16, 2002 MirBSD #6 july 8, 2003 OpenDarwin 6.6.1 may 27, 2003 IRIX 6.5.20 may 7, 2003 Debian GNU/Hurd K3 april 30, 2003 SCO UnixWare 7.1.3 december 4, 2002 Unicos 10.0.1.2 may 2003 MkLinux Pre-R2 august 5, 2002 Linux 2.5.30 august 1, 2002 HP-UX 11.22 aka 11iv1.6 (IA) august 2002 Linux 2.2.21 may 20, 2002 4.3BSD-Quasijarus0b december 7, 2003 MicroBSD 0.7 beta october 27, 2003 FreeBSD 5.1 june 9, 2003 MirBSD #5 june 11, 2003 Darwin 6.6 may 14, 2003 Mac OS X Server 10.2.6 may 8, 2003 Solaris 9 OE 4/03 april 2003 Tru64 Unix V5.1B january 20, 2003 IRIX 6.5.18 november 8, 2002 SCO UnixWare 7.1.3 (announced) august 26, 2002 Linux 2.4.19 august 3, 2002 z/OS, z/OS.e Unix V1R3 march 29, 2002 HP-UX 11.11/11iv1/0203 march 2002 Linux 2.5.5 february 19, 2002 ekkoBSD 1.0 BETA1B november 25, 2003 FreeBSD 4.9 october 28, 2003 2.11BSD patch 444 february 10, 2003 FreeBSD 5.0 FreeBSD 5.0 DP 2 january 19, 2003 november 18, 2002 MirBSD #4 MirBSD #1 MirBSD #3 MirBSD #2 april 16, 2003 november 31, 2002 march 2, 2003 january 28, 2003 OpenBSD 3.2 OpenBSD 3.3 november 1, 2002 may 1, 2003 GNU-Darwin 1.0 january 10, 2003 QNX 6.2.1 (Momentics) february 18, 2003 OpenDarwin-20030212 february 17, 2003 Darwin Darwin 6.0.2 Darwin 6.5 Darwin 6.2 Darwin 6.4 Darwin 6.3 6.1 oct. 
28, 2002 april 15, 2003 Mac OS X Mac OS X Mac OS X 10.2.3 Mac OS X 10.2.4 Mac OS X 10.2.2 10.2.5 10.2.6 december 19, 2002 february 13, 2003 november 11, 2002 april 10, 2003 may 6, 2003 Solaris 8 12/02 december 2002 Mac OS X Server 10.2 august 13, 2002 IRIX 6.5.17 august 7, 2002 Debian GNU/Hurd J1 august 5, 2002 Open UNIX 8 MP4 Release 8.0 july 3, 2002 Linux 2.4.18 february 25, 2002 Linux 2.5.3 january 30, 2002 ekkoBSD august 6, 2003 FreeBSD 4.8 april 3, 2003 FreeBSD 4.7 october 10, 2002 MicroBSD 0.6 october 12, 2002 MicroBSD 0.5 august 14, 2002 QNX 6.2 (Momentics) june 4, 2002 Mac OS X 10.1.5 june 4, 2002 Mac OS X Server 10.1.5 july 1, 2002 Solaris 9 OE may 22, 2002 Yamit (alpha) may 5, 2002 IRIX 6.5.16 may 8, 2002 NonStop-UX C60 may 3, 2002 GNU (GNU/Hurd, GNU Mach 1.3) Plan 9 r4 may 27, 2002 april 28, 2002 Open UNIX 8 MP3 Release 8.0 february 12, 2002 Unicos 10.0.1.1 may 2002 Linux 2.4.7 july 20, 2001 z/OS Unix System Services V1R1 march 30, 2001 HP-UX 11.11/11iv1/0106 june 2001 Linux 2.2.19 march 25, 2001 OpenBSD 3.1 may 19, 2002 Darwin 5.4 Mac OS X Server 10.1.4 april 15, 2002 Tru64 Unix V5.1A september 2001 IRIX 6.5.12 may 9, 2001 Unicos 10.0.1.0 june 2001 Linux 2.4.0 january 4, 2001 HP-UX 11.11 aka 11iv1 december 2000 Linux 2.0.39 january 9, 2001 Linux 2.2.13 october 19, 1999 Mac OS X 10.1.4 april 17, 2002 Solaris 8 2/02 february 2002 Mac OS X Server 10.1.2 january 17, 2002 Solaris 9 EA october 2, 2001 Solaris 9 alpha IRIX 6.5.11 february 2, 2001 UnixWare 7.1.1 DCFS november 27, 2000 Unicos 10.0.0.8 november 22, 2000 Linux 2.4.0 test12 december 12, 2000 Security-Enhanced Linux 1.0 december 22, 2000 Linux 2.0.38 august 25, 1999 NetBSD 1.6 sept. 14, 2002 FreeBSD 4.6 june 15, 2002 MicroBSD 0.1 july 14, 2002 FreeBSD 5.0 Developer Preview 1 april 8, 2002 Darwin 1.4.1 october 1, 2001 Solaris 8 7/01 july 2001 Mac OS X Server 10.0.4 july 3, 2001 GNU-Darwin (beta 2.5) march 12, 2002 QNX RTOS 6.1.0 patch A september 28, 2001 Mac OS X 10.1 (Puma) sept. 
29, 2001 Mac OS X 10.0.4 june 22, 2001 Solaris 8 4/01 may 2001 Mac OS X Server 10.0.3 may 21, 2001 IRIX 6.5.10 november 8, 2000 Dynix/ptx 4.5.3 october 2001 Unicos 10.0.0.7 january 2000 OS/390 Unix V2R8 september 24, 1999 Linux 2.3.0 may 11, 1999 NetBSD 1.6 beta may 28, 2002 NetBSD 1.5.3 july 22, 2002 FreeBSD 4.5 january 29, 2002 OpenBSD 3.0 november 27, 2001 QNX RTOS 6.1.0 Darwin 1.3.1 april 13, 2001 ReliantUnix 5.45 2000 IRIX 6.5.9 august 9, 2000 OpenServer 5.0.6 august 21, 2000 Debian GNU/Hurd A1 august 2000 UnixWare NSC 7.1.1+IP june 26, 2000 UnixWare 7.1.1 december 30, 1999 Unicos 10.0.0.6 june 1999 Linux 2.4.0 test 1 may 25, 2000 OS/390 Unix V2R6 september 25, 1998 Mk Linux DR3 july 31, 1998 Linux 2.0.36 november 15, 1998 Linux 2.1.32 april 5, 1997 AIX 4.2.1 april 25, 1997 BSD/OS 5.0 beta Solaris 8 1/01 (su3) february 20, 2001 Trusted Solaris 8 november 20, 2000 IRIX 6.5.7 february 10, 2000 NetBSD 1.5.2 september 14, 2001 OpenBSD 2.9 june 1, 2001 xMach current march 16, 2001 Mac OS X 10.0 (Cheetah) march 24, 2001 Solaris 8 10/00 (su2) october 2000 Mac OS X Server 1.2v3 october 27, 2000 Tru64 Unix V4.0G may 2000 IRIX 6.5.6 november 10, 1999 IRIX 6.5.5 august 6, 1999 GNU-Darwin january 17, 2001 QNX RTOS 6 january 18, 2001 Darwin 1.2.1 november 15, 2000 Mac OS X (beta) september 13, 2000 Solaris 8 6/00 (su1) june 2000 Tru64 Unix V5.0 august 12, 1999 IRIX 6.5.4 may 11, 1999 OpenServer 5.0.5a february 1999 Plan 9 r3 june 7, 2000 Unicos/mk 2.0 october 13, 1997 Unicos 10.0 november 19, 1997 OS/390 Unix V2R4 september 26, 1997 HP-UX 11.0 november 1997 Linux 2.0.28 january 14, 1997 2002 BSD/OS 4.3 february 14, 2002 NetBSD 1.5.1 july 11, 2001 2.11BSD patch 433 november 5, 2000 TrustedBSD beta OpenBSD 2.8 december 1, 2000 xMach DR 01 august 6, 2000 Darwin 1.1 may 15, 2000 Mac OS X (DP4) may 15, 2000 Mac OS X (DP3) february 14, 2000 Solaris 8 january 26, 2000 Mac OS X Server 1.2 january 14, 2000 Mac OS X Server 1.0.2 july 22, 1999 Trusted Solaris 7 november 2, 1999 IRIX 
6.5.2 november 17, 1998 NonStop-UX C51 december 8, 1998 FreeBSD 4.3 april 22, 2001 FreeBSD 4.1.1 september 27, 2000 FreeBSD 5.0 beta march 2000 TrustedBSD (announced) april 9, 2000 OpenBSD 2.7 june 15, 2000 Darwin 1.0 april 5, 2000 Mac OS X (DP2) november 10, 1999 Solaris 7, 11/99 november 1999 FreeBSD 4.2 november 21, 2000 FreeBSD 4.1 july 27, 2000 FreeBSD 4.0 march 14, 2000 OpenBSD 2.6 december 1, 1999 QNX/Neutrino 2.10 (QRTP) Darwin 0.3 august 16, 1999 Solaris 7, 8/99 august 1999 Solaris 7, 5/99 may 1999 Mac OS X Server 1.0 march 16, 1999 Tru64 Unix V4.0F february 1, 1999 NonStop-UX C41 november 14, 1997 2001 NetBSD 1.4.3 november 25, 2000 NetBSD 1.5 december 6, 2000 FreeBSD 3.4 FreeBSD 3.3 december 20, 1999 september 17, 1999 4.3BSD-Quasijarus0a october 10, 1999 2.11BSD patch 430 december 13, 1999 FreeBSD 3.2 may 18, 1999 Darwin 0.2 may 13, 1999 Mac OS X (DP1) may 10, 1999 Solaris 7, 3/99 march 1999 Trusted Solaris 2.5.1 september 1998 Digital Unix 4.0D december 1997 NonStop-UX C40 august 20, 1997 OpenServer 5.0.4 may 1997 GNU 0.2 (GNU/Hurd) june 12, 1997 Unicos 9.3 august 1997 OS/390 OpenEdition V1R3 march 28, 1997 HP-UX 10.30 july 1997 Mk Linux DR2.1 BSD/OS 4.2 (BSDI) november 29, 2000 NetBSD 1.4.2 march 19, 2000 NetBSD 1.4.1 august 26, 1999 NetBSD 1.4 may 12, 1999 FreeBSD 3.1 february 15, 1999 Darwin 0.1 march 16, 1999 Solaris 7 (SunOS 5.7) october 27, 1998 Rhapsody DR2 may, 1998 Rhapsody DR1 september, 1997 ReliantUnix 5.44 1997 Unicos/mk 1.6 july 21, 1997 Unicos/mk 1.4.1 march 3, 1997 Unicos 9.2 january 13, 1997 OS/390 OpenEdition V1R1 OS/390 OpenEdition V1R2 march 29, 1996 september 27, 1996 HP-UX 10.10 HP-UX 10.20 december 1995 june 1996 Mk Linux DR2 Mk Linux DR1 december 1996 1996 Linux 2.0.21 Linux 2.0 september june 9, 1996 20, 1996 Linux 1.3.100 Linux 2.1 may 10, 1996 september 30, 1996 Minix 1.7.2 march 1996 AIX 4.2 may 17, 1996 1999 BSD/OS 4.0.1 (BSDI) march 1, 1999 NetBSD 1.3.3 december 23, 1998 Lites OPENSTEP 4.0 july 22, 1996 UnixWare 2.1 
february 13, 1996 Linux 1.2.13 august 2, 1995 Linux 1.2 march 7, 1995 Linux 1.1.95 march 2, 1995 1998 xMach Solaris 2.6 (SunOS 5.6) august 1997 Solaris 2.5.1 (SunOS 5.5.1) may 1996 Mach 4 Mach 4 UK02p21 UK22 november 3, 1995 march 29, 1996 Digital Unix Digital Unix 4.0 Digital Unix 4.0A (DEC OSF/1 V4) 4.0B Sinix ReliantUnix 5.43 september 1996 may 1996 december 1996 1995 IRIX 6.2 IRIX 6.4 IRIX 6.3 march 1996 november 1996 september 1996 NonStop-UX Cxx february 1996 OpenServer 5.0.2 june 1996 Unicos-max 1.3 november 15, 1995 GNU 0.1 (GNU/Hurd) september 6, 1996 Unicos 9.0 september 21, 1995 Trusted Unicos 8.0 march 9, 1995 MVS/ESA OpenEdition SP5.2.1 MVS/ESA OpenEdition SP5.2.2 september 29, 1995 june 20, 1995 HP-UX 10.01 HP-UX 10.0 (S700/S800) may 1995 february 1995 HP-UX BLS 9.09+ december 1, 1994 4.4BSD Lite 2 OpenBSD 2.4 december 1, 1998 QNX/Neutrino 2.0 1998 QNX 4.25 QNX 4.24 Lites 1.1u3 march 30, 1996 Plan 9 r2 july 1995 Trusted IRIX/B 4.0.5 EPL february 6, 1995 UnixWare 2.0 Unix System V Release 4.2MP january 1995 MVS/ESA OE SP5.2.0 september 13, 1994 Linux 1.1.52 october 6, 1994 FreeBSD 3.0 october 16, 1998 4.3BSD-Quasijarus0 december 27, 1998 FreeBSD 2.2.8 november 29, 1998 FreeBSD 2.2.7 july 22, 1998 2.11BSD patch 400 january 1998 OpenBSD 2.2 december 1, 1997 QNX/Neutrino 1.0 1996 QNX 4.22 Digital Unix (DEC OSF/1 AXP) march 1995 IRIX 6.1 july 1995 NonStop-UX B32 june 12, 1995 OpenServer 5.0 may 9, 1995 Unicos-max 1.2 november 30, 1994 Chorus/MiX SVR4 Linux 1.0.6 april 3, 1994 HP-UX 9.05 april 19, 1994 Linux 1.1.0 april 6, 1994 AIX 3.2.5 october 15, 1993 NetBSD 1.3.1 march 9, 1998 NetBSD 1.3 january 4, 1998 FreeBSD 2.2 march 16, 1997 2.11BSD patch 300 february 1996 OpenBSD october 1995 Solaris 2.5 (SunOS 5.5) november 1995 NeXTSTEP 3.3 february 1995 Linux 1.0 march 14, 1994 BSD/OS 4.0 (BSDI) august 17, 1998 NetBSD 1.3.2 may 29, 1998 BSD/OS 3.1 (BSDI) december 10, 1997 NetBSD 1.2.1 may 20, 1997 NetBSD 1.2 october 4, 1996 NetBSD 1.1 november 26, 1995 FreeBSD 
2.1 november 19, 1995 4.4BSD Lite 2 june 1995 QNX 4.2 AOS Lite 1995 Lites 1.1 march 24, 1995 Lites 1.0 february 28, 1995 Sinix 5.42 IRIX 6.0 december 1994 NonStop-UX B31 november 1, 1994 Unicos 8.0 march 11, 1994 UNIX Interactive 4.1a june 1994 MVS/ESA OpenEdition SP5.1.0 june 24, 1994 HP-UX 9.04 (S800) november 17, 1993 HP-UX 9.03 december 16, 1993 Venix 4.2 Coherent 4.2 may 1993 Lites IRIX 5.3 december 1994 Unicos-max 1.1 june 10, 1994 UnixWare 1.1.1 Unix System V Release 4.2 1994 Chorus/MiX SVR4 Xinu Linux 0.99.15j march 2, 1994 FreeBSD 2.0.5 june 10, 1995 ArchBSD november 1994 Solaris 2.4 (SunOS 5.4) december 1994 Mach 4 UK02 july 20, 1994 IRIX 5.2 march 1994 SCO UNIX 3.2.4 (Open Desktop) 1994 Unicos-max 1.0 november 15, 1993 Dell Unix SVR4 Issue 2.2.1 1993 Linux 0.99.11 july 18, 1993 SunOS 4.1.4 (Solaris 1.1.2) september 1994 OSF/1.3 june 1994 Sinix 5.41 1993 IRIX 5.1 september 1993 UnixWare 1.1 Unix SVR4.2 may 18, 1993 Dynix/ptx 2.0.4 1993 MVS/ESA OpenEdition SP4.3.0 march 26, 1993 HP-UX 9.02 august 1993 Minix 1.5 december 1992 AIX PS/2 1.3 october 2, 1992 FreeBSD 2.0 november 22, 1994 386 BSD 1.0 12 november 1994 2.11BSD patch 200 december 1994 HPBSD QNX 4.1 1994 SunOS 4.1.3_U1 (Solaris 1.1.1) december 1993 Solaris 2.3 (sparc) (SunOS 5.3) november 1993 NeXTSTEP 3.2 october 1993 Solaris 2.1 (x86) IRIX 5.0 march 1993 NonStop-UX B22 november 22, 1993 Trusted XENIX 4.0 september 17, 1993 HP-UX 9.0 (S800) october 7, 1992 Dell Unix SVR4 Issue 2.2 1992 HP-UX BLS 8.04 (S800) H2 1991 Linux 0.95 Linux 0.12 march 8, 1992 january 16, 1992 HP-UX 9,01 HP-UX 8.07 (S300/S700) (S300/S700) november 2, 1992 november 21, 1991 BSD/OS 2.0.1 (BSDI) august 1995 Ultrix 4.5 november 1995 FreeBSD 1.1.5.1 july 1994 4.4BSD Lite 1 march 1, 1994 4.4BSD Encumbered june 1993 Solaris 2.2 (sparc) (SunOS 5.2) may 1993 NeXTSTEP 3.1 may 25, 1993 ASV (final release) august 1992 Unicos 7.0 october 29, 1992 Unix System V Release 4.1ES december 1992 Microport Unix SVR4.1 HP-UX 8.06 (S800) H2 1991 
1995 NetBSD 1.0 october 26, 1994 FreeBSD 1.1 may 1994 FreeBSD 1.0 december 1993 4.4BSD june 1, 1993 UnixWare 1 Unix System V Release 4.2 november 2, 1992 Chorus/MiX SVR4 1991 Unicos 6.0 february 14, 1991 Microport Unix SVR4.0 HP-UX 8.0 (S300/S800) march 27, 1991 Dell Unix SVR4.0 1990 AIX PS/2 & AIX/370 1.2.1 february 22, 1991 SunOS 4.1.3 (Solaris 1.1a) august 1992 Solaris 2.1 (SunOS 5.1) december 1992 NeXTSTEP 3.0 september 1992 Solaris 2.0 (x86) end 1992 Trusted XENIX 3.0 april 8, 1992 AMiX 2.2 Xinu HP-UX 7.08 (S800) H1 1991 Microport Unix SVR3.2 AIX PS/2 & AIX/370 1.2 march 30, 1990 Sinix 5.40 1992 IRIX 4.0.4 march 1992 IRIX 4.0 september 1991 GNU (GNU/Hurd) may 7, 1991 ASV (dev release) 1991 HP-UX 7.02 (S800) H2 1989 Linux 0.01 august 1, 1991 AIX PS/2 1.1 march 31, 1989 1994 BSD/OS 1.1 (BSDI) february 14, 1994 Ultrix 4.4 NetBSD 0.9 august 23, 1993 2.11BSD patch 100 january 1993 HPBSD 2.0 april 1993 Solaris 2.0 (sparc) (SunOS 5.0) july 1992 OSF/1 1992 Sinix 5.20 1990 Unicos 5.0 may 15, 1989 BOS 1989 Coherent 3.0 A/UX 1993 Ultrix 4.3A NetBSD 0.8 april 20, 1993 386 BSD 0.1 july 14, 1992 MIPS OS RISC/os 5 4.4BSD alpha june 1992 AOS Reno 1992 SunOS 4.1.2 (Solaris 1.0.1) december 1991 NeXTSTEP 2.1 march 25, 1991 Mach 3 Atari Unix 1989 HP-UX 3.1 feb. 1989 HP-UX 6.5 (S300) H1 1989 BSD/OS 1.0 (BSDI) Ultrix 4.3 386 BSD 0.0 february 1992 2.11BSD february 1992 NeXTSTEP 2.0 sept. 
18, 1990 Trusted XENIX 2.0 january 9, 1991 Plan 9 1990 AMiX 1.1 (Amiga Unix SVR4) 1990 UNIX System V/386 Release 4 Unicos 4.0 july 15, 1988 CTIX 3.2 UNIX Interactive 4.1 1988 HP-UX 2.0 HP-UX 2.1 (S800) (S800) march 1988 july 1988 OSF/1 1990 NonStop-UX B00 august 22, 1989 SCO UNIX System V/386 release 3, 1989 UNIX System V Release 4 1988 Chorus/MiX V3.2 1988 HP-UX 6.0 (S300) H2 1987 SunOS 4.1.1 (Solaris 1) november 1990 SunOS 4.1 march 1990 Mach 2.6 IRIX 3.0 june 10, 1988 SCO XENIX System V/386 release 2.3.4 june 1989 UNIX System V/386 Release 3.2 CTIX 3.0 HP-UX 5.2 (S300) H2 1987 1992 BSD/386 0.3.2 (BSDI) february 28, 1992 RISC iX 1.21 1991 mt Xinu mach 2.6 QNX 4.0 1990 SunOS 4.0.3 may 1989 NeXTSTEP 1.0 september 18, 1989 UNIX Time-Sharing System Tenth Edition (V10) october 1989 Sinix 2.1 1988 IRIX 2.0 NonStop-UX november 18, 1987 april 10, 1987 SCO XENIX System V/386 october 1987 IRIS GL2 6.0 1987 Unicos 3.0 september 25, 1987 Microport Unix V/386 september 1987 HP-UX 1.1 HP-UX 1.2 (S800) (S800) august 17, 1987 nov. 
16, 1987 Minix Open Systems january 26, 2013 © Éric Lévénez 1998-2013 <http://www.levenez.com/unix/> 1991 BSD Net/2 (4.3BSD Lite) june 1991 4.3BSD Reno june 1990 QNX 2.21 SunOS 4.0 1989 Mach 2.5 1988 Sinix 2.0 1987 UNIX System V Release 3.2 1987 CTIX/386 Microport Unix SV/AT january 1986 IBM IX/370 1985 HP-UX 2.1 (S200) H1 1985 Locus 1983 UCLA Locus 1981 SunOS 3.5 1988 NeXTSTEP 0.8 october 12, 1988 UNIX Time-Sharing System Ninth Edition (V9) september 1986 IRIS GL2 4.0 march 1986 UNIX System V/286 1985 UNIX System V Release 2 april 1984 TS 5.0 1982 SPIX 1982 UCLA Locally Cooperating Unix Systems 1980 IBM AOS 1988 Mach 2.0 UNIX Time-Sharing System Eighth Edition (V8) february 1985 IRIS GL2 1.0 1983 UNIX System V january 1983 UNIX System IV 1982 Dynix 1984 TS 3.0.1 1980 Interactive IS/1 Eunice 4.3 1987 BRL Unix (4.3BSD) 1986 Mach 1985 XENIX 3.0 april 1983 XENIX 2.3 UNIX System III november 1981 TS 3.0 1979 PC/IX RT 1.0 1977 Ultrix 4.2A 2.10.1BSD january 1989 MIPS OS RISC/os 4 Eunice 4.2 1985 SunOS 1.2 january 1985 Plurix 1982 Sinix XENIX OS august 25, 1980 CB UNIX 3 USG 3.0 USG 2.0 TS 2.0 1978 TS 1.0 1977 MERT 1974 4.3BSD Tahoe june 1988 HPBSD 1.0 april 1988 HPBSD 1987 QNX 2.0 QNX 1.0 1984 Eunice 2 1982 UNIX Time-Sharing System Seventh Edition (V7) january 1979 IRIX 1986 PWB/UNIX 1974 mt Xinu mach386 more/BSD december 1988 2.10BSD april 1987 4.3BSD june 1986 4.2BSD september 1983 4.1cBSD december 1982 QUNIX 1981 The Wollongong Group Eunice (Edition 7) 1980 UNSW 04 november 1979 Mini Unix may 1977 UNIX Time-Sharing System First Edition (V1) november 3, 1971 1990 Acorn RISC iX 1989 BSD Net/1 november 1988 mt Xinu (4.3BSD) 2.9BSD-Seismo august 1985 2.9.1BSD november 1983 MIPS OS RISC/os 4.0BSD october 1980 3BSD march 1980 UCLA Secure Unix 1979 SRI Eunice UNSW UNICS september 1969 Minix 3 V3.1.6 february 8, 2010 AIX 6.1 TL5 april 2010 HP-UX 11.31/11iv3 Update 7 (1009) september 2010 Minix 3 V3.1.7 june 16, 2010 AIX 7 open beta july 14, 2010 AIX 5L v5.3 TL11 october 2009 
1 · Toolbox basis

1.1 Why use Unix?

Why GNU/Linux
• It’s for free. Imagine you had to pay for the software you use...
• It’s free. Free and Open Source Software (FOSS) ⇒ allows for a look under the hood, and even distribution of your own modifications.
• It gives you freedom. If there’s only FOSS on your device, you’re in control of what it does. Otherwise, you’re not!

Getting GNU/Linux
• Debian: http://www.debian.org/
• Arch Linux: http://www.archlinux.org/
• Ubuntu: http://www.ubuntu.com/
• Linux Mint: http://linuxmint.com/
• Fedora: http://fedoraproject.org/
• many more: http://distrowatch.com/dwres.php?resource=major

“Live systems” are not an appropriate substitute: poor performance, no persistence, not extensible, ...

Without Linux you’re pretty much doomed

Workstations are available in the pools:
• V304: a few Linux boxes, some Macs, and some empty space to work with your own machines.
• Y326: about 4 PCs (rear left); the rest of the lot belongs to the economists (FB WiWi).
Log in with your RZ-Account⁵.

For those with an Apple
• Mac OS X seems to be suitable for this lecture.
• You will encounter subtle deviations, i.e., some commands will not behave as shown in this lecture.
• All the software we use should be available for you, too.

Remote login to one of our compute servers (cf. section 1.3).
⁵ i.e., user.name as in your email address @uni..., and the corresponding password

1.2 What the shell...

What is the shell?
• A command interpreter: it reads the commands you type in, and executes them. (...um, yes, it does look somewhat like MS-DOS)

Why use this arcane style of user interaction?
• It’s a flexible, versatile, and powerful tool,
• it’s available on all unices,
• it can be combined easily with other tools,
• it can be used to easily combine other tools,
• and it works locally as well as remotely.

Graphical User Interfaces (GUIs) abstract from fundamental concepts, making the most common tasks easy, and everything else impossible.

Showroom

Try this using your OS’s standard GUI tools:

• Find the top ten words in a file:

      # extract the words, sort them ignoring case, count duplicates,
      # sort by count (highest first), and keep the first ten lines
      grep -o -E '\w+' bigFile.txt | sort -f | uniq -ci | sort -rn | head -10

• For all files in this folder, change the suffix from jpg to jpeg:

      # turn each name ending in jpg into an mv command, and let sh run these commands
      ls | sed -n 's/^\(.*\)jpg$/mv & \1jpeg/p' | sh

Basically, in the shell you organise the way in which a bunch of small tools cooperate to get the job done.

The aim of this course is to give you
• an introduction to some of these small tools,
• the knowledge of how to combine them, and
• an idea about where to find tools yet unknown to you.

Getting a shell

Linux & Mac
There are various terminal emulators available. On Linux, xterm is most commonly installed. On Mac OS X, look for the “Terminal” application.

What you see inside the window is the shell. The window itself is a terminal emulator. More on that later.

    sk@phobos90:~$

Windows Users
• For the long term, you need to toss Windows, and install Linux. Seriously!
    • But better not fry your only running system during the semester.
• Until then, you may try to
    • establish a remote session with our compute servers (cf. section 1.3),
    • or install a virtual machine⁶ running Linux.
    • Emulations (e.g., Cygwin) are not recommended.

⁶ e.g., VirtualBox, http://www.virtualbox.org/

1.3 Remote login to titan07

If you suffer from having Windows installed, there’s help:
• Use the compute server titan07.inf.uni-konstanz.de,
• log in via Secure Shell (SSH):
    user name: the pop number⁷ associated with your RZ-account
    password: the password you use for email.

PuTTY⁸ is a free SSH client for Windows. A series of screenshots demonstrating how to log in is in the lecture’s public repository⁹.

⁷ https://www.rz.uni-konstanz.de/angebote/e-mail/usermanager/
⁸ http://www.chiark.greenend.org.uk/~sgtatham/putty/
⁹ https://svn.uni-konstanz.de/dbis/sq_15w/pub/putty.zip

Secure Shell from the Command Line

Of course, you can also use SSH from Linux & Mac OS.

ssh user@host
    Log in to host as user, and run the default shell.

    sk@verne:~$ ssh pop09951@titan07.inf.uni-konstanz.de    # on my machine
    The authenticity of host 'titan07.inf.uni-konstanz.de (134.34.224.26)' can't be established.
    ECDSA key fingerprint is 16:7e:fc:e6:bb:9d:f7:e8:bd:4c:4b:f6:66:bc:27:9d.
    Are you sure you want to continue connecting (yes/no)? yes    # only once
    Warning: Permanently added 'titan07.inf.uni-konstanz.de' (ECDSA) to the list of known hosts.
    Password:                 # RZ-password that came with your mail account
    Last login: Thu Jan 24 13:57:37 2013 from verne.inf.uni-konstanz.de
    pop09951@titan07 ~ $      # now I’m working on titan07

• You should check that the shown key fingerprint appears in the provided¹⁰ listing.
• We will return to SSH later in this course.
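
To compare a key against a fingerprint listing, it can help to print a fingerprint yourself. The following is only a sketch for illustration: it creates a throwaway key pair in a temporary directory (the file name demo_key is made up for this example) and prints its fingerprint with ssh-keygen, first in the default format of the installed OpenSSH version, then in the colon-separated MD5 format that appears in the session shown here. It assumes a reasonably recent OpenSSH; very old versions do not know the -E option.

```shell
# Throwaway key pair, only to have a key whose fingerprint we can look at.
# The name demo_key is an arbitrary choice for this sketch.
tmp=$(mktemp -d)
ssh-keygen -q -t ecdsa -N '' -f "$tmp/demo_key"

# Fingerprint in the default format of the installed OpenSSH version.
ssh-keygen -l -f "$tmp/demo_key.pub"

# Fingerprint as an MD5 hash, printed as colon-separated hex pairs.
fp_md5=$(ssh-keygen -l -E md5 -f "$tmp/demo_key.pub")
echo "$fp_md5"

rm -r "$tmp"    # clean up the throwaway key
```

For a real server, the fingerprint that ssh shows on the first connection is the one to compare against a listing obtained through a trusted channel.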
¹⁰ https://svn.uni-konstanz.de/dbis/sq_15w/pub/titan07-fingerprints

2 · First steps

2.1 Anatomy of a shell

There are many shells. We use the GNU Bourne-Again SHell: bash.

    pop09951@titan07 ~ $

• You should see a cursor; its appearance may vary, sometimes it even blinks.
• The text to the left of the cursor is the shell prompt.
    • It may vary between hosts; it may be as modest as a plain $.
    • On titan07 it is: user@host ~ $.
• Type in a command. You’ll learn a lot of commands this term...

    pop09951@titan07 ~ $ echo hello shell
    hello shell
    pop09951@titan07 ~ $

Basic work cycle
The shell prompts you → You type in something → The shell runs it → You see the results, and another prompt.

What is a command?

The shell needs to interpret your input to find out what to do. For simple commands¹¹, the procedure is:
1. The shell splits your input into words.
2. The first word specifies the command to be executed.
3. The remaining words are passed as arguments to the invoked command.

Review the example:

    pop09951@titan07 ~ $ echo hello shell
    hello shell
    pop09951@titan07 ~ $

• What is the command being run?
• What are the arguments?
• What does the command do with its arguments?

Note: The splitting into words can be quite tricky. Avoid whitespace in file names! (more on that later)

¹¹ We will discuss not-so-simple commands later in this course.

2.2 What’s around us?

• Unix principle: Everything is a file.
• Files are organized in directories, which form a hierarchical structure.
• The root directory is denoted by a slash: /.
• A file is uniquely identified by its path from the root, e.g., /usr/bin/ssh.
• The shell has a current working directory; the command pwd prints its name.

    pop09951@titan07 ~ $ pwd
    /home/pop09951

  Initial location: your home directory.

[Figure: part of the directory tree below the root /, with top-level directories such as bin/, etc/, home/, and usr/, and some of their entries.]

2.3 Looking and moving around

Looking around

ls [-l] [-a] [name...]
    List the current directory, or names if given. Use -l to get the long listing, -a to see all files.

• Arguments are passed to a command by typing them right behind the command name, separated by spaces.
• It is common style in Unix documentation to mark optional arguments with brackets [·], and an ellipsis ... indicates optional repetition.

    pop09951@titan07 ~ $ ls       # No output ⇒ nothing here?
    pop09951@titan07 ~ $ ls -a
    .  ..  .bash_history  .bash_logout  .k5login  .mateconf  .profile  .ssh

• By convention, a file whose name starts with a dot is not listed. ⇒ dot-file is Unix jargon for “hidden file”.

Which of the listed items are files, which are directories?

Getting more details

In the long listing, the first character of each line gives the type (d = directory, - = regular file), and the number before the date is the size in bytes.

    pop09951@titan07 ~ $ ls -a -l    # short options may be combined, e.g., ls -la
    total 32                         # 1k-blocks allocated in this directory by listed items
    drwx------  4 pop09951 domain_users 4096 Feb 11 11:31 .
    drwxr-xr-x 22 root     root         4096 Jan 31 09:38 ..
    -rw-------  1 pop09951 domain_users  993 Feb 11 11:32 .bash_history
    -rw-------  1 pop09951 domain_users  220 Apr  3  2012 .bash_logout
    -rw-r--r--  1 pop09951 domain_users   54 Feb 11 11:31 .k5login
    drwx------  2 pop09951 domain_users 4096 Feb 11 12:29 .mateconf
    -rw-------  1 pop09951 domain_users  675 Apr  3  2012 .profile
    drwxr-xr-x  2 pop09951 domain_users 4096 Feb 11 11:32 .ssh

The date column shows the time of last modification, and the last column the file name, e.g.:

    drwxr-xr-x  2 pop09951 domain_users 4096 Feb 11 11:32 .ssh

Moving around

Why move at all?
• Unless told otherwise, the commands you issue always work on the current working directory, e.g., ls lists only the current directory.

pwd
    Print working directory.

cd [dir]
    Change directory to dir, or to home directory if dir is omitted.

• If you change the directory, you’ll see the prompt change as well. The tilde ~ is a common abbreviation for your home directory.

    pop09951@titan07 ~ $ cd /etc            # switch to another directory
    pop09951@titan07 /etc $ pwd             # you run pwd in the directory /etc
    /etc                                    # response printed by pwd
    pop09951@titan07 /etc $ cd schluargl    # it’s an error if the directory does not exist...
    -bash: cd: schluargl: No such file or directory
    pop09951@titan07 /etc $ cd              # ...and we do not move
    pop09951@titan07 ~ $                    # no argument, so go back home

⇒ The default prompt shows the shell’s working directory.

Paths can be absolute (leading slash /), or relative to the current directory (without leading slash).

    pop09951@titan07 whatever $ cd /usr         # an absolute path
    pop09951@titan07 /usr $ cd local/bin        # from /usr go to local/bin
    pop09951@titan07 /usr/local/bin $

Every directory contains two extra entries: . refers to the directory it appears in, and .. refers to the respective parent directory.
    pop09951@titan07 /usr/local/bin $ cd ..          # go to parent directory
    pop09951@titan07 /usr/local $ cd ../share        # go to sibling named share
    pop09951@titan07 /usr/share $ cd .               # stay in the current directory
    pop09951@titan07 /usr/share $ cd ../lib/../bin   # intermediate dirs must exist
    pop09951@titan07 /usr/bin $

• Two more commands:

mkdir dir
    Make directory dir.
rmdir dir
    Remove directory dir; fails if dir is not empty.

2.4 Editing files

• There are a lot of text editors available. You may choose any editor you like if it is suitable for plain text. Microsoft Word is not! We’ll discuss later what plain text actually means.
• I’ll show you nano for a quick start, and more complex editors later in the lecture.

nano [file]
    Run nano and open file, or an empty buffer if file is omitted.

• The prompt disappears, you’re inside nano now. Type some text...
• At the bottom, some key bindings are shown. Type ^O (i.e., Ctrl-O) to save the file, name it “greeting”. Type ^X to end nano.

2.5 Managing files

rm [-r] file...
    Remove files.
    • Recursively remove all subdirectories, if -r is given.

cp [-r] source dest
    Copy source to dest.
    • If dest is a directory, copy into it.
    • If source is not a directory, and a file dest exists, overwrite it.

mv source dest
    Move source to dest.
    • If dest is a directory, move into it.
    • If dest does not exist, rename source to dest.
    • If source is not a directory, and a file dest exists, replace it.

Watch out: Use all these commands with extreme care. There’s no safety net, i.e., no “trashbin”, and no “undo”. The shell is a pretty good place to lose data.
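The rules for mkdir, rmdir, cp, mv, and rm listed above can be tried without risk in a throwaway directory. The following is only a minimal sketch, assuming a POSIX shell with mktemp available; the names scratch, notes, backup, and greeting.txt are made up for illustration and do not appear in the lecture.

```shell
# Work in a fresh scratch directory so nothing important can get lost.
scratch=$(mktemp -d)
cd "$scratch"

mkdir notes                        # make a directory
printf 'hello world\n' > greeting  # create a small file without an editor

cp greeting notes                  # dest is a directory: copy into it
mv greeting greeting.txt           # dest does not exist: rename
cp greeting.txt backup             # dest does not exist: plain copy
mv backup notes/greeting           # a file notes/greeting exists: it is replaced

rmdir notes 2>/dev/null || echo 'rmdir refuses: notes is not empty'
rm -r notes                        # removes notes and everything below it
ls                                 # only greeting.txt is left
```

If in doubt, rm, cp, and mv all accept the option -i, which asks for confirmation before a file is removed or overwritten.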
2.6 Looking into files

So here’s our new file:

    pop09951@titan07 ~ $ ls -l greeting
    -rw------- 1 pop09951 domain_users 12 Jan 31 21:03 greeting

file file...
    What kind of data is in the file?
cat file...
    Concatenate files and print their contents.
less file
    Show file and allow scrolling and searching. Press h for help.
    (About the name: less is more, improved. more is the traditional Unix file browser.)

    pop09951@titan07 ~ $ file greeting
    greeting: ASCII text
    pop09951@titan07 ~ $ cat greeting
    hello world
    pop09951@titan07 ~ $ less /usr/share/games/fortunes/literature

Exercise: Play with these commands.

2.7 Getting help

Most important section!

There are various help systems around on a Unix system:

man     The standard means of documentation on Unix: manual pages.
help    Documentation of shell builtins.
info    Arcane hypertext format, commonly used by GNU projects.

And there’s the web, of course.

Unix manual pages

• Manual pages (aka. man-pages) are organized in sections.
• Different pages with the same name may exist in different sections.
• Common notation to reference a manual page: name(section).

man [section] name
    Show the man-page about name (i.e., program, utility, or function). Limit search to a section, or show first match.
whatis name
    Shows header of named manual pages across sections.
apropos keyword
    Search manual page descriptions for keyword.

A list of sections is available in man(1), the man-page about man.

• Usually the cursor keys and page keys should work for scrolling. Otherwise, use f (b) to scroll forward (backward).
• Press h for help, and q to quit.
Exercise: Check out nano(1), and the pages of other commands you have seen so far. (No need to read’em all, for now.)

Help from the shell

    $ man cd               # There might be no man-page for the cd command.
    No manual entry for cd

The command cd is a shell builtin, i.e., part of the shell. So it may not have its own manual page (on some systems it has).

type [name...]
    For each name show whether it’s a shell builtin.
help [builtin...]
    Display information about builtin commands.

    $ type cd nano
    cd is a shell builtin              # so this is part of the shell
    nano is /usr/bin/nano              # path to program to be run
    $ help cd                          # I have pruned the output a little
    cd: cd [-L|[-P [-e]]] [dir]
        Change the shell working directory.

Being part of the shell, cd (and all other builtins) are also documented in bash(1).

The info system

The info system is accessed through an interactive hypertext browser [12].

info [item...]
    Run the info reader, and display item from the menu, or start at the directory node, which gives a menu of major topics.

Again, info is controlled with keystrokes. Most important:

h         displays a list of key bindings.
H         brings up the info manual.
Tab       Jump to the next link.
Return    Follow the link under the cursor.
q         Quit.

Exercise: There’s an item on “nano”. Go, have a look.

[12] The roots of this system predate the success of HTML.

2.8 Useful keystrokes

You’ll be typing a lot on the shell. Here’s how you can be even faster:

Up/Down   Scroll backward/forward in the history of commands that you have used recently.
C-r       Start a reverse incremental search in the history.
          Type ahead for searching, or type C-r to search for the same pattern further backwards, Return to run the line displayed, or Left/Right to edit the line.
Tab       Word completion tries to complete a command if in the first word, or a filename thereafter. Type Tab twice to see a list of possible completions.
C-c       Cancel the current line without executing it, or interrupt the running process.
C-s/C-q   Flow control used to stop/resume terminal output. If your terminal “hangs”, try C-q first.

3 Subversion

3.1 Subversion... as in coup d’état?

• Subversion (SVN) is a version control system. As such it provides means
    • to keep track of changes between versions of a project,
    • to allow people to concurrently edit the project,
    • to resolve conflicting edits, and
    • to revert to earlier versions of a project.
  ⇒ SVN keeps a log of all changes!
• We consider this lecture a project. I do the slides, you solve the exercises, and the tutors revise them.
• SVN offers sufficient access control to isolate parts of the project.
    • Everybody can read the public directory. That’s where the lecture slides and assignments are published.
    • Each group of students has read & write access to their subdirectory.
    • The tutors have read & write access to all groups’ directories, and probably some other rights.
    • My boss and I have global access to read & write.

Basic SVN work cycle

1. Check out a working copy from the central repository.
2. Edit as you like, maybe add further files to the project.
    • If you screwed up, revert to the version you have checked out.
3. Update your working copy to reflect changes others have committed in the meantime.
    • SVN tries to merge new changes into your working copy.
    • You need to resolve conflicts where SVN fails to guess right.
4. Commit your changes to the central repository.
5. goto 2 (no need to check out again)

Note: SVN can only merge edits in plain text data. It cannot trace changes in binaries (images, PDF or “office” documents, compiled programs).

• So do not add binaries to the repository if not absolutely necessary. (Although SVN does have space-efficient binary-diff storage.)
• It is better to add the source that generates the binaries.
  ⇒ C sources instead of compiled programs!

Important advice

• Do not add files unrelated to this course.
• Only add the files asked for in the assignment.
• Do not add generated or downloaded data [13].

It is extremely difficult to remove data from the repository history, once it has been added.

[13] Better add information about how to generate, or where to find the data.

SVN’s command line interface

SVN’s understanding of arguments differs from traditional Unix tools:

• SVN can perform a bunch of different tasks.
• Instead of providing a separate program for each of them, there’s only one svn program. (Well, on the client side.)
• Its first argument decides what task to perform.

svn subcommand [argument...]
    Execute one of svn’s many subcommands with the appropriate arguments.
svn help [subcommand]
    Display a list of subcommands, or help for the one specified.

3.2 Checkout: svn co

svn co url
    Get a working copy from the repository at url. [14]

To check out the “public” [15] part of the lecture’s repository:

• Due to magic, it is the same URL as on page 6.
• I’d suggest doing all this in a dedicated subdir, e.g., ~/sq_15w.

    sq_15w $ svn co --username your.name https://svn.uni-konstanz.de/dbis/sq_15w/pub/
    Authentication realm: <https://svn.uni-konstanz.de:443> Uni Konstanz Subversion Repository
    Password for 'your.name': ****               # the password for [email protected]
    # ... a warning message about storing passwords unencrypted ...
    Store password unencrypted (yes/no)? yes     # your choice
    # ... list of what's being checked out, may be empty ...
    Checked out revision 73.                     # number may differ
    sq_15w $ ls
    pub
    sq_15w $ ls pub/
    # ... you should see lecture slides here

[14] http://svnbook.red-bean.com/en/1.7/svn.tour.initial.html
[15] Due to the current setup, authentication is required.

• To check out your group’s (say foobar) private directory [16]:

    sq_15w $ svn co --username your.name \
    > https://svn.uni-konstanz.de/dbis/sq_15w/group/foobar
    # You will not be asked for credentials, if you have stored them in the previous step
    # ... list of what’s being checked out, may be empty ...
    $ ls
    foobar pub

• You now have working copies of two different subdirectories of the repository https://svn.uni-konstanz.de/dbis/sq_15w:

    pub     is a copy of ^/pub/
    foobar  is a copy of ^/group/foobar/

  (Where ^ is an abbreviation for the repository’s location.)

[16] You’ll learn your group’s name from assignment 1, cf. page 6.

3.3 Query status / Add files: svn st / svn add

svn st
    List files with changes, and new files not yet under SVN’s control. No output ⇒ nothing to report. [17]
svn add file...
    Schedule files for addition to the repository. [18]

• SVN does not automatically take care of files you create in a working copy. You need to tell SVN which files to add to the repository.
    sq_15w $ cd foobar                    # in your group's directory
    sq_15w/foobar $ nano newfile          # create a new file
    sq_15w/foobar $ svn st
    ?       newfile                       # huh? SVN does not yet handle this file
    sq_15w/foobar $ svn add newfile
    A         newfile                     # scheduling this file for Addition
    sq_15w/foobar $ svn st
    A       newfile                       # SVN is planning to Add this file to the repository

[17] http://svnbook.red-bean.com/en/1.7/svn.tour.cycle.html#svn.tour.cycle.examine
[18] http://svnbook.red-bean.com/en/1.7/svn.tour.cycle.html#svn.tour.cycle.edit

3.4 Commit changes: svn ci

svn ci -m 'message' [path...]
    Commit (aka. check in) all changes, providing a log message. The commit can be limited to certain paths. [19]

• When satisfied with your changes (e.g., adding a file), you need to commit them to the repository.
• You have to provide a log message. Without -m, an editor (probably nano) will be launched where you can enter a message.

    sq_15w/foobar $ svn st
    A       newfile                               # SVN is planning to Add this file to the repository
    sq_15w/foobar $ svn ci -m 'blah blah blah'    # use concise messages
    Adding         newfile                        # actually adding this file
    Transmitting file data .
    Committed revision 14.
    sq_15w/foobar $ svn st                        # no output

[19] http://svnbook.red-bean.com/en/1.7/svn.tour.cycle.html#svn.tour.cycle.commit

3.5 Updating your working copy: svn up

• You should regularly update your working copies to receive any changes committed by your fellow students, tutors, or lecturers.

svn up [path...]
    Update the working copies in each path, or the current working directory if no path is given. [20]

    sq_15w $ svn up pub
    At revision 16.                  # no new lecture slides in pub.
    sq_15w $ svn up foobar
    A    foobar/greeting             # greeting has been Added.
    Updated to revision 16.
    sq_15w $ ls foobar
    greeting newfile

• How did greeting get there? Probably someone in your group was a bit faster with the homework.

[20] http://svnbook.red-bean.com/en/1.7/svn.tour.cycle.html#svn.tour.cycle.update

3.6 Show modifications: svn di

svn di [file...]
    Show the differences in the given files, or all files. [21]

• When editing, svn st tells you which files are new or have changes; svn di lists the differences, in a format known as unified diff.

    sq_15w/foobar $ nano greeting      # change the file your fellow has checked in
    sq_15w/foobar $ svn st
    M       greeting                   # this file has local Modifications
    sq_15w/foobar $ svn di greeting
    Index: greeting
    ===================================================================
    --- greeting    (revision 95)      # working copy was last updated to rev 95
    +++ greeting    (working copy)
    @@ -1 +1 @@
    -hello                             # this line has been removed
    +hello world                       # this line has been added

• You may commit modifications with svn ci, cf. section 3.4.

[21] http://svnbook.red-bean.com/en/1.7/svn.tour.cycle.html#svn.tour.cycle.examine

3.7 Undo your changes: svn revert

svn revert file...
    Revert changes on file made since the last update or checkout. [22]

• If you are unhappy with your modifications, you may revert to the version of your last update.

    sq_15w/foobar $ svn revert greeting
    Reverted 'greeting'
    sq_15w/foobar $ svn st             # no output

• The svn merge command [23] even allows you to “undo” previously committed revisions. See the manual, this is rather advanced!
[22] http://svnbook.red-bean.com/en/1.7/svn.tour.cycle.html#svn.tour.cycle.revert
[23] http://svnbook.red-bean.com/en/1.7/svn.branchmerge.basicmerging.html

3.8 When things go awry

• There are many possibilities where conflicts can arise.
    • Two people edit the same region of a file in different ways, so that SVN cannot merge the edits.
    • You try to add a file that someone has added after your last checkout.
    • ...
• If your working copy of modified files is out of date, then svn ci will fail. You need to update first.
• The update may fail, leaving you enough information to sort out the mess.

Note: Resolution of tree conflicts (cf. “A tree conflict” later in this section) has changed significantly since Subversion 1.5. A rather complex example [24] can be found in the manual.

[24] http://svnbook.red-bean.com/en/1.7/svn.tour.treeconflicts.html

A text conflict

Text conflicts arise when SVN cannot merge concurrent edits that happened to the same region of a file.

• Assume you have edited a file data which is already part of the repository:

    sq_15w/foobar $ svn st
    M       data             # There are local Modifications, waiting to be committed

• But when you try to commit, something strange may happen:

    sq_15w/foobar $ svn ci -m 'better now'
    Sending        data
    svn: E155011: Commit failed (details follow):
    svn: E155011: File '/home/sk/sq_15w/foobar/data' is out of date    # local path
    svn: E160028: File '/group/foobar/data' is out of date             # path in repos

    • The local working copy of data is out of date, i.e., someone else has committed changes of data after your last update.
    • SVN has failed to automatically merge your edits with those already committed.

• So you first have to update your working copy.
• Let’s do so:

    sq_15w/foobar $ svn up
    Updating '.':
    C    data                          # this file contains the conflict
    Updated to revision 3.
    Conflict discovered in file 'data'.
    Select: (p) postpone, (df) show diff, (e) edit file, (m) merge,
            (mc) my side of conflict, (tc) their side of conflict,
            (s) show all options: p    # see the manual for the other options
    Summary of conflicts:
      Text conflicts: 1                # there is one text conflict

• The update creates a bunch of new files:

    sq_15w/foobar $ svn st
    C       data         # merged version of data, contains conflict markers
    ?       data.mine    # backup of your version of data
    ?       data.r2      # copy of revision 2 of data, i.e., before your edits
    ?       data.r3      # copy of revision 3 of data, i.e., the one up to date
    Summary of conflicts:
      Text conflicts: 1

• Use an editor to fix data, using the other files for reference:

    $ nano data

Resolving conflicts: svn resolve

svn resolve --accept=working file...
    Use the current version of files to resolve conflicts. See the manual for other values than working to use with --accept. [25]

• You cannot commit the working copy until the conflict is resolved:

    sq_15w/foobar $ svn ci -m 'better now'
    svn: E155015: Commit failed (details follow):
    svn: E155015: Aborting commit: 'sq_15w/foobar/data' remains in conflict

• You need to tell SVN that you are done with resolving the conflict:

    sq_15w/foobar $ svn resolve --accept=working data    # tell SVN: conflict is resolved
    Resolved conflicted state of 'data'
    sq_15w/foobar $ svn st
    M       data                                         # no conflict any more, but uncommitted modifications
    sq_15w/foobar $ svn ci -m 'merged my edits'          # try to commit again, may fail again
    Sending        data
    Transmitting file data .
    Committed revision 4.
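Before telling SVN that a text conflict is resolved, it can be worth checking that no conflict markers are left in the edited file. The following is only a sketch, assuming the marker layout described in the Subversion 1.7 manual (<<<<<<< .mine, =======, >>>>>>> .rN). The contents of data are invented and written by hand here, so the snippet runs without any repository.

```shell
# Recreate what a conflicted file may look like after 'svn up' (contents are made up).
cd "$(mktemp -d)"
cat > data <<'EOF'
first line
<<<<<<< .mine
my edit
=======
their edit
>>>>>>> .r3
last line
EOF

# List the remaining marker lines together with their line numbers.
grep -n -E '^(<<<<<<< |=======$|>>>>>>> )' data

# After fixing the file by hand, the same check should find nothing.
printf 'first line\nmerged edit\nlast line\n' > data
if grep -q -E '^(<<<<<<< |=======$|>>>>>>> )' data; then
    echo 'markers left, keep editing'
else
    echo 'clean, ready for: svn resolve --accept=working data'
fi
```

The pattern only looks at the start of a line, so text that merely contains a run of equals signs in the middle of a line is not reported.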
[25] http://svnbook.red-bean.com/en/1.7/svn.tour.cycle.html#svn.tour.cycle.resolve

A tree conflict

Tree conflicts arise when SVN cannot merge concurrent changes in the directory structure.

• Assume you have created file1, but not (yet?) scheduled it for addition to the repository.

    sq_15w/foobar $ svn st
    ?       file1            # SVN does not care about this file

• But when you try to update, something strange may happen again:

    sq_15w/foobar $ svn up
    Updating '.':
       C file1
    At revision 4.
    Tree conflict on 'file1'
       > local file unversioned, incoming file add upon update
    Select: (r) mark resolved, (p) postpone, (q) quit resolution, (h) help: p    # again, we have typed p here
    Summary of conflicts:
      Tree conflicts: 1

    • Someone else has added a file1 after your last update.

Here we can choose one of two options:

• Option 1: revert file1. This will toss your version of the file, and replace it with the one already committed.
    • You revert to the version where file1 did not exist, then it’s added by the update.

    sq_15w/foobar $ svn revert file1
    sq_15w/foobar $ svn st           # No output, conflict disappeared
    sq_15w/foobar $ ls
    file1                            # this is the file that came from the repository

• Option 2: Resolve by accepting the state of the working copy. This will schedule file1 for deletion.
    • Note that this is consistent: In your working copy, file1 is not part of the repository.

    sq_15w/foobar $ svn resolve --accept=working file1
    Resolved conflicted state of 'file1'
    sq_15w/foobar $ svn st
    D       file1                    # Note the D: scheduled for deletion.
    sq_15w/foobar $ svn ci -m'file1 should not be checked in'
    Deleting       file1
    Committed revision 9.
3.9 Tree changes: svn rm/mv/cp

You must not (re)move files under SVN’s control using rm or mv.

• SVN will report moved files as missing, since it cannot trace the OS’s own mv command.
• Use SVN’s subcommands to (re)move or copy files [26]:

svn rm file...
    Mark files for removal from the repository.
svn mv src dst
    Rename/move files under svn’s control.
svn cp src dst
    Copy files under svn’s control.

• Conflicting operations on the directory structure may also lead to conflicts on update.

[26] http://svnbook.red-bean.com/en/1.7/svn.tour.cycle.html#svn.tour.cycle.edit

4 Scientific text processing with LaTeX

4.1 Background

You might come across the following soon:

    Math exercise sheet: The solution is to be handed in as a PDF file generated from LaTeX by Friday...

• LaTeX is the document preparation system in the natural sciences, mathematics, and computer science.
    • Scientific publications are likely to be prepared using LaTeX. You may even have to use it (publisher, co-authors).
    • You’ll be asked to prepare your thesis using LaTeX.
    • We use LaTeX a lot ourselves.
    • Sometimes, it is quite a nuisance.
• Being a plain text format (some say: programming language), the source code is accessible to the Unix toolbox you’re here for.
• It produces high quality output, especially when it comes to math. For example, the sum of i for i = 1, ..., n, which equals n·(n+1)/2, is typeset from this source:

    \sum_{i=1}^ni = \frac{n\cdot(n+1)}2

4.2 Getting LaTeX

TeX Live is a TeX distribution, likely to contain all packages and add-ons you’ll ever need.

• Most GNU/Linux distributions should offer TeX Live through their package management system.
• TeX Live is installed ready-to-use on titan07.
• Downloading the TeX Live distribution [27] is probably the next easiest way to use LaTeX on almost any platform.

Is it TeX or LaTeX?

• TeX is the core language and the engine of the text processing system. LaTeX is a macro package for TeX.
• Since plain TeX sucks, everybody uses LaTeX, which sucks less.
• You can ignore the fact that there’s TeX, until you encounter a problem that must be solved in TeX...

[27] http://www.tug.org/texlive/

4.3 LaTeX basics

    1  % first.tex - A LaTeX example!
    2  \documentclass{article}
    3  \begin{document}
    4  Hello world, hello \LaTeX.
    5  \end{document}

Use nano to create a file first.tex as shown (without the line numbers, which are only there for reference). Good idea: Do this in a new directory.

line 1    A comment, from % till end of line. This is ignored.
line 2    The \documentclass sets a couple of defaults, and provides commands, e.g., to form sections, specify paper size, ...
line 3,5  The body of the document goes into the document environment. This defines the visible contents of the document.

• Commands are introduced by a backslash \, arguments are often passed in curly braces {·}.
• Everything before \begin{document} is called the preamble. That’s where extensions are loaded and commands can be defined.

Compiling your first LaTeX document

pdflatex file.tex
    Compile PDF document file.pdf from file.tex.

    ~/foo $ pdflatex first.tex
    This is pdfTeX, Version 3.1415926-1.40.10 (TeX Live 2009/Debian)
     entering extended mode
    # skipping some of the rather lengthy output
    Output written on first.pdf (1 page, 21210 bytes).
    Transcript written on first.log.
    ~/foo $ ls
    first.aux first.log first.pdf first.tex

Have a look at your PDF document:

    Hello world, hello LaTeX.
• One of the very few occasions we make use of a graphics display.
• If you have compiled on your local machine, use any PDF viewer you like. (xpdf, acroread, evince, gv, ...)
• If you have compiled on titan07, you need to get the file first...

4.4 Copying files from titan07 to a Windows host

• With ssh, you can log in to a remote machine, and work there. However, the data stays on that machine.
• To transfer files between two machines, you need a different tool: Secure Copy (scp). Actually, you can get away with just ssh...
• ssh and scp both use the SSH protocol: To authenticate yourself, use the same credentials as for ssh.

WinSCP [28] is a free SCP client for Windows. A series of screenshots is in the lecture’s public repository [29].

[28] http://winscp.net/
[29] https://svn.uni-konstanz.de/dbis/sq_15w/pub/winscp.zip

Secure Copy from the Command Line (Linux & Mac OS)

scp user@host:file dest
    Copies file to destination, just like cp does. The prefix user@host: indicates a remote filename or directory.

    $ scp [email protected]:first.pdf .          # note the dot
    Password:                                   # RZ-password
    first.pdf            100%   21KB  20.7KB/s   00:00
    # Now the file first.pdf has been copied to my computer.

And this also works in the other direction:

    $ scp first.pdf [email protected]:           # note the colon!
    Password:                                   # RZ-password
    first.pdf            100%   21KB  20.7KB/s   00:00

4.5 Umlauts

By default, LaTeX does not understand umlauts. This source

    \documentclass{article}
    \begin{document}
    Böse Überraschung?
    \end{document}

renders as

    Bse berraschung?

With the escape sequences \"a, \"o, \"u, and \ss, the source

    \documentclass{article}
    \begin{document}
    B\"ose \"Uberraschung?\\
    \"a \"o \"u, \"A \"O \"U, \ss
    \end{document}

renders as

    Böse Überraschung?
    ä ö ü, Ä Ö Ü, ß

The same output is produced after loading the inputenc package:

    \documentclass{article}
    \usepackage[utf8]{inputenc}
    \begin{document}
    Böse Überraschung?\\
    ä ö ü, Ä Ö Ü, ß
    \end{document}

Note: For the latter to work, make sure to save your files with UTF-8 encoding. We’ll cover text encodings later on in this course.

4.6 Sections

The article class provides commands to structure your document into \sections, \subsections, \subsubsections, and \paragraphs.

    \documentclass[a4paper]{article}      % Note the setting of the paper size
    \begin{document}

    \section{First Section}
    Lines of text are wrapped as necessary.  Newlines are ignored, but an
    empty line (i.e., two or more newlines) starts a new paragraph.

    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
    eiusmod tempor incididunt ut labore et dolore magna aliqua.

    \subsection{Subsection One}
    Note, that section numbering happens automatically.

    \subsubsection{Sub-Sub-Section}
    Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
    nisi ut aliquip ex ea commodo consequat.

    \paragraph{Paragraphs}
    Further division into paragraphs is possible, so in total you have
    five levels to structure your content.

    \end{document}

The previous code renders into this:

    1 First Section

    Lines of text are wrapped as necessary. Newlines are ignored, but an empty line (i.e., two or more newlines) starts a new paragraph.

    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
    1.1 Subsection One

    Note, that section numbering happens automatically.

    1.1.1 Sub-Sub-Section

    Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

    Paragraphs  Further division into paragraphs is possible, so in total you have five levels to structure your content.

4.7 List-like environments

You can use the environments itemize and enumerate to generate lists.

    \begin{enumerate}
    \item item one
    \item item two
      \begin{enumerate}
      \item aaa
      \item bbb
      \end{enumerate}
    \end{enumerate}

    \begin{itemize}
    \item This is an item,
    \item one with children
      \begin{itemize}\item a\item b\end{itemize}
    \end{itemize}

This renders as:

    1. item one
    2. item two
       (a) aaa
       (b) bbb

    • This is an item,
    • one with children
        – a
        – b

(Here, I have only shown the document body, not the preamble. If you’re formatting text, it goes into the body. Everything else goes into the preamble.)

4.8 Math environments

• LaTeX offers a rich language to describe math formulae.
• To use that language, you need to switch LaTeX into math mode.

Math mode?

• All text is considered to be variables, operators, or commands, and typeset according to the structure of the formula, rather than the source code.
• Spaces do not appear in the output. You’ll need special commands for spacing, but LaTeX is doing a good job already.
• Also, line breaks are sort of non-significant. But you are not allowed to have empty lines in math mode.

Rendered example:

    Show, that (a² − b²)/(a − b) = a + b holds for real numbers a, b.
For inline formulae, use the math environment:

    Show, that
    \begin{math}
      \frac{a^2-b^2}{a-b} = a+b             % \frac{a}{b} renders a/b
    \end{math}
    holds for real numbers \begin{math}a, b\end{math}.

This can be abbreviated with \(·\),

    Show, that \(\frac{a^2-b^2}{a-b} = a+b\) holds for real numbers \(a, b\).

or even $·$

    Show, that $\frac{a^2-b^2}{a-b} = a+b$ holds for real numbers $a, b$.

Also on this slide: Superscript a^b, fraction \frac{a}{b}.

Use displaymath to set larger formulae in their own paragraph.

    ...non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
    \begin{displaymath}
      \sum_{i=0}^n a_i b_{n-i} = a_0 b_n + a_1 b_{n-1} + \dots + a_{n-1}b_1+ a_nb_0
    \end{displaymath}
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod...

In the output, the formula is set centered in a paragraph of its own, between the two pieces of text:

    ∑_{i=0}^{n} a_i b_{n−i} = a_0 b_n + a_1 b_{n−1} + ... + a_{n−1} b_1 + a_n b_0

Again, this can be abbreviated, with \[·\]. You may come across the alternative $$·$$, but it might confuse the compiler. Do not use.

Also on this slide: Subscript a_b, grouping with {·}, big sum \sum.

The equation environment is just like displaymath, but displays a running number on the right. Use \label and \ref for references.

    Equation \ref{los} is known as the \emph{Law of sines}.
    \begin{equation}
      \frac{a}{\sin\alpha} = \frac b{\sin\beta} = \frac c{\sin\gamma}
      \label{los}
    \end{equation}

This renders as:

    Equation 1 is known as the Law of sines.

        a / sin α = b / sin β = c / sin γ        (1)

Note that you have to run LaTeX twice for references to work!
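The inline, display, and numbered forms can be combined in one small document. The following is only a sketch that reuses formulae from these slides; the label name gauss and the short derivation of the fraction (which needs a ≠ b) are additions for illustration. As noted above, pdflatex has to be run twice before \ref{gauss} shows a number.

```latex
\documentclass[a4paper]{article}
\begin{document}

% inline math: \( ... \)
For \(a \neq b\) we have \(a^2-b^2 = (a-b)(a+b)\), and therefore
% display math without a number: \[ ... \]
\[
  \frac{a^2-b^2}{a-b} = \frac{(a-b)(a+b)}{a-b} = a+b .
\]
% numbered display math, referenced by its label
Equation \ref{gauss} gives the sum of the first \(n\) positive integers.
\begin{equation}
  \sum_{i=1}^n i = \frac{n\cdot(n+1)}{2}
  \label{gauss}
\end{equation}

\end{document}
```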
Also on this slide: Trigonometric function \sin, Greek letters \alpha..., emphasis \emph.

Rendered output:

    sin α = opposite / hypotenuse    (1)
    cos α = adjacent / hypotenuse    (2)
    tan α = opposite / adjacent      (3)

    Using 1 and 2 in 3 yields tan α = sin α / cos α.

The eqnarray environment allows you to align multiple formulae at an operator surrounded by ampersands &. The lines are separated by \\.

    \begin{eqnarray}
      \sin\alpha &=& \frac{\textrm{opposite}}{\textrm{hypotenuse}} \label{sin} \\
      \cos\alpha &=& \frac{\textrm{adjacent}}{\textrm{hypotenuse}} \label{cos} \\
      \tan\alpha &=& \frac{\textrm{opposite}}{\textrm{adjacent}}   \label{tan}
    \end{eqnarray}
    Using \ref{sin} and \ref{cos} in \ref{tan} yields
    \(\tan\alpha=\frac{\sin \alpha}{\cos\alpha}\).

Also on this slide: \cos, \tan, font style \textrm to distinguish variables from text.

4.9 Parentheses

With \left(·\right), you can build parentheses that adapt to their contents.

Rendered output: f((a² + ε)/x) = f((a² + ε)/x) and {x/y | x ≤ y} = {x/y | x ≤ y}. In each equation the delimiters on the left-hand side have a fixed size, while those on the right-hand side grow to the height of the fraction.

    f(\frac{a^2+\epsilon}{x}) =
    f \left( \frac{a^2+\epsilon}{x} \right)
    \qquad                          % make a larger gap between them.
    \{ \frac xy | x\le y \} =       % note: \{ instead of {
    \left\{ \frac xy \middle| x\le y \right\}

\left and \right...
- ...work with a number of characters, e.g., ()[]{} and |.
- ...must be balanced. To hide one, specify a dot . instead (cf. the cases example in 4.10).
- With \middle, you can add punctuation without leaving the group.

Also on this slide: Create brace with \{, a gap with \qquad.

4.10 Arrays

You can use the array environment to create matrices.
The argument, here {cccc}, specifies number and alignment of the columns (right, left, or centered).

Rendered output:

    [ a11  a12  ···  a1n ] T
    [ a21  a22  ···  a2n ]
    [  ⋮    ⋮    ⋱    ⋮  ]
    [ am1  am2  ···  amn ]

Source:

    \left[
    \begin{array}{cccc}
    a_{11} & a_{12} & \cdots & a_{1n} \\
    a_{21} & a_{22} & \cdots & a_{2n} \\
    \vdots & \vdots & \ddots & \vdots \\
    a_{m1} & a_{m2} & \cdots & a_{mn} \\
    \end{array}
    \right]^T

Again, columns are separated by ampersands &, and lines by \\.

Also on this slide: Create ellipses using \vdots, \ddots, and \cdots.

But you can also use array to align arbitrary formulae. Note the {rcl}.

Rendered output:

    42x = 2a + b
      y = 3a + 2b − 42x
        = 3a + 2b − (2a + b)
        = a + b

Source:

    \begin{array}{rcl}
    42x &=& 2a + b \\
    y   &=& 3a + 2b - 42x \\
        &=& 3a + 2b - (2a + b) \\
        &=& a + b
    \end{array}

Particularly useful: Writing down cases.

Rendered output:

    c_{n+1} = { ½ c_n       if c_n is even
              { 3 c_n + 1   otherwise

Source:

    c_{n+1} = \left\{
    \begin{array}{ll}
    \frac{1}{2} c_n & \textrm{if \(c_n\) is even} \\
    3 c_n + 1       & \textrm{otherwise}
    \end{array}
    \right.

4.11 More math symbols

- Set theory stuff

  Rendered: x ∈ A, x ∉ ∅ and A ∪ B ∩ C × D \ E

      x \in A, x \not\in \emptyset
      A \cup B \cap C \times D \setminus E

- Logics & Arrows

  Rendered: ∀x.∃y.x ↦ y, then ¬A ∨ B ∧ C, then ⇒ ⇐ ⇔ → ← ↔

      \forall x. \exists y.
      x \mapsto y
      \neg A\vee B\wedge C
      \Rightarrow \Leftarrow \Leftrightarrow
      \rightarrow \leftarrow \leftrightarrow

- Character decorations

  Rendered: x with an arrow, x with a dot, x with a bar, ≠

      \vec{x}, \dot{x}, \bar{x}, \not=

- Comparing things

  Rendered: ≮, ≤, ≥, ≡, ∼, ≁

      \not<, \le, \ge, \equiv, \sim, \not\sim

For more advanced math, you may want to use the AMS-LaTeX packages provided by the American Mathematical Society.

Rendered output: ℂ, ℕ₀, ℚ⁺

    \documentclass{article}
    \usepackage{amssymb}      % Note this!
    \begin{document}
    \[
      \mathbb C, \mathbb{N}_0, \mathbb{Q}^+
    \]
    \end{document}

- AMS mathematical facilities for LaTeX. http://ctan.org/pkg/amsmath
- Scott Pakin. The Comprehensive LaTeX Symbol List. http://ctan.org/pkg/comprehensive

4.12 A stub for exercises

Rendered output:

    Basic Math                                        Summer 1932
                      Solution of Exercise 1
                 Chestnut Barrow, Elizabeth Parker

    Assignment 1  Let r ≡ m mod n. With the assumptions given in the assignment, we have

        ∃x, y, z. n = t · x ∧ r = t · y ∧ m = z · n + r

    With these, we conclude m = z · t · x + t · y by using the equations for n and r, and then m = t · (z · x + y) ⇒ t|m.

Let us define a stub that you can use to prepare your exercises. It should render the above output for a fictional Basic Math lecture held in the thirties.

    \documentclass[a4paper]{article}

    \usepackage[utf8]{inputenc}
    \usepackage{amssymb}
    \usepackage{a4wide}

    \begin{document}
    % filled in on the next slides
    \end{document}

Option a4paper sets the paper size to DIN A4 instead of US letter. Package a4wide redefines the margins to utilize more of the available space.
Rendered output:

    Basic Math                                        Summer 1932
                      Solution of Exercise 1
                 Chestnut Barrow, Elizabeth Parker

Source:

    \begin{center}
      \textbf{Basic Math \hfill Summer 1932}\\
      \bigskip
      \textbf{\LARGE Solution of Exercise 1}\\
      \emph{Chestnut Barrow, Elizabeth Parker}
    \end{center}

The header is set in a center environment. Note how \hfill adds rubber space in the first line, and \bigskip introduces extra vertical space.
- \textbf sets its argument in bold face.
- The available font sizes are \tiny, \scriptsize, \footnotesize, \small, \normalsize (the default), \large, \Large, \LARGE, \huge, and \Huge. Their effect extends until the end of the group.

Rendered output:

    Assignment 1  Let r ≡ m mod n. With the assumptions given in the assignment, we have

        ∃x, y, z. n = t · x ∧ r = t · y ∧ m = z · n + r

    With these, we conclude m = z · t · x + t · y by using the equations for n and r, and then m = t · (z · x + y) ⇒ t|m.

Source:

    \paragraph{Assignment 1}

    Let \(r \equiv m~\textrm{mod}~n\).  With the assumptions given in the
    assignment, we have

    \[
      \exists x, y, z.~  n = t\cdot x
                 \wedge  r = t\cdot y
                 \wedge  m = z\cdot n + r
    \]

    With these, we conclude \(m = z\cdot t\cdot x + t\cdot y\) by using the
    equations for \(n\) and \(r\), and then
    \(m = t\cdot (z\cdot x + y) \Rightarrow t|m\). \hfill\(\square\)

Also on this slide: Use ~ to introduce spaces, e.g., around mod. The \square needs to be set in math mode.

4.13 Formatting hints

More on spaces

Since a dot . marks the end of a sentence, a slightly larger space is added after it than between words. Sometimes, this is not intended.
    end. Mr. Jack    \\   % second space too large
    end. Mr.\ Jack   \\   % better, allow line break
    end. Mr.~Jack    \\   % correct form
    end. Mr, Jack    \\   % comma for reference

Use backslash-space \ to insert a normal-width space which allows line breaks, and a tilde ~ for a non-breaking space.

In math mode you can choose from the following spaces:

    q. \! .a. .b. \, .c. \: .d. \; .e. ~ .f. \ .g. \quad .h. \qquad .i

Typesetting of function names

Use clear formatting hints to convey a message! [30]

(Rendered output: a table of four rows and five columns, showing each of the expressions below typeset plainly, with explicit space, with parentheses, and with the two font variants.)

    \begin{array}{lllll}
    ax + fx & a~x+f~y & ax(y+1)  & \textsf{ax}(y+1) & \textrm{ax}(y+1) \\
    log y   & log~y   & log(y+1) & \textsf{log}~y   & \textrm{log}~y   \\
    bar y   & bar~y   & bar(y+1) & \textsf{bar}~y   & \textrm{bar}~y   \\
    main t  & main~t  & main(t)  & \texttt{main}~t  & \texttt{main}(t)
    \end{array}

- When do you see a function name, when a product or application?
- Explicitly add space to separate variables.
- Use \textrm or \textsf for math function names.
- Use \texttt to refer to an implementation.

[30] Most of these function names are predefined in the amsmath packages.

4.14 Defining Commands

In the document preamble you can define your own commands.
- \newcommand{\name}{definition} creates a new command \name using the definition.
  • Useful as a shorthand for frequently used symbols.
  • Use suggestive names that describe concepts.
  • If you later want to change the symbol consistently in your document, it's much simpler to change a definition.

    \documentclass{article}
    \usepackage{amssymb}
    \newcommand{\Nat}{\mathbb{N}}
    \newcommand{\union}{\cup}
    \begin{document}
    \[ x\in \Nat\union M \]
    \end{document}

Rendered output: x ∈ ℕ ∪ M

- \newcommand does not allow you to redefine existing commands. You may do this with \renewcommand{...}{...} instead. (A bad idea unless you really know what you are doing!)

Sometimes you will want to pass arguments to a command to get something more flexible.
- \newcommand{\name}[num]{definition} defines a command that consumes num arguments. (This also works with \renewcommand.)
- Within the definition, you can refer to these arguments by their position as #1 through #9, so num must be ≤ 9.

    \newcommand{\set}[2]{ \left\{ #1 \,\middle|\, #2 \right\} }

    \begin{document}
    \[ \set{x}{x\in A,\, \exists y.~ y\in A \,\wedge\, y \ge x} \]
    \[ \set{\frac{2x+1}{y}}{x+y=33,\, y>0} \]
    \end{document}

Rendered output:

    {x | x ∈ A, ∃y. y ∈ A ∧ y ≥ x}
    { (2x + 1)/y | x + y = 33, y > 0 }

Environments

With \newenvironment{name}[num]{begin}{end} you can even define your own environments. (There also is \renewenvironment.)
- begin is used before, end is used after the environment's content.
- The arguments #1 through #9 can only be used in the definition of begin.

Rendered output:

    Text for Reference.
    ⇒ Left margins are aligned.
    ⇒ Items are pretty dense.

Source:

    \newenvironment{mylist}{
      \begin{list}{\(\Rightarrow\)}{
        \itemsep -1ex
        \leftmargin 0ex}
    }{ \end{list}
    }

    \begin{document}
    Text for Reference.
    \begin{mylist}
    \item Left margins are aligned.
    \item Items are pretty dense.
    \end{mylist}
    \end{document}

Also on this slide: The generic list environment.

4.15 Maintaining a Bibliography

Rendered output:

    Here [Knu68] you should see a reference to the item below. If there is a question mark, the reference was not resolved!

    References
    [Knu68] Donald E. Knuth. The Art of Computer Programming, Volume I: Fundamental Algorithms. Addison-Wesley, 1968.

With BibTeX, you can maintain a database of bibliographic references.
- \bibliographystyle sets how references should be represented.
  • Try plain instead of alpha.
  • Publishers frequently require you to use a given style.
- \cite{key} generates a reference to an item in the database, identified by a key.
- \bibliography{file} creates the bibliography, based on the database in file.bib (cf. The BibTeX database, below).

    \bibliographystyle{alpha}

    \begin{document}
    Here \cite{knuth} you should see a
    reference to the item below.  If
    there is a question mark, the
    reference was not resolved!

    \bibliography{database1}
    \end{document}

- Note: The key is "knuth" here.
- Only cited references will be added to the document.

The BibTeX database

BibTeX uses its own format to specify bibliographic data sets. Together with a style specification, these are rendered into the references that appear at the end of your document.

    @book{knuth,
      author    = {Donald E. Knuth},
      title     = {The Art of Computer Programming,
                   Volume I: Fundamental Algorithms},
      publisher = {Addison-Wesley},
      year      = {1968},
    }

- For this example, I've used a small BibTeX database [31] with only one entry. Note the key "knuth" in the first line, identifying this entry.
- When reading papers, you will probably aggregate your own collection of bibliographic references.
- Some web sites [32] list publications with their BibTeX entries. You'd rather cut'n'paste these than create your own.

[31] https://svn.uni-konstanz.de/dbis/sq_15w/pub/latex-examples/database1.bib
[32] e.g., http://www.dblp.org, or http://www.citeseer.com/, ...

Making it

- In general you will need multiple runs of pdflatex and bibtex in the order given below (biblio.tex [33] is the document to be TeXed):

    $ pdflatex biblio.tex   # Among others, creates the biblio.aux file.
    # ... there is way more output here.  Look out for this one:
    LaTeX Warning: There were undefined references.
    $ bibtex biblio.aux     # Creates biblio.bbl, containing the bibliography.
    $ pdflatex biblio.tex   # Adds the bibliography to the document.
    LaTeX Warning: There were undefined references.
    $ pdflatex biblio.tex   # Finally, this gets the references right.

- To see the effect, have a look at the generated biblio.pdf document after each run of pdflatex.
- To ease the pain, there's the script latexmk, automating this process:

    $ latexmk -pdf biblio.tex   # Repeatedly run bibtex and pdflatex as needed.

  You'll find more documentation in the man page latexmk(1L).

[33] https://svn.uni-konstanz.de/dbis/sq_15w/pub/latex-examples/biblio.tex

4.16 Document classes

- Up to now, we have only used the article document class for examples and for the stub for exercises (cf. 4.12). Other commonly used document classes are report and book.
- Less mainstream but feature-rich document classes are
  • KOMA-Script, which provides alternatives to the standard classes: scrartcl, scrreprt, and scrbook. They implement European (in particular German) typography conventions.
  • memoir: Can be used for books as well as for theses.
    Its manual has more than 600 pages.
  • beamer: Used to create slides (like this document).
- Sometimes a publisher will provide you with a document class that you have to use.
- There are lots of specialized classes like exam, minutes, moderncv, ... [34]

[34] http://texcatalogue.ctan.org/bytopic.html#classes, ...#theses, ...#present

Differences

These document classes mainly differ in two aspects [35]:
- Available commands and environments (hard to change):
  • book and report feature the \chapter command, but article does not.
  • book and report start a new page for a \part, while article does not.
  • Only article and report offer the abstract environment.
- Default settings (easy to change):
  • book uses the twoside option (different margins for even and odd pages, etc.), article and report default to oneside.
  • book and report enumerate figures, tables, etc. per chapter, while article does it continuously.

Which class?

Select the document class according to your needs:
- Use article for a short paper, report for a longer document, and book for very large documents.
- These standard classes only provide the bare minimum. Have a look at the alternatives listed above for more features.

[35] A nice overview is here: http://tex.stackexchange.com/questions/36988/

4.17 Further reading

Packages and extensions (CTAN is the Comprehensive TeX Archive Network)
- beamer: A LaTeX class for producing presentations and slides. http://www.ctan.org/pkg/beamer
- pgf: Create PostScript and PDF graphics in TeX. http://www.ctan.org/pkg/pgf

Getting help (also cf. the references in 4.11)
- Tobias Oetiker. The not so Short Introduction to LaTeX. http://www.ctan.org/tex-archive/info/lshort/english/
- Helmut Kopka. LaTeX: Eine Einführung.
  ISBN 3-89319-434-7.
- Hypertext Help with LaTeX @ Goddard Institute for Space Studies. http://www.giss.nasa.gov/tools/latex/
- StackExchange on TeX. http://tex.stackexchange.com/
- Examples of sophisticated diagrams made with PGF and TikZ. http://www.texample.net/tikz/examples/

5 Some more shell basics

5.1 Globbing

The shell expands wildcard characters when they appear on the command line. This is called globbing in Unix jargon.

    $ ls
    first.aux  first.pdf  second.aux  second.pdf
    first.log  first.tex  second.log  second.tex
    $ ls first*
    first.aux  first.log  first.pdf  first.tex
    $ ls *tex
    first.tex  second.tex
    $ ls *[ca]*[ax]*
    first.aux  second.aux  second.tex

*       Matches any sequence of characters.
?       Matches any single character.
[list]  Matches any single character in the list.
- You may specify ranges as in [0-9] instead of [0123456789].
- If the first character after [ is a ^, the matching is inverted.
(For an in-depth description cf. bash(1), Pattern Matching)

    $ ls *t
    test
    $ ls .*t
    .bash_logout  .lesshst

- Dotfiles are omitted from expansion, unless the dot is explicitly given in the pattern.
- Note that the expansion is performed by the shell, not by ls (or any other command), so the command normally never sees the pattern.
- However, if no matching files are found, the pattern is handed to the command unchanged.

    $ ls *tex miss*
    ls: cannot access miss*: No such file or directory
    first.tex  second.tex

You can force bash to print an error message instead of running the command by setting the shell's failglob option.

shopt [-s | -u] name
    Set, unset, or print shell option name.

    $ shopt -s failglob
    $ ls *tex miss*
    bash: no match: miss*
    # (bash complains, ls is not even executed)

5.2 Permanent settings

- Settings like shopt -s failglob are lost when the shell terminates.
- You can make changes permanent by creating a file ~/.bashrc, which is read and executed during shell startup. (cf. bash(1), Invocation)
- Unfortunately, due to long-standing bugs in the bash-completion package, it is necessary to also disable "Programmable Completion":

    shopt -u progcomp    # disable buggy programmable completion

  Later we will see another shell option that also triggers this bug. (Even better: If the box is under your control, uninstall bash-completion.)
- You may also want to disable History Expansion, which makes ! behave funny (or read bash(1), History Expansion):

    set +H               # disable history expansion

5.3 Quoting

I want to delete the empty files, the other two are important:

    $ ls -l
    total 8
    -rw------- 1 pop09951 domain_users  0 Feb 20 18:57 *
    -rw------- 1 pop09951 domain_users 29 Feb 20 18:57 hello
    -rw------- 1 pop09951 domain_users  0 Feb 20 18:57 hello world
    -rw------- 1 pop09951 domain_users 29 Feb 20 18:57 world
    $ rm * ^C              # Plan I: What would happen here?
    $ rm hello world ^C    # Plan II: Is this any better?

- How can we tell the shell that a space is part of an argument (e.g., a filename), instead of an argument delimiter?
- How can we avoid expansion of wildcard characters?
⇒ Quoting is a mechanism to give some text a different meaning.
    $ rm \* 'hello world'    # the asterisk and the space appear quoted
    $ ls -l
    total 8
    -rw------- 1 pop09951 domain_users 29 Feb 20 18:57 hello
    -rw------- 1 pop09951 domain_users 29 Feb 20 18:57 world

The shell knows three quoting mechanisms: (cf. bash(1), Quoting)
- An unquoted backslash \ serves as the escape character, i.e., it preserves the literal value of the following character [36].
- Single quotes '·' preserve the literal values of the enclosed characters.
  ⇒ The backslash does not work here.
  ⇒ You cannot put a ' inside single quotes!
- Double quotes "·" preserve the literal values of most of the enclosed characters. Exceptions are the double quote ", dollar $, backtick `, and backslash \, all of which can be escaped with a (now quoted) backslash.

    $ echo \* \n \\
    * n \
    $ echo '* said "hello"\n'
    * said "hello"\n
    $ echo "* said \"hello\"\n"
    * said "hello"\n
    $ echo single\'quote
    single'quote
    $ echo "single'quote"
    single'quote
    $ echo 'single'\''quote'
    single'quote

[36] Exception: A backslash followed by a newline is removed completely, instead of being replaced with a newline.

- Think of ' and " as quoting toggles.

    $ ls -l hel'lo wor'ld    # unusual, but works
    -rw------- 1 pop09951 domain_users 0 Feb 20 13:49 hello world

  • The first ' switches strong quoting on, the second one switches it off.
- Some metacharacters have a special meaning for the shell, and therefore must be quoted to get their literal value:

    ! " # $ & ' ( ) * ; < > ? [ \ ] ` { | } ~

  Some of them do not always have a special meaning, but it's safe to always quote them.
- Double quotes " are also known as weak quotes, since ! $ \ ` retain their special meaning. Single quotes ' are known as strong quotes.
- Do not confuse the single quote ' (ASCII 0x27) with the back tick ` (ASCII 0x60, aka. backwards quote).

5.4 Links, hard and soft

A link is like a copy, but stays in sync with the original.
- Soft links, aka. symbolic links, aka. symlinks, refer to a target specified by its path, which may be relative or absolute.
  • Symlinks can span different file systems.
  • A dangling symlink refers to a non-existing file.
- A hard link is just a(nother) name in the file system for a file on disk.
  • You cannot distinguish the "original" from the link.
  • Removing the last hard link to a file removes the file. (That's why deleting a file is frequently referred to as unlinking it.)
  • You cannot hard-link directories, nor files on different file systems.

ln [-s] source [dest]
    Link file source to destination.
- If dest is a directory, create link under it.
- If dest is omitted, create link in current directory.
- With -s, create symlink instead of hard link.

Symlink example

    $ cat foo                 # assume we have only this file
    hello
    $ ln -s foo bar           # create symlink bar
    $ ls -l
    total 4.1k
    lrwxrwxrwx 1 sk users 3 Oct  1 22:24 bar -> foo    # note link flag, and arrow
    -rw------- 1 sk users 6 Oct  1 22:24 foo
    $ cat bar
    hello                     # so bar is dereferenced to foo
    $ nano bar                # Edits to the symlink...
    $ cat foo                 # ...are applied to the target
    hello world
    $ rm foo                  # if you remove the source...
    $ cat bar
    cat: bar: No such file or directory    # ...the symlink dangles
    $ ls -l
    total 0
    lrwxrwxrwx 1 sk users 3 Oct  1 22:24 bar -> foo

Hard link example

    $ cat foo                 # assume we have only this file
    hello
    $ ln foo bar              # create hard link bar
    $ ls -l
    total 8.2k
    -rw------- 2 sk users 6 Oct  2 23:19 bar    # note the link count
    -rw------- 2 sk users 6 Oct  2 23:19 foo
    $ cat bar
    hello                     # foo and bar are links to the same file
    $ nano foo
    $ cat bar
    hello world
    $ rm foo                  # you can remove any one you like...
    $ cat bar                 # ...the other one is still there
    hello world
    $ ls -l
    total 4.1k
    -rw------- 1 sk users 12 Oct  2 23:20 bar   # but the link count has changed

5.5 Aliases

Tired of typing ls -l all the time?

alias [name[=value]...]
    Shell builtin to define or display aliases.

- If the first word of a command is an alias, it is replaced with the corresponding definition.

    ~/foo $ alias ..='cd ..'    # There are no spaces allowed around the equal sign.
    ~/foo $ ..                  # replaced by cd ..
    ~ $

- Alias lookup is recursive, and automatically omits cycles:

    $ type ls                   # cf. page 30
    ls is aliased to `ls --color=auto'
    $ alias l='ls -l'
    $ l                         # colored output here

- Try set -x to see how the shell expands a command.
- Functions are even more versatile, e.g., you may pass arguments.
- To make aliases permanent, cf. 5.2 Permanent settings.
- For more on aliases and functions, cf. bash(1), as usual.
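The remark above that functions can take arguments, while an alias is only replaced textually at the start of a command, can be illustrated with a minimal sketch. The function name greet and its default value are made up for this illustration and are not part of the lecture material.

```shell
# greet is a hypothetical name, not a standard command.
greet() {
    # "$1" is the first argument; "${1:-world}" falls back to "world"
    # if no argument was given.  The argument ends up in the middle of
    # the output, which an alias cannot do.
    echo "Hello, ${1:-world}!"
}

greet          # no argument, uses the default
greet Unix     # passes one argument
```

Like aliases, such a definition is lost when the shell terminates unless it is added to ~/.bashrc.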
6 Text Editors

(Title image: http://xkcd.com/378/)

A painter's brush

    computer scientist : text editor = painter : brush

Editing text is probably the most common interactive task:
- programming
- writing papers
- building presentations
- writing email, letters, web pages, ...
- controlling computations (shell scripting, R, ...)
- querying databases (hacking XQuery, SQL, Prolog, ...)

Choose your editor well...
- ...a separate editor for each task? E.g., the nano editor emerged from pico, which was originally part of the pine mail user agent.
- ...or one to rule them all?

Features a text editor typically provides
- Modifying plain text
  • load & save files
  • cut & paste
  • search & replace
  • different text encodings (e.g., ASCII, UTF-8, Latin-1, ...)
- Assisting with programming tasks
  • syntax highlighting
  • integration with development tools (compiler, debugger)
- Opportunities for configuration and extension
  • rebind keys
  • scriptable/programmable

Features typically not available in a text editor:
- Formatting, as in "make selected text bold, underlined, red, ..."
- Page layout, including graphics, diagrams, ...
- Everything you'd expect from WYSIWYG DTP tools.

The two wizards

The term editor war refers to the rivalry between the emacs and vi families of editors, sometimes assuming features of a religious dispute.
⇒ It is almost only a matter of personal choice.
Disclosure: I am non-ideologically biased towards GNU emacs!

Considerations
- Where will I be editing? Is my editor likely to be available there?
  • There's probably no Eclipse on a dedicated network switch.
- What's my mode of connection? Is a GUI feasible at all?
- What will I be editing? Is there any support from the editor?
  • text encoding, eol-convention, available fonts (for Unicode)
  • syntax highlighting, spell checking, ...
  • integration with development tools (compiler, debugger, linter, ...)

6.1 vim, 1991 by Bram Moolenaar

- On the web: http://www.vim.org/.
- Documentation: http://vimdoc.sourceforge.net/.
- Designed as an extension of vi (1976 by Bill Joy [37]), but with a different code base.
  • vi started as a visual mode for ex, a line editor.
  • vi is standardised in Unix specifications.
  • There are many re-implementations of vi.
  • vim is often referred to as vi; on some platforms vi is an alias for vim. (On titan07 it is a symlink, which is quite usual.)
- Scriptable with vimscript.
- Plethora of extensions and scripts available.
- Baffling concept: Different Modes.

[37] Co-founder of Sun Microsystems

vim [file]
    Run the vim editor (opening file if given). Depending on the installation, typing vi may launch vim instead.

vimtutor
    Runs vim with a tutorial loaded. Try this, and work through a guided tour of basic commands.

- Leave vim or vimtutor, not saving anything:
  1. Hit Esc (maybe repeatedly) to enter Normal Mode.
  2. Type :qa! Return.
- Getting help: Hit F1, or type :help Return.

Exercise (vi)
1. Spend some time with the tutorial. Do not try to find out by trial and error how vi works.
2. For at least one month, try to use only vi for all text editing tasks. Try hard (i.e., read the manual, search the web) to solve all problems you encounter.

6.2 GNU emacs, 1976 by Richard Stallman [38]

- On the web: http://www.gnu.org/software/emacs/emacs.html.
- Initially implemented in TECO macros.
  • Later versions were written in Lisp,
  • and today it is a combination of C and Emacs Lisp.
  • Many reimplementations and forks, most notably XEmacs (where the X does not stand for X11 graphics).
- Scriptable with Emacs Lisp, a dialect that differs a lot from other Lisps.
- Plethora of extensions and scripts available.
- Baffling concept: Key Sequences.

[38] Founder of the Free Software Foundation and the GNU Project, started its C compiler and debugger, ...

emacs [file]
    Run the GNU emacs editor (opening file if given).

emacsclient [options] file
    Tries to connect to a running GNU emacs server process, and make it open a file. With -a '', launches a server if none is running. cf. emacsclient(1).

- Leave emacs, not saving anything:
  1. Hit C-g (maybe repeatedly) to cancel any incomplete operation.
  2. Hit C-x C-c or type M-x kill-emacs. (Depending on your terminal, M-x may be Esc x.)
- Getting help: Hit F1 r for the manual, or F1 t to run the tutorial.

Exercise (emacs)
1. Spend some time with the tutorial. Do not try to find out by trial and error how emacs works.
2. For at least one month, try to use only emacs for all text editing tasks. Try hard (i.e., read the manual, search the web) to solve all problems you encounter.

7 Pipelines

(Title image: https://en.wikipedia.org/wiki/File:Pipes_various.jpg)

7.1 Standard streams

Each Unix process initially is connected to three data streams:

stdin   standard input to read data. This is used to get your input, just like the shell does.
stdout  standard output to write data. This is where a program prints its output, like ls does.
stderr  standard error to write error messages. Also shows up in your terminal, interleaved with stdout.
E.g., cat(1) copies stdin to stdout if no arguments are given. Pressing C-d signals end of input [39].

    $ cat
    hello      # user input → stdin
    hello      # printed by cat → stdout
               # Press C-d on an empty line to exit

Redirection allows you to reassign source or target of a stream.
- Input can originate from files, or the output of other processes.
- Output can be saved to a file, or fed to the input of another process.

[39] http://unix.stackexchange.com/questions/110240/why-does-ctrl-d-eof-exit-the-shell

Redirecting output

    $ ls                      # no files here
    $ date
    Mon Sep 30 23:10:16 CEST 2013
    $ date > output
    $ ls
    output                    # new file
    $ cat output              # contains output
    Mon Sep 30 23:10:21 CEST 2013
    $ echo gone > output
    $ cat output
    gone                      # overwritten
    $ echo append >> output
    $ cat output
    gone                      # still there
    append

date
    Just prints the current time and date, cf. date(1).

- The greater-than > redirects stdout of the command date to the file output.
- If you redirect to an existing file, it is truncated [40] first.
- Use >> instead of > to append to a file.

[40] i.e., its content is deleted, subtly different from deleting the file and creating a new one.

- To avoid overwriting of an existing file with >, you may set the shell's noclobber option.

    $ echo hello > output     # (over)write file output
    $ set -C                  # set the noclobber option
    $ echo world > output     # try overwriting...
    -bash: output: cannot overwrite existing file    # ... fails
    $ echo world >> output    # appending is fine
    $ echo bye >| output      # force overwriting with >|...
    $ cat output              # ... works
    bye

  You can make this permanent by adding set -C to your ~/.bashrc.
- You may redirect to the null device /dev/null, to efficiently get rid of unwanted data.
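The interplay of >, >>, >| and the noclobber option described above can be replayed non-interactively. The following is a minimal sketch, assuming a POSIX-like shell with mktemp available; the variable names dir and result are chosen only for this illustration.

```shell
dir=$(mktemp -d)              # scratch directory, so no real files are touched
cd "$dir" || exit 1

echo first > output           # creates the file
set -C                        # noclobber: a plain > refuses to overwrite
if ( echo second > output ); then
    result=overwritten
else
    result=refused            # the shell prints an error message to stderr
fi
echo "plain > was $result"
echo third >| output          # >| overrides noclobber
echo fourth >> output         # appending is still allowed
date > /dev/null              # the output of date is discarded
set +C                        # back to the default behaviour
cat output                    # shows what survived
```

The attempt to overwrite runs in a subshell, so that the failing redirection can be tested with if regardless of how the shell treats redirection errors.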
  “Talk to /dev/null” is a nerd’s insult.

Redirect stderr

    $ ls missing exists > output     # only the file exists does exist
    ls: cannot access missing: No such file or directory
    $ cat output
    exists

• The error message is not written to file output, because stderr is not redirected.
• The three streams are associated with file descriptors (cf. open(2)):
      0 ← stdin      1 → stdout      2 → stderr
• The >file is an abbreviation for 1>file, which simply changes the association to 1 → file. Using this, redirecting stderr to a file is easy.

    $ ls missing exists 2> output
    exists
    $ cat output
    ls: cannot access missing: No such file or directory

• As usual, 2>> appends to file, 2>| forces overwriting (if set -C).

Multiple redirects

    $ ls missing exists 1> out 2> err
    $ cat err
    ls: cannot access missing: No     # ...
    $ cat out
    exists
    $ ls missing exists 1> out 2> out
    $ cat out
    exists
    t access missing: No such file    # ...
    $ ls missing exists 1> out 2>&1
    $ cat out
    ls: cannot access missing: No     # ...
    exists

• Straightforward: Redirecting the two output streams to different files.
• Associating both file descriptors with the same file does not work. ⇒ Data loss!
• m>&n redirects file descriptor m to the target of fd n.
  Note: There must be no space before the ampersand, i.e., not 2> &1
  So 2>&1 means: What’s written to fd 2 goes to where fd 1 is currently directed, which is file out in this case.

Order of redirections

    $ ls miss exists >file 2>&1
    $ cat file
    ls: cannot access miss: No such file# ...
    exists

First 1>file, then 2>&1.

    $ ls miss exists 2>&1 >file
    ls: cannot access miss: No such file# ...
    $ cat file
    exists

First 2>&1, then 1>file.

(Figure: in the first case, fd 1 and fd 2 both end up pointing to file; in the second case, fd 2 points to where stdout originally went, and only fd 1 points to file.)

• Note: Redirections are applied in the order they are specified!

Sources for stdin

Ummm..., how many LaTeX files do I have in this directory?

    $ ls *.tex >out
    $ wc -l out
    12 out                # for each file
    $ wc -l < out
    12                    # no filename
    $ rm out              # do not forget
    $ ls *.tex | wc -l
    12                    # no filename

wc [-l | -w | -c] [file...]
    Count words (-w), lines (-l), or bytes (-c) for each file, or for stdin if no files are given.

• <file sends file’s contents to stdin of the command.
• The | builds a pipeline of commands: stdout of the command on the left is connected to stdin of the command on the right.
• You can connect multiple commands to a long pipeline.
• Note that each command in a pipeline can have its own redirections!

    $ foo 2>foo.log | bar 2>&1 1>/dev/null | wc -l     # what would that do?

    $ wc <<< 'hello world'
    1 2 12
    $ wc <<stop
    > Lorem ipsum dolor sit amet,
    > consectetur adipisicing elit,
    > stop
    2 8 58

• A string can be fed to a command via stdin by the <<< operator. This is known as a here string.
• A similar concept, here documents, uses a key (here: stop) to delimit the data to pass:
  • <<delim passes the following lines to stdin of the command,
  • up to a line containing only the delimiter (no leading spaces).
• There are many more ways of redirection, we’ll see some of them later in this course.
• For an in-depth discussion, cf. bash(1) Redirections.

7.2 The usual suspects in a pipe

cat [file...]
    Concatenate all files and print to stdout.

head [-n n] [file...]
    Print the first 10 lines of each file. With -n, print n lines; with a negative argument, print all but the last n lines.

tail [-n n] [file...]
    Like head, prints the last 10 (or n) lines. Use +n to print lines starting with the nth.

tee [-a] file...
    Copy stdin to all files, and to stdout. With -a, append to the files instead of overwriting them.

less [file]
    Browse file, or stdin if no file is given.

• Where applicable, these programs accept a single hyphen - as “filename” to read from stdin. (e.g., to use ls -l | cat header.txt - footer.txt)
• The same tools read from stdin, if no input files are given.
• These conventions are quite handy, and many other tools follow them.

7.3 Sorting

Task: What are the five biggest files/directories in /etc/?

du -s [-h] file...
    Summarise disk usage for each file (or directory). Adding -h prints “human readable” numbers.

sort [options] [file...]
    Write a (line-wise) lexicographically sorted concatenation of all files to stdout. With options:
      -n   compare number at start of line
      -h   compare human-readable number at start of line
      -M   compare month name abbreviation (e.g., Jan) at start of line
      -r   reverse sort order
      -s   stable sort, i.e., keep lines in order if they compare equal.

• Main memory limits length of longest line, disk space limits size of files to sort. ⇒ Suitable for very large files.

(cf. du(1) and info sort)

    $ du -s -h /etc/* 2>/dev/null | sort -h -r | head -n 5
    17M     /etc/skel
    6.5M    /etc/mateconf
    3.7M    /etc/brltty
    1.2M    /etc/bash_completion.d
    980K    /etc/ssl

Remark on options and arguments

• A program sees its command-line arguments as a list of words. It is at its own discretion how to interpret them!
  Always refer to the manual.
• POSIX convention suggests the following:
  • Options consist of a hyphen plus one additional character, like ls -l.
  • If the option takes an argument, the space in between may be omitted: head -n 5 ≡ head -n5
  • Options that do not take arguments may be combined: du -s -h ≡ du -sh
  • A single hyphen refers to the corresponding standard I/O stream.
  • There are only operands after a double hyphen --. E.g., to create, list, and remove a file named -l.
• Most GNU tools additionally recognise long options of the form --long-name[=value].

(cf. http://www.gnu.org/software/libc/manual/html_node/Argument-Syntax.html)

    $ du -sh /etc/* 2>/dev/null | sort -hr | head -n5
    17M     /etc/skel
    6.5M    /etc/mateconf
    3.7M    /etc/brltty
    1.2M    /etc/bash_completion.d
    980K    /etc/ssl

Task: What’s the file system with the most free space?

df [-h]
    Report used and free disk space for each file system. Report sizes in human-readable form with -h. (cf. df(1))

sort can extract sort keys from sections of the input lines.
• By default, fields are separated by the empty string between a non-blank character and a blank character.
• Option -k start[,end][arg] uses the indicated area as sort key.
  • start and end refer to field numbers. If omitted, end is end of line.
  • Fields are counted from 1.
  • The arg specifies how to interpret the field, e.g., one of n numeric, h human-readable numeric, M month abbreviation, maybe r reverse, and others.

    $ df -h | tail -n+2 | sort -k4hr | head -n1
    /dev/sda1       434G   21G  391G   6% /

Task: Look at the Marathon year rankings file [41].
Cut off the header, and sort it by year.

    $ head marathon
    #title: Marathon year rankings
    #source: https://en.wikipedia.org/wiki/Marathon_Year_Rankings
    #date retrieved: Wed 2013-Apr-03 14:26:41 CEST
    #key: sex, time, athlete, athlete's nationality, date, city, country

    M, 2:30:57.6, Harry Payne, GBR, 1929-07-05, Stamford Bridge, England
    M, 2:5:42, Khalid Khannouchi, MAR, 1999-10-24, Chicago, USA
    M, 2:5:37.8, Khalid Khannouchi, USA, 2002-04-14, London, UK
    M, 2:4:48, Patrick Makau Musyoki, KEN, 2010-04-11, Rotterdam, Netherlands
    M, 2:10:47.8, Bill Adcocks, ENG, 1968-12-08, Fukuoka, Japan

• Patrick Makau Musyoki ⇒ varying number of white spaces (and thus of whitespace-separated fields) before the interesting field!
• You may choose any other delimiting character c with the -tc option.
• For testing, use the --debug option of sort.

[41] https://svn.uni-konstanz.de/dbis/sq_15w/pub/marathon

Testing:

    $ tail -n+6 marathon | head -n20
    $ tail -n+6 marathon | sort --debug -k6n | head -n20       # fails
    $ tail -n+6 marathon | sort --debug -k7n | head -n20       # fails
    $ tail -n+6 marathon | sort --debug -t, -k5n | head -n20

Solution:

    $ tail -n+6 marathon | sort -t, -k5n

Further remarks

• There’s nothing wrong with using multiple sort commands in one pipeline.
• A stable sort (-s) does not reorder lines that compare equal wrt. the specified keys.
• The option -u (unique) only lists the first (incoming) result for all lines that compare equal wrt. the specified keys.

More commands

uniq [input [output]]
    Filter adjacent matching lines from input (or stdin) to output (or stdout).
    • The whole line is compared, in contrast to sort -u, which compares only the key.
    • uniq can count the number of matching lines.

shuf [-n n] [file]
    Write a random permutation of the input lines to stdout.
    If specified, only print the first n lines.

Question: What is happening here?

    $ ls *.jpeg
    christchurch.jpeg   lake_te_anau.jpeg  tama_lakes.jpeg  wool.jpeg
    clearstream.jpeg    lava_desert.jpeg   the_gap.jpeg
    farewell_spit.jpeg  snowy_creek.jpeg   whale.jpeg
    $ ls *.jpeg | shuf -n1
    tama_lakes.jpeg

7.4 Filtering

Another very common task is to find all lines containing a certain string:

grep -F [-v] [-i] 'string' [file]
    Print all lines from file (or stdin) that contain the string. With -i, ignore case. With -v, invert the matching, i.e., print non-matching lines.

To show all entries about Patrick:

    $ grep -F -i patrick marathon
    M, 2:4:48, Patrick Makau Musyoki, KEN, 2010-04-11, Rotterdam, Netherlands
    M, 2:3:38, Patrick Makau Musyoki, KEN, 2011-09-11, Berlin, Germany

8 Regular Expressions

8.1 Regular Expression basics

• Searching for fixed strings is boring, i.e., not versatile enough.
• Find strings that feature a certain pattern:
  • Integral numbers (e.g., 123, not hello)
  • Times (e.g., 6:45pm, not 123)
  • Dates (e.g., 2012/Dec/21, not 6:45pm)

Regular Expressions describe strings. (Commonly abbreviated RE, rex, or regex)

• If a string s satisfies the description by a regular expression e, we say that s matches e, more formally, s ∼ e. (Or the other way round?)
  • 2012/Dec/21 ∼ “description of a date”
  • 2012/Dec/21 ≁ “description of an integer”
• The strings we want to describe are made of characters, drawn from a finite Alphabet A.
  For simplicity, instead of all possible characters, assume
      A = {a, ..., z, A, ..., Z, 0, ..., 9, /, :, ,, -}

• Any single character a ∈ A forms a Regular Expression, that is matched by exactly the string a. These are the simplest REs.
• If e1 and e2 are REs, then so is their concatenation e1 e2. It is matched by exactly the strings that are a concatenation of strings s1 and s2, with s1 ∼ e1, and s2 ∼ e2.
      a, n, s ∈ A  ⇒  a, n, s are REs  ⇒  an, ana, ..., ananas are REs
      ananas ∼ ananas
      anna   ≁ ananas
• (Up to now, we can only describe fixed strings, like ananas.)
• If e1 and e2 are REs, then so is their alternative e1|e2. It is matched by exactly the strings that match e1 or e2.
      plum, cherry are REs  ⇒  plum|cherry is a RE
      plum   ∼ plum|cherry
      cherry ∼ plum|cherry
      plerry ≁ plum|cherry
• Concatenation binds tighter (aka. has higher precedence) than alternative.
      2011 ≁ 2010|1|2|3
      2    ∼ 2010|1|2|3
• If e is a RE, then so is (e). Use this for grouping expressions, i.e., to override precedence:
      2011 ∼ 201(0|1|2|3)
      2    ≁ 201(0|1|2|3)
• If e is a RE, then so is its repetition e∗. It is matched by exactly the strings that are a concatenation of any number of strings all matching e.
      0|1|2|3|4|5|6|7|8|9 is a RE  ⇒  (0|1|2|3|4|5|6|7|8|9)∗ is a RE
      23    ∼ (0|1|2|3|4|5|6|7|8|9)∗
      0x17  ≁ (0|1|2|3|4|5|6|7|8|9)∗
      027   ∼ (0|1|2|3|4|5|6|7|8|9)∗
      (the empty string) ∼ (0|1|2|3|4|5|6|7|8|9)∗
• It is less confusing to give the empty string a name: ε, or null string. The RE matched by exactly the empty string is (), i.e., empty, and frequently also called ε, or null.
      ε ∼ ()
      x ≁ ()
      ε ≁ x
      ε ∼ (0|1|2|3|4|5|6|7|8|9)∗
• Note: ε is not whitespace!
  • The space character is not even in our toy alphabet: ␣ ∉ A.
  • If it was, i.e., ␣ ∈ A, then still ␣ ≁ (), because a string that contains a whitespace is not empty.

Exercise: Describe the Integers Z, disallow leading 0 if the value is not zero, and disallow empty strings.

      0 | (-|) (1|2|3|4|5|6|7|8|9) (0|1|2|3|4|5|6|7|8|9)∗

• At this point, you know all the building blocks of Regular Expressions.
• You will come across Regular Expressions quite often during your studies, and notation may vary. The basics are the same, though.

8.2 Syntactical sugar

There is a lot of syntax sugar that makes writing easier, but does not extend the power of REs: (Let e be a RE, n ∈ N, and c1, ..., cn ∈ A)

• e? := e|()    matching e is optional.
• e+ := ee∗     requires at least one match of e.
• e^{n,m}       e is matched at least n times, at most m times.
• .             the period (wildcard) is matched exactly by any single character.
• [c1 c2 ...] ≡ c1|c2|...    matched exactly by any single character in the character set {c1, c2, ...}. Exactly single lower-case vowels match [aeiou].
• [c1-c2] ≡ c1|...|c2        matched exactly by any single character in the character range c1...c2. (A must be ordered with c1 < c2 to make any sense) Exactly single upper-case letters match [A-Z].
• [^...]        matched by any character not in the described set, or range.

Exercise: Describe the Integers Z, as before.

      0|-?[1-9][0-9]∗

• There’s a lot more syntax sugar and notation to learn. But let’s get used to Regular Expressions first...
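The sugared solution 0|-?[1-9][0-9]∗ can be checked mechanically. The following is a minimal sketch, assuming a POSIX grep; it relies on grep -E -x, which is only introduced in section 8.3, and the candidate strings and the variable name matches are made up for illustration. In grep notation the repetition operator ∗ is written as a plain *.

```shell
# One candidate per line; -x demands that the whole line matches.
matches=$(printf '%s\n' 0 42 -7 007 -0 12a '' |
          grep -E -x '0|-?[1-9][0-9]*')
echo "$matches"
```

Only 0, 42, and -7 survive: 007 has a leading zero, -0 is a signed zero, 12a contains a non-digit, and the empty line is rejected because every alternative requires at least one character.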
8.3 grep: a regex filter

grep -E [-x] 'regex' [file]
    Print all lines from file (or stdin) that contain a string matching regular expression regex. With -x, only print lines that match in their entirety. (cf. info grep, and regex(7))

• Print all lines in the file containing ARG, BRA, or NZL.

    $ grep -E 'ARG|BRA|NZL' marathon     # cf. page 123 for the marathon file
    F, 2:26:47, Allison Roe, NZL, 1981-04-20, Boston, USA
    M, 2:33:19, Juan Zabala, ARG, 1931-10-28, Košice, SVK
    M, 2:14:4.8, Michael Ryan, NZL, 1966-11-27, Fukuoka, Japan
    M, 2:6:5, Ronaldo da Costa, BRA, 1998-09-20, Berlin, Germany

• Any word with two b characters in it?

    $ grep -E 'b[^ ]*b' marathon
    M, 2:12:11.2, Abebe Bikila, ETH, 1964-10-21, Tokyo, Japan
    M, 2:40:48.6, Charles Robbins, USA, 1944-11-12, Yonkers, New York, USA
    M, 2:15:16.2, Abebe Bikila, ETH, 1960-09-10, Rome, Italy

• Print only female runners. Are these correct?

    $ grep -E 'F' marathon
    $ grep -E 'F,' marathon

Anchoring

grep returns all lines that contain a matching string somewhere.
• Option -x returns only lines that match in their entirety.
• ^ matches the empty string at the beginning of a line.
• $ matches the empty string at the end of a line.
• \< matches the empty string at the beginning of a word.
• \> matches the empty string at the end of a word.

All female runners:

    $ grep -E '^F' marathon

    $ cat food
    baked apple
    applepie
    apple turnover
    pineapple
    $ grep -E 'apple\>' food
    # What do you expect here?
    $ grep -E 'apple$' food
    # And here?
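To compare your expectations with what grep actually does, the food example can be replayed. This is a sketch under two assumptions: GNU grep is used (the word anchors \< and \> are not required by POSIX), and the file food consists of the four lines baked apple, applepie, apple turnover, and pineapple. The temporary file and the variable names are made up for illustration; the third pattern is an extra that combines both word anchors.

```shell
# Recreate the sample file in a scratch location (hypothetical path).
food=$(mktemp)
printf '%s\n' 'baked apple' 'applepie' 'apple turnover' 'pineapple' > "$food"

word_end=$(grep -E 'apple\>' "$food")       # "apple" at the end of a word
line_end=$(grep -E 'apple$' "$food")        # "apple" at the end of a line
whole_word=$(grep -E '\<apple\>' "$food")   # "apple" as a complete word

printf 'apple\\>:\n%s\n\n'   "$word_end"
printf 'apple$:\n%s\n\n'     "$line_end"
printf '\\<apple\\>:\n%s\n'  "$whole_word"
rm -f "$food"
```

Note that the end of a line also counts as the end of a word, which is why the two anchors \> and $ agree on some lines and disagree on others.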
Key Competence in Computer Science · Winter 2015 137 8 · Regular Expressions grep — a regex filter · 8.3 Syntax You may have noticed that the characters we use for the alphabet in the theory section, are disjoint from the operators: A ∩ {|, ∗ , (, ), ...} = ∅ I In reality, the alphabet does contain characters like |, *, (, ), ... I What if we need to search for parenthesis? ⇒ We need to distinguish alphabet and operators: {|, *, (, ), ...} ∩ {|, ∗ , (, ), ...} = ∅ | {z } | {z } alphabet Stefan Klinger · DBIS operators Key Competence in Computer Science · Winter 2015 138 8 · Regular Expressions grep — a regex filter · 8.3 Characters and Metacharacters Characters that represent an operation are called metacharacters. They may be escaped with a backslash \ to get the literal character. RE operator ∗ + ? | (e) n,m Stefan Klinger · DBIS meta . * + ? | (e) {n,m} [c1 ...cn ] ^, $, \ grep -E literal \. \* \+ \? \| \(e\) [{]n,m}, \{n,m} \[c1 ...cn ] \^, \$, \\ To look for the literal string “[foo*]” I 1 $ grep -E '\[foo\*]' Note that the backslash needs to be escaped if used literally. I 1 $ grep -E -x "\\\\" What does this match? Key Competence in Computer Science · Winter 2015 139 8 · Regular Expressions grep — a regex filter · 8.3 grep’s argument -E stands for extended REs, as distinct from basic REs (without -E). -P switches to Perl Compatible Regular Expressions42 . RE operator ∗ + ? | (e) n,m I grep -E meta literal . \. * \* + \+ ? \? | \| (e) \(e\) {n,m} \{n,m} [c1 ...cn ] \[c1 ...cn ] ^, $, \ \^, \$, \\ grep (i.e., basic RE) meta literal . \. * \* \+ + \? ? \| | \(e\) (e) \{n,m} {n,m} [c1 ...cn ] \[c1 ...cn ] ^, $, \ \^, \$, \\ Option -F makes grep scan for a fixed string instead of a regular expression. 42 For more on PCREs, cf. 
pcrepattern(3)

Interesting fact

There is no way to check for the correct usage of parentheses in general (i.e., unlimited but finite depth of nesting):
• Regular Expressions cannot count. (You’ll learn this in your theoretical CS lectures)
• Hence it is not possible to count the number of opening parentheses.
• So if we see a ), we cannot know whether there have been enough (s before that.
• This applies to all kinds of nesting (e.g., XML tags).

Further RE syntax sugar (may not be available on all platforms, e.g. on Mac)
• \w matches a word constituent, \W matches a non-word constituent.
• \s matches a whitespace, \S matches a non-whitespace.

Exercise: Look at the first example on page 13, and find out how it works. You probably have to look at some man pages as well.

9 sed

    $ ls | sed -rn '/ /{h;s/ +/_/g;x
    > s/.*/mv -n "&"/;G;s/\n/ "/;s/$/"/;p}' | sh

“If it was hard to write, it should be hard to read” (unknown)

9.1 What is sed?

sed [-r] [-n] [-i] script [file...]
    Perform scripted edits on the lines read from input files, or stdin. With -r, use extended regular expressions. Unless -n is given, print every line to stdout.

Sed is a stream editor (typically used in pipes).
• A sed script specifies which operations to perform on each line.
  • Special script language to specify edit operations.
• It makes one pass over the input data (stdin or from a file).
  1. Read one line (into what’s called the pattern space).
  2. Run the sed-script on the pattern space.
  3. Print the result (unless -n is given).
  4. goto 1.
• The most common use case: Search and replace a regular expression. A lot of people only know sed’s substitute command.
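The read/run/print cycle can be made visible with a one-command script. The sketch below uses sed's p command (print the pattern space), which belongs to the command overview in section 9.4; the input lines and the variable names doubled and once are made up for illustration.

```shell
# Without -n: p prints the pattern space, and then the automatic print
# at the end of the cycle emits it again, so every line appears twice.
doubled=$(printf 'one\ntwo\n' | sed 'p')

# With -n the automatic print is suppressed; only the explicit p remains.
once=$(printf 'one\ntwo\n' | sed -n 'p')

echo "$doubled"
echo "---"
echo "$once"
```

The first run yields four lines (one, one, two, two), the second run two, which shows that step 3 of the cycle happens once per input line unless -n is given.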
Detailed documentation: sed(1), and info sed
Good tutorial: http://www.grymoire.com/Unix/Sed.html

9.2 Substitute: s

• The substitute command s/regex/replacement/flags replaces the first match of regex in each line with the replacement text.
• The separator / may be uniformly replaced by any other character. Whatever you choose, it must be escaped with a \ to appear inside the regex or replacement.
• With flag g, sed replaces every occurrence in a line [43].

    $ sed -r 's/, */;/g' marathon     # change field delimiters to ;

  If you give a number n instead, only the nth match is replaced.
• With flag p, sed will print a line right after the substitution occurred. Useful with -n, and with multiple commands in the sed-script, cf. page 145.

Tasks
  1. Delete everything before the name of a runner.
  2. Only print the names of the runners.

[43] To omit writing tail -n+6, we assume the header has been removed, cf. page 149.

1. Remove 2 ,-separated fields.

    $ sed -r 's/([^,]*, *){2}//' marathon

2. Use the previous solution, and strip everything after the next ,

    $ sed -r 's/([^,]*, ){2}//' marathon | sed 's/,.*//'

You may specify multiple commands to be applied to each line of input.
• Separate them with a newline, or a semicolon ;
• The commands are applied in the order given.

    $ sed -r 's/([^,]*, ){2}//;s/,.*//' marathon

    $ sed -r 's/([^,]*, ){2}//       # press Return here
    > s/,.*//' marathon

Reusing matched text

• Within the replacement text, you may use \n (a backslash followed by the number n) to refer to the portion of the match which is contained in the nth pair [44] of parentheses.
  • Another solution of the previous task thus is:

    $ sed -r 's/([^,]*, ){2}([^,]*),.*/\2/' marathon

• The special character & in the replacement text inserts the portion of the input line matched by the whole regular expression.
  • What does the following command do?

    $ sed -r 's/[0-9]+/(&)/5' marathon

  Of course, & may be escaped, as in \&.

[44] counted by the opening parenthesis

Real world example

    $ ls
    Abstract_Colors_Twirls.jpg  hüpfer.jpeg
    Clown_Fish.JPG              Quiet_Fields.jpg
    Dark_Sunset.jpeg            White_owl.jpg
    Forest_Landscape.jpeg       Wild_ducks.JPG

For all files in the current directory, change the suffix from .jpg to .jpeg.

Technique: First generate the commands to get the job done...

    $ ls | sed -rn 's/(.*)\.jpg$/mv -n & \1.jpeg/p'
    $ ls | sed -rn 's/(.*)\.jpg$/mv -n & \1.jpeg/pi'     # i-flag: ignore case

...then execute these commands! Idea: Pipe them into a shell:

    $ ls | sed -rn 's/(.*)\.jpg$/mv -n & \1.jpeg/pi' | sh

sh
    is the standard command language interpreter. On many systems, this may be just bash itself, or a leaner (faster, less comfortable) yet compatible (POSIX conformant) interpreter, e.g., dash. (We’ll see more shell scripting later in this course...)

Addressing lines

• s is just one of the many commands known to sed. Most of the other commands only make sense when restricted to few lines.
  • d (“delete a line”), when applied to all lines, leaves nothing.
• You can prefix each command with an address, which limits the command to act on addressed lines only.
  • Without address, all lines are affected.
• Addresses can be:
    n           A line number n, or $ for the last line.
    /regex/     A regex selects only lines containing a matching string.
    start,end   A range selects all lines from start to end inclusively. The limits can be line numbers, or REs.
    addr!
                Select all but the addressed lines.

Task: Delete lines 1–5 of the marathon file.

    $ sed 1,5d marathon

Inplace edit

With command line argument -i[suffix], it is possible to modify the input file, i.e., sed will replace the original with the result. If a suffix is given, a backup will be made, overwriting any existing backup! Use carefully!

    $ sed -i 1,5d marathon
    $ head marathon          # check out the result

9.3 How sed digests its input

(Figure, a simplified picture: input → 1 read → pattern space → 3 print → output; step 2 runs the script on the pattern space, which may exchange data with the hold space.)

1. Read one line into the pattern space (replacing its contents).
2. Apply the whole script to the pattern space (restarted each cycle):
   • Modify the pattern space contents (e.g., s).
   • Maybe exchange data with the hold space.
   • ...
3. Flush pattern space to stdout, unless suppressed with the -n option.

9.4 Command overview

d       delete pattern space, skip following commands, start next cycle.
p       print current contents of the pattern space. May occur multiple times, each printing current state.
n       print pattern space (unless -n is given), then replace it with next line of input. Do not restart script, rather continue with next command.
{commands}
        Group of ;-separated commands (useful: addressing).
s/regex/replacement/flags
        substitution, cf. page 144.
h / H   save pattern space to hold space / append to the hold space (with a newline in between) instead of overwriting.
g / G   get hold space into pattern space / append instead, as above.
x       exchange pattern space and hold space.
q       print pattern space (unless -n is given), then quit.

Many more commands, e.g., case conversion, file access, flow control, ...
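How the hold space interacts with addresses is easiest to see in a tiny script. The following sketch reverses the order of its input lines; it is a well-known sed idiom and not taken from the slides, and the input lines and the variable name reversed are made up for illustration.

```shell
# 1!G  on every line but the first: append the hold space to the pattern space
# h    copy the pattern space (current line plus all earlier lines,
#      newest first) to the hold space
# $p   on the last line: print the accumulated result
reversed=$(printf 'first\nsecond\nthird\n' | sed -n '1!G;h;$p')
echo "$reversed"
```

Because -n suppresses the automatic print, nothing is written until the last line has been read; at that point the pattern space holds third, second, first, separated by newlines.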
Complex example

In all filenames in the current directory, replace sequences of spaces with an underscore _.

    $ ls
    Abstract Colors Twirls.jpg  Forest Landscape.jpeg  White owl.jpg
    Clown Fish.JPG              hüpfer.jpeg            Wild ducks.JPG
    Dark Sunset.jpeg            Quiet Fields.jpg
    $ ls | sed -rn "/ /{h;s/ +/_/g;x;s/.*/mv '&'/;G;s/\n/ /;p}" | sh
    $ ls
    Abstract_Colors_Twirls.jpg  Forest_Landscape.jpeg  White_owl.jpg
    Clown_Fish.JPG              hüpfer.jpeg            Wild_ducks.JPG
    Dark_Sunset.jpeg            Quiet_Fields.jpg

Using the hold space is required here, since we need to maintain a copy of the original filename.

Note for Apple-Users: Your version of sed uses -E instead of -r for extended REs. Also, it does not understand \w, \W, \s, nor \S. See the manual if in doubt.

10 Shell Scripting

10.1 Script files

• Sooner or later you’ll find yourself typing the same commands over and over again [45].

    $ cd ~/studium/sq_15w
    $ svn up pub
    $ svn up foobar

• Of course, this can be automated.
  • Any sequence of commands can be put in a script file.
  • Running the script executes the commands therein.
  • You can pass arguments to the script, just like to any other command.
  • The shell offers control structures like conditionals (if/then) or loops (for, while).
  • You can assign to, and use variables.

[45] This only works if you use a directory structure where your working copy of the SQ public repository is in ~/studium/sq_15w/pub, and the working copy of your group’s private repository is in ~/studium/sq_15w/foobar.

• A script starts with the shebang #! on the first line, directly followed by the path to the interpreter.
• The rest of the script contains the commands to execute.
  Although not required, it is good practice to end each command with a semicolon ;
• You may also put multiple commands on the same line, separating them with semicolons.
• Save this in a file called squp:

    #!/bin/bash
    echo 'Updating the SVN repositories for sq_15w';

    cd ~/studium/sq_15w;    # Adapt this as appropriate
    svn up pub;             # update the public repository
    svn up foobar;          # update my group's stuff

• An unquoted hash # introduces an eol-style comment, i.e., a comment which extends to the end of the line.
• To run the script, you have to make it executable first:

    $ chmod 700 squp        # chmod 700? cf. page 157

• Now you can run the script.

    $ ./squp                # ./: cf. page 166
    Updating the SVN repositories for sq_15w
    Updating 'pub':
    At revision 149.
    Updating 'foobar':
    At revision 149.

• Remark: Although you used cd in the script, you’ll still be in the directory where you launched squp.
  • The running squp process changed its working directory, and then terminated. This has nothing to do with your interactive shell!
  • You cannot write shell scripts that “just bring you to another directory”. (cf. bash(1), Functions.)

10.2 Permissions

So what was this “make executable”-stuff?
• Maybe you have tried ls -l before and after running chmod:

    $ ls -l squp
    -rw------- 1 pop09951 domain_users 234 Apr  7 18:30 squp
    $ chmod 700 squp
    $ ls -l squp
    -rwx------ 1 pop09951 domain_users 234 Apr  7 18:30 squp

• Remember the output of ls -l (cf. page 22)?

(Figure: annotated ls -l line of the form <file mode> 1 pop09951 domain_users 675 Apr 3 18:30 squp.)

• Every file or directory belongs to one owner and one group.
• A group is a set of users.
groups
    List which groups you are a member of.

chgrp group file...
    Make files belong to group.

(Figure annotations: the file mode reads - rwx --- ---, i.e., the file type followed by the permissions for owner, group, and others; the columns after the link count name the owner and the group.)

• The mode of the file, aka. access rights, specifies who may read (r), write (w), or execute (x) the file.

Understanding permissions

             file               directory
    read     read contents      list contents
    write    change contents    add/rm/mv files
    exec     run program        access contents

• If a file contains a runnable program, it must have the x-bit set to be executed. It is possible to have an executable program that you cannot read. But not so for shell scripts!
• The difference between read and exec for directories is rather subtle. You probably want to set both, or none.

There is no concept like “fallback to lesser permissions”:
• If you own a file, the user bits alone determine your access rights.
• If you are not the owner but a member of the file’s group, then exactly the group bits are significant.
• If you are neither owner nor member, then the other’s rights apply.

The mode of a file

• In fact, there are twelve flags in a file’s mode.
• Most often, you’ll only deal with the lower nine (3 user classes × 3 permission flags).

A set of individual switches can be modeled as a binary value:

    flag                    bin             dec     oct
    suid                    100000000000    2048    4000
    sgid                    010000000000    1024    2000
    sticky                  001000000000     512    1000
    owning user    read     000100000000     256     400
    owning user    write    000010000000     128     200
    owning user    exec     000001000000      64     100
    group members  read     000000100000      32      40
    group members  write    000000010000      16      20
    group members  exec     000000001000       8      10
    others         read     000000000100       4       4
    others         write    000000000010       2       2
    others         exec     000000000001       1       1

The mode of a file is the sum of the flags set.
    rwxr-x--- = 400 + 200 + 100 + 40 + 10 = 750    (octal!)

[page 161] 10.2 Permissions

Changing the mode

    chmod mode file...
        For the given files, change the mode to mode. You need to own the file to change its mode.

• The coolest & fastest way certainly is to specify the mode in octal!

    $ chmod 750 squp
    $ ls -l squp
    -rwxr-x--- 1 pop09951 domain_users 234 Apr  7 18:30 squp

• There’s also a symbolic notation for changing modes:

    $ chmod u=rwx,g=rx,o= squp
    $ ls -l squp
    -rwxr-x--- 1 pop09951 domain_users 234 Apr  7 18:30 squp

  • Note that there are no spaces in the mode specification! (Why?)
  • Symbolic mode notation is a surprisingly expressive language. Read the manual...
• For an exhaustive description, cf. info 'File permissions'.

[page 162] 10.2 Permissions

umask: where the mode comes from

When you create a new file or directory, probably only the w-bits are missing for group and others. Why’s that?

    $ touch foo        # assuming a new file
    $ ls -l foo
    -rw-r--r-- 1 sk users 0 Apr  3  2012 foo
    $ umask
    0022
    $ umask 077
    $ umask
    0077
    $ touch bar        # assuming a new file
    $ ls -l bar
    -rw------- 1 sk users 0 Apr  3  2012 bar

• The umask says which bits must not be set (i.e., masked) when creating. (cf. umask(2))
• By default, the umask is 022 = ----w--w-.

    umask [mask]
        Builtin to set the umask to mask, or print current value if omitted.

• To set a more secure default mask, edit the file ~/.profile, and add a line like this: (cf. bash(1) about ~/.profile)

    umask 077          # safer default mode

[page 163] 10.3 Environment Variables

    variable=value
        Set variable to string value. No spaces are allowed around the equal sign. Spaces in the value must be quoted.
        (For now, all values are strings.)
    ${variable}
        Expand the variable. If unambiguous, this may be abbreviated as $variable. And there are other forms, cf. page 172.

• Variable names must not start with a digit, and are composed of a, ..., z, A, ..., Z, 0, ..., 9, and _ (Some shells may not allow lower case characters.)
• Variables are expanded when unquoted, or within double quotes.

    $ foo=bar
    $ echo $foo
    bar
    $ echo "$foo"
    bar
    $ echo \$foo
    $foo
    $ echo '$foo'
    $foo
    $ echo "\$foo"
    $foo

[page 164] 10.3 Environment Variables

Special variables

• Every running program has a set of variables, called the environment.
• Some variables are predefined in (almost) every environment, e.g.:

    $HOME     Your home directory.
    $SHELL    What shell you are running.
    $USER     Your user name.
    $PATH     cf. page 166

    $ echo "$USER lives at $HOME, and uses $SHELL"
    pop09951 lives at /home/pop09951, and uses /bin/bash

• The special variables $1, $2, ... refer to the arguments you passed on the command line. Use ${n} for n > 9.

Prompting the user

    read [-p prompt] var...
        This shell builtin reads variables from stdin, optionally showing a prompt.

    $ read -p 'Who are you? ' name
    Who are you? I am Bob
    $ echo "Hello ${name}!"
    Hello I am Bob!

[page 165] 10.3 Environment Variables

An improved version of the squp script:

    #!/bin/bash

    # Config section
    dir="${HOME}/studium";

    # you need to pass the lecture name as argument!
    echo "Updating the SVN repositories for ${1}";

    cd "${dir}/${1}";      # Adapt this as appropriate
    svn up pub;            # update the public repository
    svn up foobar;         # update my group's stuff

gives

    $ ./squp sq_15w
    Updating the SVN repositories for sq_15w
    At revision 149.
    At revision 149.
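
The quoting rules and the positional parameters from this section can be tried without touching squp. The following is only a sketch: the variable names (lecture, unquoted, double, single, escaped, first, count) are made up for illustration, and set -- is a standard shell builtin that is not covered on these slides. It is used here only to fake command-line arguments inside the current shell.

```shell
# Illustration only: the names below are hypothetical, not part of squp.
lecture='sq_15w'

unquoted=$(echo $lecture)        # unquoted: expanded
double=$(echo "$lecture")        # double quotes: expanded
single=$(echo '$lecture')        # single quotes: NOT expanded
escaped=$(echo "\$lecture")      # escaped $: NOT expanded

# 'set --' replaces $1, $2, ... of the current shell, so positional
# parameters can be tried without writing a separate script file.
set -- sq_15w 'second arg'
first="${1}"
count=$#

echo "unquoted=$unquoted double=$double single=$single escaped=$escaped"
echo "first=$first count=$count"
```

Note that 'second arg' counts as one argument because of its quotes. The same holds when you call a script like squp with a quoted argument that contains a space.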
[page 166] 10.4 The right $PATH

• We still need to prefix our shell script with ./, but why?
• When you type in a command, the shell checks whether it knows what to do. This is the case, e.g., for aliases (cf. page 98) or builtin commands.
• If not, it searches the path for a program to run:
  • The variable $PATH contains a colon-separated list of directories.
  • Only these are searched (in order), for a matching file.

    $ pwd
    /home/pop09951
    $ ls squp
    squp
    $ echo $PATH
    /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
    $ squp
    squp: command not found

• You can override the search, by specifying a path to the program.

    $ /home/pop09951/squp    # an absolute path
    $ ./squp                 # a relative path also works.

[page 167] 10.4 The right $PATH

A home for your scripts

• Create a directory where you store your frequently used scripts:

    $ mkdir ~/scripts
    $ mv squp ~/scripts/

• Add the new directory to $PATH, be sure to keep the existing path:

    $ echo $PATH
    /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
    $ PATH+=":$HOME/scripts"     # you may add this to your ~/.bashrc
    $ echo $PATH
    /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/home/likewise-open/ADS/pop09951/scripts

  Note that += appends to the variable’s value.

• Enjoy!

    $ type squp
    squp is /home/pop09951/scripts/squp
    $ squp sq_15w                # no './' anymore!
    Updating the SVN repositories for sq_15w
    At revision 153.
    At revision 153.

[page 168] 10.4 The right $PATH

Caveats

• The order in $PATH is significant. If you put ~/scripts in the beginning, you may shadow existing programs!

    $ ls -l ~/scripts/ls
    -rwx------ 1 sk users 31 Sep 29 14:24 /home/sk/scripts/ls
    $ cat ~/scripts/ls
    #!/bin/bash
    echo 'hello world'
    $ echo $PATH
    /home/pop09951/scripts:/usr/local/sbin:/usr/local# ...
    $ ls
    hello world              # probably not what you want

• This is a useful feature if you want to shadow an installed but buggy tool with a bleeding-edge version. (You cannot shadow builtins, though.)
• If you do a lot of testing with self-made programs, you may want to run executables in your current working directory more easily. ⇒ Append . to the end of $PATH.

Now, why is this “the end” bit so very, very important?

[page 169] 10.4 The right $PATH

    $ ls -l /tmp
    ...
    -rwxr-xr-x 1 evil users 29 Jun 18 10:31 /tmp/ls
    ...
    $ cat /tmp/ls
    #!/bin/bash
    cd ~
    rm -rf * .*

• Never add directories to $PATH that are writable by untrusted users.
• Be careful when adding relative paths to the $PATH. They may refer to publicly writable directories.

[page 170] 10.5 More on expansion

• Expansion is the process of replacing parts of your input, performed by the shell.
• There are seven ways how your input can get expanded by the shell:
  1. Brace expansion (cf. page 176),
  2. tilde expansion (which gives a home directory, cf. page 23),
  3. variable expansion (replaces variable with its value, cf. page 163),
  4. arithmetic expansion (i.e., evaluation of arithmetic expressions),
  5. command substitution (cf. page 173),
  6. word splitting (cf. page 19), and
  7. globbing (aka. pathname expansion, cf. page 89).
• They are performed in the given order. In most cases, this is exactly what you’d expect. (More precisely, according to bash(1), steps 2 to 5 are carried out together in a single left-to-right pass.) ⇒ cf. bash(1), Expansion.
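
Two consequences of this ordering can be observed directly. The sketch below assumes the default $IFS (blank, tab, newline). Its variable names (files, unquoted_words, quoted_words, t, tilde_result) are invented for this illustration. It again uses set --, a standard builtin that does not appear on these slides, merely to count the resulting words.

```shell
# Word splitting comes after variable expansion:
files='a.txt b.txt'
set -- $files            # unquoted: the expanded value is split into words
unquoted_words=$#
set -- "$files"          # double-quoted: the value stays a single word
quoted_words=$#

# Tilde expansion is not applied to the result of a variable expansion,
# so a ~ that comes out of a variable stays a literal ~.
t='~'
tilde_result=$(echo $t)

echo "unquoted: $unquoted_words word(s), quoted: $quoted_words word(s)"
echo "tilde from a variable: $tilde_result"
```

This is one reason why the scripts in this chapter write "${dir}/${1}" in double quotes: a directory name containing a blank would otherwise be split into several arguments.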
[page 171] 10.5 More on expansion

Variables, again

• The squp script fails subtly if you do not pass an argument:

    $ squp
    Updating the SVN repositories for      # something's missing here
    Skipped 'pub'                          # svn is being run on the wrong dir
    Skipped 'foobar'

• Unknown variables expand to the empty string! This is bad! (Why?)

    $ rm -rf *${suffix}        # Idea: remove all files with a certain suffix

• The shell option -u makes bash complain instead:

    $ echo "bla $qux bla"
    bla  bla
    $ set -u                   # fail on unset variable
    $ echo "bla $qux bla"
    -bash: qux: unbound variable

• Improve squp as follows:

    #!/bin/bash
    set -u;                    # add this line

• Good advice: Better fail catastrophically than subtly! ⇒ Maybe make set -u permanent (cf. page 91) for interactive use?

[page 172] 10.5 More on expansion

Forms of variable expansion

In the bash manual, this is called parameter expansion.

• The following forms expand to the value of $name if it is set and not empty. Otherwise,
  • ${name:-default}    expands to default, and
  • ${name:?message}    prints an error message and ends the script.

    $ echo ${foo:?"what's foo, dude?"}
    -bash: foo: what's foo, dude?
    $ foo=42
    $ echo ${foo:?"what's foo, dude?"}
    42

• If the colon : is omitted, these forms accept the empty string, but still fail/default if the variable is unset.
• Other forms allow extraction of substrings, determining the length of a string, etc. (cf. bash(1), Parameter Expansion)

[page 173] 10.5 More on expansion

Command substitution

    $(command)
        This is replaced with the output of command on stdout.
        You might as well use backticks (i.e., ASCII 0x60) as in `command`, but this behaves less gracefully when nesting and quoting. (cf.
        bash(1), Command Substitution)

• Like variables, this works in unquoted, and weak-quoted contexts:

    $ echo The date is $(date)
    The date is Thu Jun 20 08:27:37 CEST 2013

• Useful as a quick way to make a backup:

    $ cp foo "foo.$(date -u +"%Y%m%d-%H%M%S")"    # see date(1) for the time format
    $ ls -l foo*
    -rw------- 1 sk users 0 Apr 24 12:38 foo
    -rw------- 1 sk users 0 Apr 24 12:41 foo.20130424-104101

  Note the nesting: The first " inside $(·) starts quoting wrt. the inner command.

[page 174] 10.5 More on expansion

Typical use case: Temporary data

Frequently, one creates intermediate data that needs to be stored in a file. (e.g., the HTML document in this week’s exercise)

• Put intermediate results in a temporary place. ⇒ Store below /tmp.
• Avoid conflicts with other users, or when running multiple instances of the same script. ⇒ Invent a unique name. (For security: It should be hard to guess!)

    mktemp [-t] [-d] template
        Create a temporary file, and print its name to stdout.

• The name is calculated from the template by replacing the last sequence of Xs (min. three) with random characters.
• With -t, create in a temporary location, usually /tmp.
• With -d, create a directory instead of a file.

    $ file="$(mktemp -t foo-XXXXXX)"
    $ ls $file
    /tmp/foo-BdJLPE
    $ date >| "$file"          # file already exists
    $ cat "$file"
    Mon Jun  4 17:27:59 CET 2013
    $ rm "$file"               # important!

[page 175] 10.5 More on expansion

Other typical use cases for command substitution are:

    dirname path
        Strip last component from a path.
    basename file [suffix]
        Strip directory (and suffix) from filename.
    readlink symlink
        Print target of symlink.

• These are all part of the GNU Coreutils, cf. info coreutils.
• Individual man pages are also available, as usual: basename(1), dirname(1), and readlink(1).

    $ file=/lecture/slides.tex
    $ echo "$(basename ${file} .tex).pdf"
    slides.pdf

[page 176] 10.5 More on expansion

Brace expansion

    prefix{string1,string2[,...]}suffix
        is replaced with a list of strings of the form prefix stringN suffix, iterating over the strings.

    $ echo x{foo,bar,qux}y
    xfooy xbary xquxy
    $ echo 'a '{b,'c d'} e
    a b a c d e

• Note that spaces in a brace expression must be quoted!

    prefix{first..last[..increment]}suffix
        is replaced with a list of strings of the form prefix num suffix, iterating over the range.

    $ echo x{7..12}y
    x7y x8y x9y x10y x11y x12y
    $ echo x{07..120..50}y
    x007y x057y x107y

• A leading 0 in a range generates equal width numbers.

Example: Useful when producing LaTeX documents.

    $ rm *.{aux,log,pdf}       # what does this do?

Note that brace expansion takes place before globbing is expanded!

[page 177] 10.6 What’s in a command?

Definition of shell commands

• A simple command is a blank-separated sequence of words, optionally followed by redirections (e.g., <file, 2>&1, etc.).
  • The first word indicates the command to run (e.g., a builtin, a program, ...).
  • The remaining words are passed as arguments to the callee.
• A pipeline is a |-separated sequence of one or more commands.
  • So each command in a pipeline can have its own redirections.
  • The commands may just as well be compound commands, see below.
• A list is a sequence of one or more pipelines, separated by one of:

    ;     Sequentially runs one pipeline after the other. Note: A sequence of newlines is equivalent to a ;
    &     Runs the former pipeline in the background. (cf. page 186)
    &&    and: Run latter pipeline, iff the former was successful,
    ||    or: Run latter pipeline, iff the former has failed. (cf.
          page 179)

• Various compound commands exist (next slide), a full list is in bash(1), Compound Commands.

[page 178] 10.6 What’s in a command?

Compound commands

    { list; }
        A group command, the list is simply executed.

        $ { echo 'listing of /etc'; ls /etc/; cat footer.txt; } > output

    if list; then list; [elif list; then list;]... [else list;] fi
        Conditional: The if-lists are tried in sequence, until one of them succeeds. Then the corresponding then-list is run. An optional else-list is run iff all the if-lists fail.

        $ if cd path/to/dir; then ls; else echo 'no such thing'; fi

    for variable in word...; do list; done
        Iteration: Execute the list once for each of the words, binding it to variable.

        $ for x in Joe William Jack Averell; do echo $x Dalton; done

    while/until list1; do list2; done
        Loop: While/until execution of list1 is successful, repeat execution of list2.

        $ while read -p 'name:' x; do echo "hello $x"; done    # cf. help read

[page 179] 10.6 What’s in a command?

What is success?

• When a command terminates, it returns an exit code.
  • An exit code of zero indicates success, while
  • a non-zero exit code indicates failure, and why.
  • The most recent exit code is stored in the variable $?
• There are some primitive builtin commands:

    true
        always succeeds, i.e., returns 0.
    false
        always fails, i.e., returns non-0.
    test expr
        evaluates a conditional expression, and returns result.
        • test "string": string is not empty.
        • test -d path: path is a directory.
        • test -f path: path is a regular file.
        • ...many more, cf. help test for a list.

• List separators && and || have the same precedence, and associate to the left!

    $ test -d "$d" && cd "$d" || echo 'Not a directory, man!'
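
Because && and || simply chain from left to right, a list of the form a && b || c is not quite an if/then/else: c also runs when a succeeds but b fails. The sketch below makes this visible. The variable names (rc_true, rc_false, root_is_dir, pitfall) are made up for this illustration, and the false || ... idiom is used only so that the snippet also survives under set -e, an option not discussed on these slides.

```shell
# $? holds the exit code of the most recent command.
true
rc_true=$?

# Inside the right-hand side of ||, $? is still the exit code of 'false'.
false || rc_false=$?

# test behaves like any other command: 0 on success, non-zero on failure.
test -d / && root_is_dir=yes || root_is_dir=no

# Pitfall: here 'true' succeeds and 'false' fails, so the echo after ||
# runs although the first command was successful.
pitfall=$(true && false || echo 'c ran although a succeeded')

echo "true=$rc_true false=$rc_false root_is_dir=$root_is_dir"
echo "pitfall: $pitfall"
```

If the third command must run only when the first one fails, the if ... then ... else ... fi form from the previous slide is the safer choice.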
[page 180] 10.6 What’s in a command?

squp final version

    #!/bin/bash

    dir="${HOME}/studium";                # config here
    lect="${1:?lecture name unknown}";    # check argument

    if cd "${dir}/${lect}" 2>/dev/null; then                # hide error message
        echo "Updating the SVN repositories for ${lect}";
        for i in *; do                                      # update all directories
            test -d "$i/.svn" && svn up "$i";
        done;
    else
        echo "Cannot change to ${dir}/${lect}!" >&2;        # cool: stderr
    fi;

gives

    $ squp schwurbel
    Cannot change to /home/sk/studium/schwurbel!
    $ squp sq_15w
    Updating the SVN repositories for sq_15w
    At revision 269.
    At revision 269.

11 Processes & Job Control

[page 182] 11.1 Programs and Processes

• A running program is called a process.
• Launching a program roughly means:
  1. Copy the (passive) program code into memory.
  2. Tell the CPU to execute the program’s instructions.
  These tasks are organised by the operating system kernel.

    ps [-A | -u user[,...] | -C command[,...]]
        Report processes associated [46] with current terminal. The filtering may be changed in various ways, e.g., to show all (-A), by users or by commands. cf. ps(1)
    top
        An interactive live display of running processes. cf. top(1)

• A process is identified by its process ID (PID), as listed by ps or top.

    $ ps
      PID TTY          TIME CMD
     4730 pts/5    00:00:00 bash
     4916 pts/5    00:00:00 ps

[46] This “association” is rather intricate!

[page 183] 11.1 Programs and Processes

Multitasking

• On a typical system, many processes are running in parallel.
• The OS’s scheduler decides which process is running when, on which CPU, and for how long.
• Frequent task switching gives the illusion of parallel execution, even on single-core machines.
• Most programs idle most of the time, e.g., are waiting for input.

    $ ps -o cmd,etime,time -C emacs,firefox     # cf. ps(1)
    CMD                ELAPSED     TIME
    emacs --daemon  8-23:29:51 00:08:23         # my text editor
    firefox         2-23:52:52 00:40:03         # my web browser

  • All my editing (using emacs) during the last 9 days has required only 8 minutes of CPU time.
  • The firefox is way more CPU-hungry.

[page 184] 11.2 Signals

• Signals are one means of inter-process communication: Processes can send signals to other processes. (You’ll see this again in the OS lecture)
• A receiving process may react to a signal, mostly they do so by dying.

    kill [-signal] pid...
        Shell builtin to send a signal to the process(es) identified by pids. There’s also a separate /bin/kill, cf. kill(1).

Available signals (an excerpt only, cf. signal(7) for a complete list)

      SIGTERM (15)   Ask the process to terminate. The default sent by kill.
      SIGINT  (2)    Interrupt signal from keyboard (e.g., pressed C-c).
    * SIGKILL (9)    Tell the OS to end the process. No prisoners taken.
    * SIGSTOP (19)   Tell the OS to stop the process.
      SIGCONT (18)   Tell the OS to continue the process, if it was stopped.

    * not even delivered to the process, but handled by the OS alone ⇒ No chance to react!

[page 185] 11.2 Signals

Have fun, kill some processes =)

1. Open two terminals, run nano in one of them.
2. In the other terminal:

    $ ps -u $USER              # show all your processes
      PID TTY          TIME CMD
     3297 ?        00:00:00 sshd
     3298 pts/1    00:00:00 bash
     3466 pts/3    00:00:00 bash
     3518 pts/3    00:00:00 nano      # the victim
     3525 pts/1    00:00:00 ps
    $ kill 3518                # send SIGTERM to the nano process

3. Observe how nano terminates in the first terminal. Probably with a message like “Received SIGHUP or SIGTERM”.
   • If you have edited nano’s buffer, the message will be something like “Buffer written to nano.save”. So nano can catch the signal and react.
   • If you want to avoid this, send SIGKILL instead:

    $ kill -9 3518             # send SIGKILL to the nano process

     You may see a message “Killed”, which is written by the shell, not nano.

[page 186] 11.3 Background jobs

• A running pipeline is called a job.
• The shell assigns a job ID to every job under its control.

    jobs [-l]
        List jobs controlled by this shell. -l gives more detail.

• So far, we have only seen foreground jobs, blocking the shell until their termination.
• You may launch a job in the background by appending an ampersand & (cf. page 177).
• When launching a background job, the shell prints a line of the form “[jobid] pid” where pid is the PID of the last process in the pipeline.

    $ sleep 30 | wc &
    [1] 29380
    $                          # the new prompt shows up immediately!

You could run anything now, but we try the jobs command...

    $ jobs
    [1]+  Running                 sleep 30 | wc &

Check out the process listing as well

    $ ps
      PID TTY          TIME CMD
    29362 pts/15   00:00:00 bash
    29379 pts/15   00:00:00 sleep
    29380 pts/15   00:00:00 wc
    29383 pts/15   00:00:00 ps

[page 187] 11.3 Background jobs

• stdout and stderr of the job simply show up somewhere on the terminal.
  • This may be confusing.
  • Some interactive tools (e.g., nano) redraw the screen when you hit C-l.
• When a background job terminates, the shell is informed, and it prints a message just before showing the next prompt.

    kill [-signal] %jobid...
        Send signal to all processes of job jobid.

• You can send a signal to all processes in a pipeline, via their job ID prefixed with %.
The termination message shows up at the next prompt:

    $                          # hit return to get new prompt
    [1]+  Terminated              sleep 30 | wc
    $

Killing a job via its job ID:

    $ sleep 30 | wc &
    [1] 31870
    $ kill %1                  # the % indicates a job ID
    [1]+  Terminated              sleep 30 | wc
    $ jobs
    $

Note: Killing jobs must be done with the shell’s builtin kill. Why?

[page 188] 11.4 Foreground jobs

• Hit C-c to interrupt a foreground job.
  • The signal SIGINT is sent to all processes in the pipeline.
  • Usually, the job should terminate.
• Hitting C-z stops a foreground job, and returns control to the shell.
  • The signal SIGTSTP (not SIGSTOP) is sent to all processes in the pipeline.
  • The job should become a stopped background job.

Note: Both signals may be caught and handled by a process (try nano(1) for an example), so the program may ignore them, or react in other ways.

Why is that good? You hit C-c, the terminal sends SIGINT to its children. You want the shell’s foreground job to terminate, not the shell itself!

[page 189] 11.5 Resuming jobs

    fg [jobid]
        Place the job with jobid in the foreground, making it the current job. SIGCONT is sent if it was stopped.
    bg [jobid...]
        Resume (SIGCONT) each suspended job jobid in the background, as if it had been started in the background with &.

• Without jobid, both commands use the current job, i.e., the last one stopped, or started in the background. It is flagged with + in jobs’ output.

    $ sleep 90
    ^Z                         # here I have typed C-z
    [1]+  Stopped                 sleep 90
    $ bg                       # resume process in background
    [1]+ sleep 90 &
    $ jobs
    [1]+  Running                 sleep 90 &

Final remarks

• A background process is stopped automatically, if it tries to read stdin.
• Further reading: info bash 'Job Control', and section Job Control in bash(1).
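
The examples above are meant for an interactive shell. Inside a script, a similar effect can be had with plain PIDs. The following is a minimal sketch that relies on three standard shell features not introduced on these slides, so treat them as assumptions: $! (PID of the most recent background command), kill -0 (checks that a process exists without sending a signal), and wait (collects the exit status of a background process). The variable names are invented for the illustration.

```shell
# Start a background job; $! is the PID of the most recent background command.
sleep 30 &
pid=$!

# kill -0 sends no signal, it only checks whether the process exists.
if kill -0 "$pid" 2>/dev/null; then alive_before=yes; else alive_before=no; fi

kill "$pid"                  # default signal: SIGTERM (15)

# wait returns the exit status of the background process. '|| rc=$?' records
# a non-zero status without aborting a script that runs under 'set -e'.
rc=0
wait "$pid" || rc=$?

echo "alive before kill: $alive_before, exit status after kill: $rc"
```

A status above 128 conventionally means "terminated by a signal": the shell reports 128 plus the signal number, so SIGTERM (15) shows up as 143.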
12 Course evaluation (Lehrevaluation)

• A volunteer student collects the questionnaires and delivers them to the in-house mail room (Poststelle, A 531).
  • They will be centrally processed, “anonymised” [47], and aggregated.
  • The results are forwarded to the lecturer (i.e., me) and to the dean of studies (Studiendekan).
• When answering, do not refer to other questions. We cannot follow the link, since all correlation between the answers is lost!

[47] We will see your handwriting, though.

13 Secure Shell

Meeting Alice, Bob, and Eve.

[page 192] 13.1 What is Secure Shell?

A client/server protocol [48] providing strong cryptography for remote shell sessions.

• Execute commands (e.g., bash) on a remote machine.
• Copy files, port forwarding, display forwarding, and other cool stuff...
• Public key authentication...
  • ...against man-in-the-middle attacks.
  • ...for login without password.
• SSH intends to replace insecure products, like telnet(1), rsh(1), etc.

Requirements

• You’ll need access credentials (e.g., username/password) for the remote machine,
• an SSH server must be running there, and
• an SSH client must be available on your local machine.

[48] A free implementation is OpenSSH: http://www.openssh.org

[page 193] 13.1 What is Secure Shell?

Typical ssh usage

    ssh [user@]host [command]
        Run command as user on a remote host.

• Without user, your local username is used.
• Without command, the default shell is launched.
• The standard streams (stdin, stdout, stderr) of the local ssh process are redirected to the remote process.
Demo

    sk@phobos90:~$ ssh titan07           # start a remote shell
    Last login: Mon Sep 16 14:51:43 2013 from p57a2fefe.dip0.t-ipconnect.de
    pop09951@titan07 ~ $ ls -l           # this happens on titan07
    total 196
    -rw------- 1 pop09951 domain_users 198268 Jun 26 22:01 bigFile.txt
    pop09951@titan07 ~ $ logout          # end remote session
    Shared connection to titan07.inf.uni-konstanz.de closed.
    sk@phobos90:~$

• This won’t work out of the box! The following slides will show you how to get there...

[page 194] 13.2 Cryptography Buzzword Overview

• In literature, the legitimate partners in a communication are often referred to as Alice, Bob, and Charlie.
• The bad guys are often referred to as Eve, or Mallory.
• An encrypted plaintext or message is referred to as ciphertext.
• The algorithms used are generally assumed to be known to the public. (Otherwise: Security by obscurity, a concept considered flawed by crypto experts)
• A key is a piece of data used by the algorithm to encrypt a message, or to decrypt a ciphertext. Most often, a key must be kept secret.

Notation: In the following, I’ll use symbol ⊕ (⊖) to represent encryption (decryption) in a very handwaving way:

    message ⊕ key = ciphertext
    ciphertext ⊖ key = message

If you’re interested in the Real Magic behind cryptography, digest any book on Number Theory. For an easy read about the history of cryptography, and some ideas behind it, try: Simon Singh. The Code Book. 4th Estate Limited, 1999. ISBN 1-85702-879-1.

[page 195] 13.2 Cryptography Buzzword Overview

Symmetric Encryption

Goal: Alice sends a message m to Bob, that nobody else can read.

• Alice and Bob both have knowledge of a key k, which must be kept secret from everyone else: A shared secret.

    Alice                 transfer      Bob
    knows k                             knows k
    writes m
    c := m ⊕ k            → c
                                        c ⊖ k = m

Notes

• The one-time pad encryption scheme uses each key only once. It uses purely random keys, and provably provides perfect secrecy.
• How can Alice and Bob share a secret key?
  • Personal meeting,
  • trusted alternative channel (e.g., snail mail, courier),
  • ...?

[page 196] 13.2 Cryptography Buzzword Overview

The problem of key distribution

• Efficiently and securely exchanging a shared secret over an unsecure channel was deemed impossible before 1976.
• Then, the Diffie-Hellman key exchange algorithm was published [49].

Imagine this

1. You meet with Alice and Bob. They do not know each other, and they have never met before.
2. They start talking. You listen carefully.
3. After a few minutes, they share a secret number. And you have no chance to know what it is.

• It is non-trivial mathematics, but there’s a nice explanation with an analogy in colours [50].

[49] Whitfield Diffie, Martin Hellman. New directions in cryptography. IEEE Transactions on Information Theory 22 (6): 644–654, 1976. https://ee.stanford.edu/~hellman/publications/24.pdf
[50] https://en.wikipedia.org/wiki/Diffie_hellman

[page 197] 13.2 Cryptography Buzzword Overview

Public Key Encryption

Goal: Different approach to solve the key distribution problem [51].

• Bob creates a key pair (b_pub, b_sec).
  • The public key b_pub is visible to the public (Alice, Eve, ...), while
  • the secret key b_sec is known only to Bob, not even to Alice!
• Encryption is done with b_pub, decryption is feasible only with b_sec.

    Alice                 transfer               Bob
                                                 generates (b_pub, b_sec)
                          → request pub. key
                          ← b_pub
    writes m
    c := m ⊕ b_pub        → c
                                                 c ⊖ b_sec = m

Notes

• It is infeasible to reconstruct b_sec, or m, from the public knowledge.
• Everyone can send encrypted messages to Bob, by looking up Bob’s public key b_pub in a phonebook. All senders use the same key!

[51] Diffie-Hellman has the drawback to require Alice and Bob to talk to each other before they can share a secret. One would like to avoid this.

[page 198] 13.2 Cryptography Buzzword Overview

Message Digest

A cryptographic hash function calculates a hash value of a message.

    hash :: message → [0, constant] ⊂ ℕ
            m ↦ h

The following properties are required for a cryptographic hash function:

• h is within a (large!) known range, i.e., we know how much memory it will take, not depending on the length of m.
• hash has an efficient implementation (fast, little memory used).
• Given only h, it is infeasible to find a message m so that hash m = h.
  • Thus, one cannot generate messages m1 ≠ m2 so that hash m1 = hash m2,
  • nor alter a message without changing the hash value.

Notes

• The MD5 algorithm (128 bit, v ∈ [0, 2^128]) is assumed to be insecure.
• SHA-n (up to 512 bit) refers to a group of algorithms, some of them developed by the NSA. Design choices are not always published.

[page 199] 13.2 Cryptography Buzzword Overview

Using hash functions

• Checking file integrity.
  • When you make a backup, store a checksum for each file.
  • When restoring, recalculate checksums to test whether the backup was damaged.
• Identification of data.
  • Checksum of contents is robust against renaming of files.
  • Identifies data without revealing the data (tricky).
  • Is the remote 6TB file the same as the local one? (e.g., rsync)
• Encrypted storage of passwords.
  • If Eve steals the password database, she still needs to find passwords that create the stored hash values (tricky).
• Non-cryptographic hash functions (smaller image, collisions more likely) are used in some data structures, like hash tables.
  They are important, e.g., in database systems.

Tools: Some of these tools should be installed: Look for the man pages md5sum(1), sha512sum(1), and openssl(1).

[page 200] 13.2 Cryptography Buzzword Overview

Signing a message

Goal: Alice signs a message m, so that everyone can verify her authorship.

• Alice creates a key pair (a_pub, a_sec), as before (cf. page 197).
• The signature is hash m encrypted with Alice’s secret key a_sec.

    Alice                         transfer               Bob
    generates (a_sec, a_pub)
    writes m
    s := hash m ⊕ a_sec           → (m, s)               reads m
                                  ← request pub. key
                                  → a_pub
                                                         s ⊖ a_pub ≟ hash m

Notes

• It is infeasible to construct (m', s') so that s' ⊖ a_pub = hash m'.
• Everyone can verify messages from Alice, by looking up a_pub in a phonebook. Only Alice can sign messages with a_sec.

[page 201] 13.2 Cryptography Buzzword Overview

Man-in-the-middle attack: The problem of authentication

• Eve may hijack a connection (e.g., by setting up a faked WiFi hotspot). Alice thinks she’s talking to Bob, and vice versa, but both are talking only to Eve!

    Alice                transfer             Eve                      transfer             Bob
    writes m                                  gen (b'_pub, b'_sec)                          gen (b_pub, b_sec)
                         → request pub. key                            → request pub. key
                         ← b'_pub                                      ← b_pub
    c := m ⊕ b'_pub      → c                  c ⊖ b'_sec = m
                                              c' := m' ⊕ b_pub         → c'                 c' ⊖ b_sec = m'

Notes

• All previously shown schemes are prone to this attack!
• Only signing a message is no solution: Bob needs to have a valid public key of Alice! ⇒ The same problem again.

[page 202] 13.2 Cryptography Buzzword Overview

Solution 1: Certificates

• Bob gets his public key + name signed by a trustworthy third party, say Charly.

    (b_pub, "Bob") ↦ (b_pub, "Bob", s)

  This is called a certificate.
  • Charly verifies that Bob is Bob before issuing the certificate.
  • Since Eve cannot prove being Bob, she won't get (b'_pub, "Bob") signed.
▶ When Alice receives a public key for Bob, she needs to verify that
  • the key actually mentions Bob as the owner, and
  • there is a valid signature from Charly.

Problems
▶ How does Alice get Charly's public key safely?
▶ Why trust Charly, in the first place?

[page 202]

Public Key Infrastructure

How does Alice get Charly's key safely?

▶ Certificates from trusted centralized Certification Authorities (CAs) are pre-installed with your operating system.
▶ Their business model is to earn money by issuing certificates. More certificates sold ⇒ more money earned.
▶ A compromised CA allows man-in-the-middle attacks.
  • 2001: VeriSign issued invalid certificates [52] for Microsoft software updates.
  • 2011: DigiNotar issued invalid certificates [53].
  • ... (many more in recent history)

[52] https://www.cert.org/advisories/CA-2001-04.html
[53] http://arstechnica.com/security/2011/08/earlier-this-year-an-iranian/

[page 203]

Web of Trust

Why trust Charly (or the CA)?

▶ At a key signing party, you sign the keys of everyone you trust... (Do not sign the key of that unknown girl, claiming she's Bob!)
▶ ...and you have everyone who knows you sign your key.
▶ The more signatures a certificate has from trustworthy people, the more trustworthy it becomes.

[page 204]

Solution 2: Verification over alternative channel

▶ Distribute the key over a separate channel, which must be trustworthy.
▶ The more different channels are used to verify a key, the better.
Examples
▶ The owner publishes his keys for email contact on an SSL-protected website, hosted by a trustworthy third party.
▶ Alice calls Bob on the phone (recognising his voice!) to verify the key.
▶ Your bank sends you a new TAN block via snail mail.

Fingerprints  Keys are too long to read them aloud on the phone.
▶ A key fingerprint is a hash value of a key.
▶ It is infeasible to construct a key with the same fingerprint.
▶ It is sufficient to verify the fingerprint of a key.

[page 205]

13.3 Your first ssh login

▶ The ssh(1) command establishes an SSH connection, cf. page 193.
▶ Each SSH server has a host-specific key pair: The host key.
  • These are the files /etc/ssh/ssh_host_*.
  • There are two files for each algorithm (e.g., RSA, DSA, ECDSA, ...).
▶ The fingerprint of the host's public key should be verified by the user.

    sk@phobos90:~$ ssh [email protected]   # on my machine
    The authenticity of host 'titan07.inf.uni-konstanz.de (134.34.224.26)' can't be established.
    ECDSA key fingerprint is SHA256:Ya1Jft69XxwE8ZO8vuid4ArcltKUV6mGGz0/NjlXXfg.
    Are you sure you want to continue connecting (yes/no)?

  • ECDSA is the asymmetric algorithm used to secure this connection.
  • SHA256 is the hash function used to verify the public key.

⇒ Find the corresponding line in the file [54] published over a secure alternative channel (web browser, HTTPS-secured connection):

    256 SHA256:Ya1Jft69XxwE8ZO8vuid4ArcltKUV6mGGz0/NjlXXfg root@titan07 (ECDSA)

[54] https://svn.uni-konstanz.de/dbis/sq_15w/pub/titan07-fingerprints

[page 206]

▶ Having verified the public key's fingerprint, you can be sure that there is no man in the middle. Type yes:

    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'titan07.inf.uni-konstanz.de' (ECDSA) to the list of known hosts.

  The server's public key will be remembered in ~/.ssh/known_hosts.
▶ Now it is safe to enter your password: it will be encrypted for transfer, and it is guaranteed to be sent to titan07.

    Password:   # RZ-password that came with your mail account
    Last login: Thu Jan 24 13:57:37 2013 from verne.inf.uni-konstanz.de
    pop09951@titan07 ~ $   # now I'm working on titan07

[page 207]

Your second ssh login

▶ The next time you connect to titan07, its public key is recognised.

    sk@phobos90:~$ ssh [email protected]
    Password:   # type your password
    Last login: Fri Jan 10 14:49:42 2014 from phobos90.inf.uni-konstanz.de
    pop09951@titan07 ~ $

▶ It may be cumbersome to write the username and fully qualified hostname again and again.
  • You can define per-host defaults in ~/.ssh/config

        Host titan07
        HostName titan07.inf.uni-konstanz.de
        User pop09951   # your user name

  • Then it's enough to give the unqualified hostname:

        sk@phobos90:~$ ssh titan07
        Password:   # type your password
        Last login: Fri Nov 1 16:24:21 2013 # ...
        pop09951@titan07 ~ $

(cf. ssh_config(5) for more.)

[page 208]

Your n-th ssh login

▶ Sometimes the host key of the server changes.

    sk@phobos90:~$ ssh titan07
    @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    @    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
    @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    # ...
    The fingerprint for the ECDSA key sent by the remote host is
    SHA256:Ya1Jft69XxwE8ZO8vuid4ArcltKUV6mGGz0/NjlXXfg.
    Please contact your system administrator.
    Add correct host key in /home/sk/.ssh/known_hosts to get rid of this message.
    Offending ECDSA key in /home/sk/.ssh/known_hosts:7   # so here is the invalid key
    # ...
    Host key verification failed.
    sk@phobos90:~$

▶ Verify that this happens for a valid reason, maybe reinstallation of the server. Check via an alternative channel, e.g., call the admin,
▶ then remove line 7 from ~/.ssh/known_hosts,
▶ and log in again, verifying the new fingerprint as on page 206.

Question  How to delete line 7 from a file?

[page 209]

    sk@phobos90:~$ sed -i 7d /home/sk/.ssh/known_hosts
    sk@phobos90:~$ ssh titan07
    The authenticity of host 'titan07.inf.uni-konstanz.de (134.34.224.27)' can't be established.
    ECDSA key fingerprint is SHA256:Ya1Jft69XxwE8ZO8vuid4ArcltKUV6mGGz0/NjlXXfg.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'titan07.inf.uni-konstanz.de,134.34.224.27' (ECDSA) to the list of known hosts.
    Password:   # ...*sigh*...
    Welcome to Linux Mint 17.1 Rebecca (GNU/Linux 3.13.0-37-generic x86_64)
    # ...
    pop09951@titan07 ~ $

[page 210]

13.4 Channel multiplexing

▶ ssh can multiplex multiple connections over one secure channel.
▶ You only need to authenticate when the channel is established.
  • This makes subsequent connections much faster.
  • You do not need to type your password again and again.
▶ Put the following lines at the top of your ~/.ssh/config:

    ControlMaster auto
    ControlPath ~/.cache/ssh-%C
    ControlPersist 180

  Make sure that the directory ~/.cache exists. ssh will create files matching ssh-* there. An alternative would be to use /tmp instead. (cf. ssh_config(5))
▶ Now you can have multiple terminals connected to a remote host, but your password is required only once.
▶ After closing the last session, the connection persists for 3 minutes (180s).
  Do not use longer times, this binds resources on the server!
▶ Read ssh_config(5) for more information.

[page 211]

13.5 Applications of ssh

▶ Without further arguments, ssh gives you a remote shell.
  • The shell is the default command to run, if nothing else is specified.
▶ You may instead specify the command to be run remotely:

    sk@phobos90:~$ ssh titan07 ls -l
    total 8
    drwx------ 2 pop09951 domain_users 4096 Dec  9 12:15 scripts
    drwx------ 4 pop09951 domain_users 4096 Dec  9 10:50 studium
    sk@phobos90:~$   # Note: local host

▶ You may pass compound commands to be run in the remote shell:

    sk@phobos90:~$ ssh titan07 'ls -la | wc -l'   # wc and ls are run remotely
    29
    sk@phobos90:~$   # Note: local host

  Note  ls and wc are run by a remote shell, which manages the pipeline.

[page 212]

Stream redirection

▶ ssh forwards the standard streams stdin, stdout, and stderr.

    sk@phobos90:~$ ssh titan07 ls -la | wc -l
    29
    sk@phobos90:~$

  Note  The output of ls is piped into a local wc process.
▶ This also works for stdin:

    sk@phobos90:~$ date | ssh titan07 'cat >foo'
    sk@phobos90:~$   # exercise: how can we test this worked?

  Magic  ssh does not get the password from stdin!

Question  What would this do, assuming all used commands existed:

    sk@phobos90:~$ genData | ssh titan07 analyze > result

[page 213]

13.6 Public key authentication

▶ You still have to enter your password to establish a connection. The shared connection times out quite fast (we have set this to 180s).
▶ You have to memorize different passwords for different hosts.

Solution  Public key authentication

1. Generate your own pair of keys. Keep the private key secret!
2. Append the public key to ~/.ssh/authorized_keys on each machine you want to log in to, e.g., titan07.
3. When establishing a connection,
   • the SSH server on titan07 will generate a challenge using your public key, and send it to you.
   • You use your private key to calculate a valid response, and send it back, thus proving your authenticity.

Everyone with your private key can log in to the server! ⇒ Protect it with a password (better: passphrase).

[page 214]

Step 1: Generate a key pair

    ssh-keygen [-t type] [-f file]
        Generate a key pair of the desired type. The secret key goes to file, the public key goes to file.pub. Default is to create an RSA key in ~/.ssh/id_rsa(.pub).

    ssh-keygen -l [-E hash] [-f keyfile]
        Show the fingerprint of the given keyfile, calculated with the given hash function.

    sk@phobos90:~$ ssh-keygen
    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/sk/.ssh/id_rsa):   # just press return
    Enter passphrase (empty for no passphrase):   # use a strong passphrase...
    Enter same passphrase again:   # ...type it again
    Your identification has been saved in /home/sk/.ssh/id_rsa.
    Your public key has been saved in /home/sk/.ssh/id_rsa.pub.
    The key fingerprint is:
    SHA256:2D4A8nPPoIv4QKY3Ym1O8Zr156BJvF+uNPkdyz0IO1E sk@phobos90
    # ...

Keep your private key secret! For an in-depth description, cf. ssh-keygen(1).

[page 215]

Step 2: Install the public key on the target machine

▶ On titan07, create a directory ~/.ssh, and set its permissions to 700.
▶ Append the public key to the file ~/.ssh/authorized_keys.
▶ You may append more keys to allow login from other machines as well.

    pop09951@titan07 ~ $ mkdir -p ~/.ssh
    pop09951@titan07 ~ $ chmod 700 ~/.ssh   # ssh refuses to work otherwise
    pop09951@titan07 ~ $ cat >>~/.ssh/authorized_keys <<.
    > # copy the public key here
    > .

Alternatively, one could also do this with one pretty cool pipeline:

    sk@phobos90:~$ ssh titan07 'mkdir -p ~/.ssh; chmod 700 ~/.ssh; cat >>~/.ssh/authorized_keys' <~/.ssh/id_rsa.pub

If there is no shared connection available (cf. page 211), ssh needs to authenticate you first. In that case, it will ask you for the passphrase to unlock the private key. This is useless, because the public key is not yet installed on titan07:

    Enter passphrase for key '/home/sk/.ssh/id_rsa':   # useless, press return
    Password:   # type your password

[page 216]

Step 3: Log in using public key authentication

▶ If a private key is locally available, then ssh will always try to use that for authentication, and ask you for the passphrase to unlock it.

    sk@phobos90:~$ ssh titan07
    Enter passphrase for key '/home/sk/.ssh/id_rsa':   # type the passphrase
    Last login: Fri Jan 10 13:32:16 2014 from phobos90.inf.uni-konstanz.de
    pop09951@titan07 ~ $   # enjoy

▶ If that fails, it will fall back to asking for your password. We have just seen this at the bottom of the previous slide.
▶ You can connect to any host that has your public key.

Problem  You always have to type in your passphrase. One might use the empty passphrase, storing the private key unencrypted. If Eve gains access to the private key (maybe by stealing your laptop), she can impersonate you on the respective target machines. ⇒ Bad idea.

[page 217]

13.7 The SSH Agent

A safer place to store an unencrypted private key is in volatile system memory, aka. RAM.

▶ The ssh-agent(1) is a local background process that can hold your decrypted private keys, and provide authentication for your ssh client.
  • It is typically started when you log in, and terminates when you log out.
▶ You can add decrypted private keys to the agent using ssh-add(1).
▶ All your ssh clients can ask the ssh-agent to perform the authentication.

[page 218]

Step 4: Running the SSH agent

    ssh-agent [-c] [command]
        Launches the agent. If a command is given, it is executed with information on how to find the agent. Without command, the required information is printed to stdout. This should be evaluated by the caller: eval "$(ssh-agent)".

▶ The ssh client tries to contact a running agent through a socket, whose path is expected in the environment variable $SSH_AUTH_SOCK.
▶ On a modern desktop Linux, an SSH agent is probably running.

    $ ps -opid,user,cmd -C ssh-agent
      PID USER     CMD
     6164 sk       ssh-agent
    $ ls -l $SSH_AUTH_SOCK   # verify the permissions
    srw------- 1 sk users 0 Jan 15 16:59 /tmp/ssh-JpGflXpdrx0Y/agent.6554

▶ Otherwise, you may launch a new ssh-agent(1) by hand. Read ssh-agent(1), and how to use it on your distro!

[page 219]

Step 5: Passing private keys to the SSH agent

    ssh-add [-l] [-D] [-t timeout] [file]
        Decrypt the private key from file, and add it to the agent. With -t, drop the key after timeout seconds. With -l, list the keys known to the agent; with -D, delete all keys from the agent.

▶ If no file is specified, ssh-add tries to unlock all keys it finds under ~/.ssh with the same passphrase.
▶ When not run from a terminal, or when stdin comes from /dev/null, ssh-add tries to run a graphical interface to ask the user for the passphrase, cf. ssh-askpass(1).
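A script that relies on the agent may want to check its state first, without triggering a passphrase prompt. The following is a minimal sketch, assuming the exit codes documented in ssh-add(1) (0: keys were listed, 1: the agent holds no identities, 2: the agent could not be contacted). The function name agent_state is made up for this illustration.

```shell
#!/bin/sh
# Sketch: report the state of the SSH agent without asking for a passphrase.
# agent_state is a hypothetical helper name, not part of OpenSSH.
agent_state() {
    if ! command -v ssh-add >/dev/null 2>&1; then
        echo "ssh-add not installed"
        return
    fi
    # -l only lists identities, it never prompts; we just want its exit code.
    ssh-add -l >/dev/null 2>&1
    case $? in
        0) echo "agent holds at least one key" ;;
        1) echo "agent is running but holds no keys" ;;
        *) echo "no agent reachable" ;;
    esac
}

agent_state
```

If the answer is "no agent reachable", starting one with eval "$(ssh-agent)" as described in Step 4 would be the next thing to try.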
    sk@phobos90:~$ ssh-add
    Enter passphrase for /home/sk/.ssh/id_rsa:   # type the passphrase
    Identity added: /home/sk/.ssh/id_rsa (/home/sk/.ssh/id_rsa)
    sk@phobos90:~$ ssh-add -l   # list known identities
    2048 b0:0f:fc:84:88:25:b8:52:da:93:9c:94:70:a6:fb:cb /home/sk/.ssh/id_rsa (RSA)

For more information, cf. ssh-add(1).

[page 220]

Step 6: Use it

You can log in on any host that has your public key, without password:

    sk@phobos90:~$ ssh titan07
    Last login: Fri Jan 10 14:12:35 2014 from phobos90.inf.uni-konstanz.de
    titan07:~$

Summary
▶ On each machine you want to log in from:
  • Make sure the ssh-agent(1) is running when your session starts, and that $SSH_AUTH_SOCK is set properly.
  • Create a key pair with ssh-keygen(1).
  • Add the private key to the agent with ssh-add(1).
▶ On each machine you want to log in to:
  • Create a file ~/.ssh/authorized_keys, and
  • append the public keys of all machines you want to log in from.

Some servers may accept only certain algorithms, maybe ECDSA but not RSA. In that case, you'll need a key pair for that particular algorithm. It is completely valid to have key pairs for different algorithms. See ssh-keygen(1), option -t.

[page 221]

14 Other tools

http://xkcd.com/1168/

14.1 File archives

▶ Sometimes, one wants to archive a bunch of files or directories into a single file. E.g., distribution of software packages.
▶ The tape archiver tar was intended to archive files onto tape.
  • Appeared in Version 7 Unix, 1979. Many tar implementations followed.
  • There are other archivers, e.g. ar(1), cpio(1), ...
▶ A tar archive is often referred to as a tarball, and usually has a .tar filename suffix.

Note
▶ By default, tar applies no compression (like all pure archivers).
▶ We can choose any compressor, independent of the archiver.
Good example of Unix Philosophy:
  • Small is beautiful (i.e., write small programs).
  • Make each program do one thing well.
  • Build programs that cooperate.

[page 223]

Create an archive

    tar -c [-f archive] [-v] [-C dir] (file|dir)...
        Create archive, containing files and directories, and their metadata.

▶ Without -f archive, use stdout to write an archive.
▶ Option -v makes tar verbose, i.e., list the files it processes.
▶ With -C, change to another directory, and do extraction/archiving there.
▶ It is bad style to create archives that contain multiple top-level entries.
  • Create an archive of a single top-level directory, say foo.
  • Name the archive foo.tar, i.e., like the top-level directory.

    $ ls -lh
    drwx------ 2 pop09951 domain_users 4.0K Jan 15 11:39 scripts
    drwx------ 4 pop09951 domain_users 4.0K Jan 12 17:44 studium
    $ tar -cf studium.tar studium   # mnemonic: create file
    $ ls -lh
    drwx------ 2 pop09951 domain_users 4.0K Jan 15 11:39 scripts
    drwx------ 4 pop09951 domain_users 4.0K Jan 12 17:44 studium
    -rw------- 1 pop09951 domain_users 160M Jan 20 10:45 studium.tar
    $ du -sh studium
    162M    studium

[page 224]

Inspect an archive

    tar -t [-f archive] [-v]
        List archive contents to stdout.

▶ It is a good idea to inspect the contents of an unknown archive:
  • Unless -k is given, tar may overwrite existing files.
  • An archive with many top-level entries will litter your working directory.
  • You do not know which files are generated by unpacking!
▶ Option -v makes tar give a listing in long format, like ls -l.
    -rw------- 1 pop09951 domain_users 160M Jan 12 09:57 studium.tar
    $ tar -tf studium.tar   # mnemonic: type file
    studium/
    studium/sq_15w/
    studium/sq_15w/pub/
    studium/sq_15w/pub/lecture08.pdf
    studium/sq_15w/pub/lecture04.pdf
    studium/sq_15w/pub/lecture07.pdf
    studium/sq_15w/pub/putty.zip
    # ...

[page 225]

Extract an archive

    tar -x [-f archive] [-k] [-v] [-C dir]
        Extract contents from archive, or from stdin. With -k, do not overwrite existing files.

▶ You can extract into your working directory...

    -rw------- 1 pop09951 domain_users 160M Jan 20 09:57 studium.tar
    $ tar -xf studium.tar   # mnemonic: x-tract file
    $ ls -lh
    total 160M
    drwx------ 4 pop09951 domain_users 4.0K Jan 12 17:44 studium
    -rw------- 1 pop09951 domain_users 160M Jan 20 09:57 studium.tar
    $ du -sh studium
    162M    studium

▶ ...or somewhere else:

    $ mkdir container   # make new directory
    $ tar -C container -xf studium.tar   # extract below container

  • The tar program opens studium.tar in the current working directory...
  • ...but it changes to directory container before extracting.

[page 226]

The tar pipe

A nice way to copy a whole directory tree:

    $ ls -l ~/studium/
    total 8
    drwx------ 4 pop09951 domain_users 4096 Dec  1 18:48 sq_15w
    $ mkdir /tmp/demo
    $ tar -C ~/studium/ -c sq_15w | tar -C /tmp/demo/ -x

▶ Without -f, the archive is read/written via the standard streams.
▶ The reading tar (on the left)
  • changes to directory ~/studium,
  • reads the directory sq_15w there,
  • and writes the archive to stdout,
▶ while the writing tar (on the right)
  • changes to directory /tmp/demo,
  • reads the archive from stdin,
  • and reconstructs the directory tree.

You could also achieve something similar with cp -r, so why bother?
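The tar pipe can be tried safely in a scratch directory. The following is a minimal sketch with made-up file names (notes.txt, and a temporary directory created by mktemp). It passes -f - explicitly, because the default archive depends on the tar implementation and may be a tape device on some systems.

```shell
#!/bin/sh
# Sketch: copy a directory tree with a tar pipe, inside a throwaway directory.
set -eu

work=$(mktemp -d)                       # scratch area, remove it by hand later
mkdir -p "$work/src/sq_15w/pub" "$work/dst"
echo 'hello' >"$work/src/sq_15w/pub/notes.txt"

# Left tar:  change to src, write an archive of sq_15w to stdout.
# Right tar: change to dst, read the archive from stdin and unpack it.
tar -C "$work/src" -c -f - sq_15w | tar -C "$work/dst" -x -f -

# diff -r prints nothing and succeeds if both trees have the same contents.
diff -r "$work/src/sq_15w" "$work/dst/sq_15w" && echo "trees are identical"
```

Replacing the right-hand tar with ssh host tar -C dir -x -f - turns this into a copy across the network.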
[page 227]

Even more fun with tar

Of course, this works across the network via ssh:

    $ tar -C ~/studium/ -c sq_15w | ssh titan07 tar -C studium/ -x

▶ The local tar process archives,
▶ while the remote tar process extracts.
▶ This way, you may even benefit from tar's other options, e.g., more control over metadata, file selection, etc.

[page 228]

14.2 Compressing files

Data compression: The art of storing the same data using less space.

Compression algorithms can be grouped in two classes:
▶ Lossless: The original data can be reproduced exactly.
▶ Lossy: Non-essential data is removed (irrevocably) to save space.
  • Often used for audio, still and motion pictures ⇒ lower quality.

For compression of file archives, we're interested in lossless schemes.
▶ Examples of famous lossless compression algorithms:
  • Huffman encoding (1952) uses a variable number of bits per character, depending on its frequency.
  • Lempel–Ziv (1977) replaces repeated occurrences of data by a single copy.
  • Burrows-Wheeler (1994) rearranges data into sequences of similar data.
▶ Current tools use combinations and variations of these, and other algorithms.

[page 229]

The most popular compressors under Linux are probably gzip(1), bzip2(1), and xz(1).
▶ They vary in (de)compression speed, and compression ratio.
▶ They share a very similar command line interface.
▶ None of them provides archiving functionality. ⇒ Suitable for singleton files only.
▶ Some implementations of tar can even use these tools to directly (de)compress the tar stream, cf. tar(1).

Others
▶ The traditional compress(1) was disliked for patent issues, and is now surpassed by the above.
▶ There is also zip(1), which can do its own archiving (so does not depend on tar). This may not be available on Unix systems.
▶ There are many more...

[page 230]

gzip and gunzip

    gzip [-k] [-n] [file...]
        Compress each file into file.gz, and remove the original. Compression level n (min. 1, max. 9) defaults to 6.

    gunzip [-k] [file.gz...]
        Decompress each file.gz into file, and remove the original.

▶ Typical Unix compressors come in pairs, one for compression, one for decompression.
  • bzip2(1), and bunzip2(1).
  • xz(1), and unxz(1).
▶ With option -k, keep the original files. Not available in all versions!
▶ Without files, (de)compress stdin to stdout. ⇒ They work great in a pipeline.

(See the manuals for many more options)

[page 231]

The other compressors

▶ gzip(1), bzip2(1), and xz(1) share a very similar CLI, so you may use them as drop-in replacements, using different filename suffixes...
▶ Typically, compressed files have a suffix added to their name.
  • This identifies the compression to expect in the file.
▶ A typical compressed tarball looks like this: archive_name-version.tar.gz
▶ Some operating systems do not like stacked suffixes! ⇒ The following short forms are accepted as well:

    renaming        general     alternative (for n = m.tar)
    uncompressed    n           m.tar
    gzip            n.gz        m.tgz
    bzip2           n.bz2       m.tbz2, m.tbz
    xz              n.xz        m.txz

Question  Why not simply choose the best compressor?

[page 232]

A rough comparison of the compression tools

▶ The tool time(1) can measure the resource usage of a process.

    $ alias time="/usr/bin/time -f 'took %Us, used %MkB'"   # cf. time(1)

  • Make sure not to use bash's builtin command time.
  • %U prints the CPU time used by the process, %M the required memory.
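Before timing anything, it may be worth checking that each compressor really round-trips losslessly when used as a filter. The following is a minimal sketch, assuming generated sample text and a temporary directory. The file names are made up, and tools that are not installed are skipped.

```shell
#!/bin/sh
# Sketch: lossless round trip through each compressor, using only stdin/stdout.
set -eu

work=$(mktemp -d)

# Generate some repetitive plain text, which should compress well.
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done >"$work/sample.txt"

for tool in gzip bzip2 xz; do
    command -v "$tool" >/dev/null 2>&1 || { echo "$tool: not installed, skipped"; continue; }
    # -c writes to stdout, so the original file is left untouched.
    "$tool" -c <"$work/sample.txt" >"$work/sample.$tool"
    # -d decompresses; cmp compares stdin ("-") with the original, byte by byte.
    "$tool" -d -c <"$work/sample.$tool" | cmp -s - "$work/sample.txt"
    echo "$tool: $(wc -c <"$work/sample.$tool") bytes, round trip ok"
done
```

The script aborts (because of set -e) as soon as a decompressed stream differs from the original, so every "round trip ok" line reflects a successful comparison.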
▶ Measure compression times, and RAM usage.

    $ time gzip <sq_15w.tar >sq_15w.tar.gz
    took 2.12s, used 3552kB
    $ time bzip2 <sq_15w.tar >sq_15w.tar.bz2
    took 12.55s, used 27840kB
    $ time xz <sq_15w.tar >sq_15w.tar.xz
    took 21.38s, used 385168kB

▶ Compare compression ratios.

    $ ls -hl sq_15w*
    -rw------- 1 pop09951 domain_users 40M Jan 17 23:38 sq_15w.tar
    -rw------- 1 pop09951 domain_users 37M Jan 18 15:30 sq_15w.tar.bz2
    -rw------- 1 pop09951 domain_users 38M Jan 18 15:29 sq_15w.tar.gz
    -rw------- 1 pop09951 domain_users 29M Jan 18 15:30 sq_15w.tar.xz

[page 233]

▶ Measure decompression times, and RAM usage.

    $ time gunzip <sq_15w.tar.gz >/dev/null
    took 0.33s, used 4768kB
    $ time bunzip2 <sq_15w.tar.bz2 >/dev/null
    took 4.46s, used 16224kB
    $ time unxz <sq_15w.tar.xz >/dev/null
    took 1.35s, used 36608kB

▶ Summary

    tool     compression        decompression      ratio
    gzip      2.12s     4MB     0.33s      5MB     95.0%
    bzip2    12.55s    28MB     4.46s     16MB     92.5%
    xz       21.38s   385MB     1.35s     37MB     72.5%

Note
▶ This survey is incomplete, inaccurate, and biased! Why?

[page 234]

▶ Different kinds of data allow for different compression ratios:
  • Plain text can be compressed very well, whereas
  • compressed data (movies, pictures) may even grow in size.
▶ For a proper survey, different sizes and kinds of data, and use cases must be considered.

Example  The sources for these lecture slides...

▶ ...including all the embedded images...

    -r-------- 1 sk users 2.6M Jan 18 16:35 lect.tar
    -rw------- 1 sk users 1.9M              lect.tar.bz2   # 1.61s, 7348kB
    -rw------- 1 sk users 1.9M              lect.tar.gz    # 0.18s, 788kB
    -rw------- 1 sk users 1.8M              lect.tar.xz    # 1.33s, 42256kB

▶ ...and the plaintext sources alone.
    -r-------- 1 sk users 350K Jan 18 16:36 lect-noimg.tar
    -rw------- 1 sk users  75K              lect-noimg.tar.bz2   # 0.06s, 4248kB
    -rw------- 1 sk users  91K              lect-noimg.tar.gz    # 0.02s, 792kB
    -rw------- 1 sk users  77K              lect-noimg.tar.xz    # 0.25s, 21356kB

[page 235]

14.3 The rsync tool

▶ Recall the tar pipeline? (cf. page 228)

    $ tar -C ~/studium/ -c sq_15w | ssh titan07 tar -C studium/ -x

▶ What if most files already exist at the destination,
▶ or only some of them need an update?
▶ How to resume the tar pipeline if it was interrupted? (Not possible)

Solution
▶ rsync(1) provides an optimised algorithm that only transfers the missing/updated files to the destination.
▶ Works locally, or via any transparent remote shell (e.g., ssh).
  • Modern rsync defaults to using ssh.

Note  rsync must be installed on the remote machine as well.

[page 236]

    rsync [-a] [-v] source... dest
        Copy sources to destination. With -a, use archive mode. With -v, be more verbose.

▶ Either source, or dest may indicate a remote location:

    path
        Usual local path, if there is no colon : before the first slash /.
    [[user@]host:]path
        Remote location (a relative path is relative to the remote user's $HOME).

▶ A trailing slash / on the source directory will copy its contents, instead of the directory itself.
▶ In archive mode (-a), rsync recurses into directories, and tries to keep symlinks (not hard links), permissions, file ownership, timestamps, etc.
▶ You can easily use rsync to build incremental backups, see the --link-dest option.

cf. rsync(1), as usual

[page 237]

Example

▶ For the initial copy, all data has to be transferred:

    $ rsync -av lect titan07:   # cool: SSH figures out host and user name!
    sending incremental file list
    lect/
    lect/README
    lect/advert-qm.tex
    # ... list of all files being transferred...

    sent 9,027,975 bytes  received 6,995 bytes  286,824.44 bytes/sec
    total size is 9,004,586  speedup is 1.00

▶ After some editing and recompilation of the slides:

    $ rsync -av lect titan07:
    sending incremental file list
    lect/
    lect/other-tools.tex
    lect/slides.pdf
    # ... fewer files being transferred...

    sent 72,233 bytes  received 3,430 bytes  50,442.00 bytes/sec
    total size is 9,008,764  speedup is 119.06

[page 238]

14.4 Finding files

    find [option...] [path...] [expression]
        Search the paths (or current directory) for files satisfying the expression.

    locate [option...] pattern...
        Find files matching any of the patterns.

▶ find(1) crawls the paths in the file system.
  • For each file encountered, the expression is evaluated.
▶ locate(1) uses a database covering the whole system instead.
  • This is much faster!
  • The database must be updated regularly by the admin, cf. updatedb(8).
  • It will not find files added since the last update.
  • Modern versions will only list files you have permission to see.

Comparison  find is much slower than locate, but provides more up-to-date information, and offers more control using an expressive syntax. ⇒ We focus on find.

[page 239]

Simple expressions

    -name pat
        The file's name matches the globbing pattern (cf. page 89).
    -type t
        File is of the following type:
          f   regular file,
          d   directory,
          l   symbolic link.
    -user name
        File is owned by user name.
    -size n
        File size is n. A prefix of + finds greater files, - finds smaller files.
        Note: n is in 512B blocks, unless a suffix (k = 2^10 B, M = 2^20 B, G = 2^30 B) is used.

Example

    $ find studium -size +10k
    studium/inf3_w13/pub/pk_assignment02/texts/mark-twain.txt
    studium/inf3_w13/pub/pk_lecture01.pdf
    studium/inf3_w13/pub/.svn/pristine/f3/f379b24c7192a6f8da9f2f3acf779# ...
    studium/inf3_w13/pub/.svn/pristine/bf/bf55e06ef0379b33e8aaa59ccd99e# ...

[page 240]

Combining expressions

▶ Expressions are evaluated following operator precedence, from left to right.

Operators, listed in order of decreasing precedence:

    \( expr \)
        For grouping of expressions.
    \! expr
        "Not", i.e., the outcome of expr is inverted.
    expr1 expr2
        "And", expr2 is evaluated only if expr1 is true.
    expr1 -o expr2
        "Or", expr2 is not evaluated if expr1 is true.

Example

    $ find studium/sq_w13 -size +10k \! -name '*pdf'
    studium/sq_w13/pub/putty.zip
    studium/sq_w13/pub/winscp.zip
    studium/sq_w13/pub/.svn/wc.db
    studium/sq_w13/pub/.svn/pristine/c1/c1ed62f0c0f7740aecb9dc88228# ...
    studium/sq_w13/pub/.svn/pristine/0f/0f4bb1c42aa86f0168a212cccc6# ...
    studium/sq_w13/pub/.svn/pristine/ac/ace54b9a541863e9d48a3e4f84f# ...

[page 241]

Actions

▶ Actions are expressions that perform an operation on a file.

    -prune
        If file is a directory, do not descend into it (but list it). Evaluates to true.
    -print
        Print the file's name, and a newline. Evaluates to true.
        Note: Filenames might contain newlines! cf. -print0
    -delete
        Remove the file. Evaluates to success of deletion.

Example

    $ find studium/sq_w13 -name .svn -prune -type f -o -size +10k \! -name \*pdf
    studium/sq_w13/pub/putty.zip
    studium/sq_w13/pub/winscp.zip

[page 242]

Running commands

▶ Some actions may run commands.

    -exec command [arg...] \;
        Run command with the given arguments. Evaluates to success of command. The string {} in an argument is replaced with the filename.

Note  There are security risks when operating on unknown file systems!
▶ See the info manual [55] for a discussion.
▶ Consider using -execdir instead of -exec, which runs in the file's directory.
▶ Make sure $PATH contains no relative directories.

[55] $ info 'finding files' 'Security Considerations'

[page 243]

15 Text Encoding

http://xkcd.com/927/

15.1 What is plain text?

Various definitions exist, and the notion of plain text is changing in the course of history...

    Plain text is a linear sequence of characters, independent of font, style, layout, coloring, ...
    (myself, 2013)

Some weak points about this definition:
▶ What are characters?
▶ Are paragraphs layout?
▶ Is the banana plain text?

[Figure: ASCII-art banana, signed "hh", from a posting by Shimrod to the newsgroup alt.ascii-art, dated Mon, 25 Aug 1997 16:53:13 +0200]

[page 245]

15.2 The old days: 7-bit ASCII

▶ Memory is physically made up of bits.
▶ It is more efficient to handle chunks (aka. bytes) of bits. Let's say 7.
  This is a tradeoff between cost of memory/transmission, and expressiveness!
▶ 7 bits can represent 2^7 = 128 values.
▶ Associate a character with each of these values.
⇒ American Standard Code for Information Interchange: ASCII [56]
  There's a man page as well: ascii(7)

• The first 32 bytes (00..1F) represent various control characters (new line, carriage return, ...). Byte 7F (i.e., the last one) is Delete.
• The remaining 95 bytes represent the space, digits 0..9, characters a..z, A..Z, and basic punctuation.

[56] Some nice history reading at https://en.wikipedia.org/wiki/ASCII

     0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
00   NUL  SOH  STX  ETX  EOT  ENQ  ACK  BEL  BS   HT   LF   VT   FF   CR   SO   SI
10   DLE  DC1  DC2  DC3  DC4  NAK  SYN  ETB  CAN  EM   SUB  ESC  FS   GS   RS   US
20   SP   !    "    #    $    %    &    '    (    )    *    +    ,    -    .    /
30   0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
40   @    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
50   P    Q    R    S    T    U    V    W    X    Y    Z    [    \    ]    ^    _
60   `    a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
70   p    q    r    s    t    u    v    w    x    y    z    {    |    }    ~    DEL

(SP = space.) Sum up the row and column headers to find the hex code of a byte, e.g., m is at 6D.

Problems with 7-bit ASCII
• No characters for non-American languages: Ł, ä, é, ...
• No support for symbols: ℕ, ⇒, ∂, ≠, ∈, ...
• Inconsistent convention to denote end-of-line (still inconsistent today!):
    LF      Unix/Linux traditionally uses line feed,
    CR      some versions of Mac OS use carriage return,
    CR+LF   Microsoft Windows even needs two bytes to denote EOL.

15.3 Extended ASCII

• Most computers used an 8th bit for parity checking.
• 8 bits can represent 2^8 = 256 different bytes.
  ⇒ Use the 8th bit to distinguish bytes, extending the character set.
• ASCII-compatible extensions only redefine the "upper" bytes 80..FF.
• Unfortunately, various different extensions appeared.
    • The word code page (coined by IBM) refers to different mappings of the byte values to characters.
    • Popular in Western Europe: Code page 28591, aka. ISO Latin-1, aka. ISO 8859-1.
    • Code page 437 contains box-drawing characters.
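To make the code page idea concrete, here is a minimal sketch that interprets the same single byte E4 under two different code pages and prints the resulting character as UTF-8 bytes. It assumes an iconv(1) that knows the encoding names ISO-8859-1 and CP437 (GNU libc's does); the helper name decode_byte is made up for this illustration.

```shell
# decode_byte CODEPAGE: interpret the byte E4 according to CODEPAGE and
# print the resulting character's UTF-8 bytes in hex.
# (decode_byte is a hypothetical helper, not a standard tool.)
decode_byte() {
    printf '\344' |                  # octal escape for the single byte E4
        iconv -f "$1" -t UTF-8 |     # reinterpret it under the given code page
        od -An -tx1 | tr -d ' \n'    # dump as hex, strip od's padding
    echo
}

decode_byte ISO-8859-1   # c3a4, i.e. ä (U+E4)
decode_byte CP437        # cea3, i.e. Σ (U+3A3)
```

The byte is identical in both calls; only the assumed mapping differs, and nothing in the byte itself says which mapping was intended.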
Problems
• Many different, competing, incompatible mappings.
• By looking at a sequence of bytes, it is difficult to determine what character set is being used.
• Working on multilingual text requires handling of different code pages.

15.4 Today: Unicode

• Initially [57] (1988) designed as a 16 bit encoding, allowing for 65 536 different values (aka. code points in Unicode parlance).
• Today, Unicode offers 1 114 112 code points (range U+0..U+10FFFF).
    • Notation: Unicode code point n is referred to as U+n, with n written in hex.
    • Not all of the code points are being used (yet).
    • The codespace is organised in 17 planes of 2^16 characters each.
    • The basic multilingual plane, range U+0..U+FFFF, contains the most important characters for everyday use.
• The Unicode Standard [58] also defines:
    • How to store Unicode text: Text encoding, cf. next slides.
    • Character properties (is it a whitespace, digit, diacritic, ...?)
    • How to handle characters like the German umlauts: As character 'ä' (U+E4), or as composition of basic 'a' (U+61) with diaeresis '¨' (U+308).
    • ...

[57] Joe Becker. Unicode 88. http://www.unicode.org/history/unicode88.pdf
[58] The Unicode Consortium. http://www.unicode.org/

Encoding Unicode Text

How can one serialize a sequence of Unicode code points into a stream of bytes?

• This process is called text encoding.
• There is a variety of encodings, each with its own (dis)advantages.
    • UTF-8, the most commonly used one in the western world.
    • UTF-16, UTF-32, and variations of these.
    • UCS-1, UCS-2, and UCS-4 are outdated names of similar encodings.
    • Punycode (used for domain names with umlauts).
• We will only cover UTF-32 and UTF-8.
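As a rough illustration of how much the encodings differ, the following sketch serializes one and the same code point, λ (U+3BB), with several of the encodings named above. It assumes an iconv(1) that accepts the names UTF-16BE, UTF-32BE and UTF-32LE; encode_lambda is a hypothetical helper name.

```shell
# encode_lambda ENCODING: convert the letter λ (U+3BB), given here as its
# UTF-8 bytes CE BB, to ENCODING and print the resulting bytes in hex.
# (encode_lambda is a hypothetical helper for this illustration.)
encode_lambda() {
    printf '\316\273' | iconv -f UTF-8 -t "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

encode_lambda UTF-8      # cebb      (2 bytes)
encode_lambda UTF-16BE   # 03bb      (2 bytes)
encode_lambda UTF-32BE   # 000003bb  (4 bytes, most significant byte first)
encode_lambda UTF-32LE   # bb030000  (4 bytes, least significant byte first)
```

Naming the byte order explicitly (BE/LE) keeps iconv from having to choose one and announce it with a byte-order mark.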
15.5 The UTF-32 encoding

This fixed-width encoding uses 4 bytes for each Unicode code point.

• Easy to find the nth code point in a text: At byte position 4 · n.
  Unfortunately, this is not necessarily the nth character. Think of combining characters.
• Main usage is for processing (internal to programs), not for storage.
! At least 11 bits unused per code point (Unicode only uses 21). Space blowup, 4-times for ASCII data, 2-times for most other cases.
! Not self-synchronizing, i.e., looking somewhere into the stream of bytes, it is not possible to tell whether this is the start of a code point!
! Byte ordering (aka. endianness [59]) becomes relevant.
  A byte-order mark (BOM) stored at the beginning of a text file is necessary.

glyph            A             λ             ⇒             BOM
ASCII            41
code point       U+41          U+3BB         U+21D2        U+FEFF
UTF-32 BE [60]   00 00 00 41   00 00 03 BB   00 00 21 D2   00 00 FE FF
UTF-32 LE        41 00 00 00   BB 03 00 00   D2 21 00 00   FF FE 00 00

[59] cf. Jonathan Swift. Gulliver's Travels. (Dispute about at which end to open an egg)
[60] BE = big endian, most significant byte first / LE = little endian

15.6 The UTF-8 encoding

This variable-width encoding uses 1..4 bytes for each code point.

• ASCII-compatible: Plain ASCII data looks the same when encoded as UTF-8 (unless a BOM is added).
• Lexicographic sorting yields the same order for strings of code points, and the UTF-8 encoded strings of bytes.
• UTF-8 is likely to be detected correctly, i.e., arbitrary data is unlikely to form correct UTF-8.
• UTF-8 is self-synchronizing.
• Space efficient for most European languages.
• No BOM required. BOM is discouraged in the UTF-8 definition, as its presence destroys some nice properties (think about #!).
! Some software refuses to work correctly without BOM.
! Potential space blowup when encoding mainly Asian text, but still efficient when encoding HTML, which contains a lot of non-Asian code.

So how does UTF-8 work?

• Characters in the ASCII range U+0..U+7F are encoded as themselves.
    • Each valid ASCII text also represents the same text when read as UTF-8.
    • Only the 7 lower bits are required for each of the ASCII characters, so the first bit is always 0.
• Larger code points are represented by multi-byte sequences.
    • All their bytes start with 1.
    • Exactly the leading byte in a sequence starts with 11,
    • while the follow-up bytes all start with 10.
    • The number of 1s at the beginning of the leading byte indicates the length of the byte sequence.

range               byte 1     byte 2     byte 3     byte 4     bits
U+0..U+7F           0xxxxxxx                                    7
U+80..U+7FF         110xxxxx   10xxxxxx                         11
U+800..U+FFFF       1110xxxx   10xxxxxx   10xxxxxx              16
U+10000..U+1FFFFF   11110xxx   10xxxxxx   10xxxxxx   10xxxxxx   21

• It would be possible to extend this scheme to even longer byte sequences, covering an even greater range of code points.

Examples

glyph               L          π                   ⇒
ASCII               4C
code point          U+4C       U+3C0               U+21D2
code point binary   1001100    11 11000000         100001 11010010
UTF-8 binary        01001100   11001111 10000000   11100010 10000111 10010010
UTF-8 bytes         4C         CF 80               E2 87 92

glyph               BOM                          Linear B ideograph (horse)
ASCII
code point          U+FEFF                       U+10083
code point binary   11111110 11111111            1 00000000 10000011
UTF-8 binary        11101111 10111011 10111111   11110000 10010000 10000010 10000011
UTF-8 bytes         EF BB BF                     F0 90 82 83

• An editor expecting Latin-1 will misinterpret the UTF-8 encoded BOM as the three-character sequence "ï»¿".
• The Linear B ideograph (horse) uses four bytes in UTF-8, but the leading byte contains only 0-bits as payload.
  Question: Would 3 bytes be sufficient to encode it in UTF-8?
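The bit layout from the table above can be sketched with plain shell arithmetic. This is only an illustration under stated assumptions, not how any particular library does it: utf8_hex is a hypothetical helper name, it handles code points up to U+10FFFF only, and it does not reject the surrogate range U+D800..U+DFFF, which a real encoder must refuse.

```shell
# utf8_hex CODEPOINT: print the UTF-8 encoding of one code point as hex bytes.
# CODEPOINT may be decimal or hex (e.g. 0x3C0). Hypothetical helper; it does
# not reject surrogates (U+D800..U+DFFF).
utf8_hex() {
    cp=$(($1))
    if [ "$cp" -lt 0 ] || [ "$cp" -gt $((0x10FFFF)) ]; then
        echo "utf8_hex: not a Unicode code point: $1" >&2
        return 1
    elif [ "$cp" -lt $((0x80)) ]; then        # 0xxxxxxx
        printf '%02x\n' "$cp"
    elif [ "$cp" -lt $((0x800)) ]; then       # 110xxxxx 10xxxxxx
        printf '%02x %02x\n' \
            $((0xC0 | (cp >> 6))) \
            $((0x80 | (cp & 0x3F)))
    elif [ "$cp" -lt $((0x10000)) ]; then     # 1110xxxx 10xxxxxx 10xxxxxx
        printf '%02x %02x %02x\n' \
            $((0xE0 | (cp >> 12))) \
            $((0x80 | ((cp >> 6) & 0x3F))) \
            $((0x80 | (cp & 0x3F)))
    else                                      # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        printf '%02x %02x %02x %02x\n' \
            $((0xF0 | (cp >> 18))) \
            $((0x80 | ((cp >> 12) & 0x3F))) \
            $((0x80 | ((cp >> 6) & 0x3F))) \
            $((0x80 | (cp & 0x3F)))
    fi
}

utf8_hex 0x4C      # 4c
utf8_hex 0x3C0     # cf 80
utf8_hex 0x21D2    # e2 87 92
utf8_hex 0xFEFF    # ef bb bf
utf8_hex 0x10083   # f0 90 82 83
```

The five calls reproduce the "UTF-8 bytes" row of the examples. Each branch shifts the code point right in steps of 6 bits and masks with 3F, because every follow-up byte carries exactly 6 payload bits.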
15.7 General problems with Unicode

• Unicode defines 8 different character sequences [61] for end-of-line.
    • Includes LF, CR, and CR+LF.
• Some characters appear twice with different semantics, e.g., the unit [62] of length Å (U+212B), and the Swedish letter Å (U+C5).
• Some characters have different representations. E.g., the German umlaut ä, and most other accented letters:
    • As single composed character ä (U+E4), or
    • decomposed into character a (U+61) and combining diaeresis ¨ (U+308).
• Character handling is difficult (and has security implications).
  ⇒ Use libraries, don't do this yourself.

[61] http://www.unicode.org/standard/reports/tr13/tr13-5.html
[62] angstrom, named after Anders Jonas Ångström: 1 Å = 10^-10 m

• Unicode [63] blurs the line between representing text and typesetting:
    • Numbers in the range 1–20 with various decorations: E.g., encircled (U+246C), in parentheses (U+2477), or with a period.
    • Many ligatures, e.g., Latin Small Ligature Fi (U+FB01).
  ⇒ This makes it difficult, e.g., to search for text.
• Unicode even blurs the line between text and graphics:
    Panda Face U+1F43C. Chipmunk U+1F43F. Pile Of Poo U+1F4A9. High-heeled shoe U+1F460. Money Bag U+1F4B0. Fax machine U+1F4E0. Man in business suit levitating U+1F574. Raised hand with part between middle and ring fingers U+1F596.

Note: Don't like Unicode? See the comic strip on page 244.

[63] All sample renderings taken as SVG from the Unicode website.

15.8 Text encoding in the wild

• Most software will handle text encoding automatically for you.
• You should stick to the EOL-convention native to your system.
• Checklist for data that leaves your box:
    • Encoding?
    • BOM?
    • EOL-convention?
• A good text editor should offer means to handle different encodings, and EOL-conventions.
• Some tools need help in determining where conversion is applicable.
• If a file mentions its own encoding, make sure that's true.
    • LaTeX: \usepackage[utf8]{inputenc}
    • XML: <?xml version="1.0" encoding="UTF-8"?>
    • Python: # -*- coding: utf-8 -*-

File System Trouble

    $ ls -l
    total 33k
    -rw------- 1 sk users  20k Aug 12 12:24 böse
    -rw------- 1 sk users 8.6k Aug 13 11:08 böse

Whassup?
• What is the problem?
• How can we solve it?
• We have actually seen this in 2013, when a Subversion checkout failed on MacOS, but worked well on Linux and Windows machines. It turned out that some student had committed two such files.
• MacOS assumed both names to refer to the same file, telling Subversion that the file it wanted to write already exists...
• Files whose names only differ in case cause the same trouble on MacOS. You can switch this off, making software fail which relies on that bug.

• Failed approach: Cut'n'paste with the mouse in the terminal window does not work for at least one of the files. Question: Why?
• Dirty-hack approach: Try to rename one random file, then use tab-completion to rename the other one.
• Enlightening approach: Figure out which bytes are being used:

    $ ls | file -b --mime -                     # cf. file(1); "-" reads from standard input
    text/plain; charset=utf-8                   # so ls prints its output in UTF-8
    $ ls | hexdump -e '16/1 "%02X " "\n"' -e '16/1 "%2_p " "\n"'    # cf. hexdump(1)
    62 6F CC 88 73 65 0A 62 C3 B6 73 65 0A      # these are the hex values...
     b  o  .  .  s  e  .  b  .  .  s  e  .      # ...of those bytes (. unless printable ASCII)

UTF-8 byte sequence   encodes code point   which is
0A                    U+0A                 line feed
CC 88                 U+308                combining diaeresis
C3 B6                 U+F6                 ö

We can address these files from the shell:

    $ ls -l $'b\xC3\xB6se'
    -rw------- 1 sk users 8.6k Aug 13 11:08 böse
    $ ls -l $'bo\xCC\x88se'
    -rw------- 1 sk users  20k Aug 12 12:24 böse

• When quoting with $'...' one may use C escape sequences.
• Then, \xHH represents the byte with 2-digit hex code HH.
• cf. section ANSI C Quoting in the bash manual.

Subversion

• SVN can adapt files to your OS's native EOL convention.

    $ svn ps svn:eol-style native solution.lhs
    property 'svn:eol-style' set on 'solution.lhs'

    • Always do this for collaboratively edited plain text files,
    • never do this for files in a format with a fixed EOL style.

• You may have to tell Subversion about the MIME type [64], if this is not detected correctly. Also useful for HTTP access.

    $ svn ps svn:mime-type application/pdf lecture.pdf
    property 'svn:mime-type' set on 'lecture.pdf'
    $ svn ps svn:mime-type 'text/plain; charset=us-ascii' file.txt
    property 'svn:mime-type' set on 'file.txt'

• See the Subversion manual on properties [65], and portability of files [66].

[64] http://en.wikipedia.org/wiki/MIME_type
[65] http://svnbook.red-bean.com/en/1.8/svn.advanced.props.html
[66] http://svnbook.red-bean.com/en/1.8/svn.advanced.props.file-portability.html

Shell scripting

• A shell script is identified by the shebang #! character sequence as the first two bytes, cf. page 155.
• This is not compatible with the presence of a BOM.
• Use an encoding (like UTF-8) that does not require a BOM.
  Windows users: Tell your editor not to store a BOM on such files.
• In principle, all plain text formats with a magic number as first bytes suffer this limitation.

LaTeX

• The package inputenc supports quite a bunch of encodings.
• Only few Unicode characters are supported by default.
• See the manual [67] for more information.

    \documentclass{article}
    \usepackage[utf8]{inputenc}    % Note
    \begin{document}
    Böse Überraschung?\\
    ä ö ü, Ä Ö Ü, ß
    \end{document}

Rendered output:

    Böse Überraschung?
    ä ö ü, Ä Ö Ü, ß

[67] http://www.tug.org/texmf-dist/doc/latex/base/inputenc.pdf

The End

http://www.gnu.org/graphics/meditate.html
Image ©2001 Free Software Foundation, Inc. Available under GNU GPL and GNU FDL