Simple and flexible maps of geospatial data with
Transcription
Simple and flexible maps of geospatial data with
C ONTRIBUTED RESEARCH ARTICLE 1 Simple and flexible maps of geospatial data with mapmisc Patrick Brown Abstract The mapmisc package provides functions for visualising geospatial data, including fetching background map layers, producing colour scales and legends, and adding scale bars and orientation arrows to plots. Background maps are returned in the coordinate reference system of the dataset supplied, and inset maps and direction arrows reflect the map projection being plotted. This is a ‘light weight’ package having an emphasis on simplicity and ease of use. Introduction R has extensive facilities for managing, manipulating, and visualising spatial data, with the sp (Pebesma and Bivand, 2005) and raster (Hijmans, 2015) packages providing a set of object classes and core functions which numerous other packages have built on. It is fairly straightforward to import spatial data of a variety of types and from a range of sources including: images for map backgrounds; high-resolution pixel grids of surface elevation; and polygons of administrative region boundaries. Combining data from multiple sources will typically involve reconciling multiple coordinate reference systems, which is easily accomplished by using the rgdal package (Bivand et al., 2015). The RColorBrewer (Neuwirth, 2014) and classInt (Bivand, 2015) packages produce colours and intervals respectively for z-axis colour scales. A Unix-like structure of separate software facilities strung together gives R power and flexibility when used for visualising spatial data. This flexibility comes with the cost that relatively simple maps can often require a lengthy set of instructions which many users find difficult to master. The mapmisc provides a modest number of functions which together allow for maps to be produced by a small number of tidy lines of R code. The aim of mapmisc was originally to provide a simple and coherent set of resources which address some of the criticisms made (in this Author’s experience) by Geographical Information Systems (GIS) users stating that R’s maps are inadequate. The core features of mapmisc are functions for adding the following to any map, regardless of the coordinate reference system being used: • a contextual background map layer; • an inset map giving the position of the mapped area within a larger spatial region; • a scale bar showing distance; • an arrow pointing north, even if north is not ‘up’; and • a z-axis scale and legend using GIS-standard colours and categories. Accommodating different coordinate reference systems is a defining feature of mapmisc, and maps of projected data are no more difficult to produce than maps in longitude-latitude coordinates. A secondary aim of mapmisc is to streamline the writing of map-making code, reducing both the drudgery involved and the potential for coding errors. The requirement here was to have teaching slides in Sweave (Leisch, 2002) or knitr (Xie, 2014) where a code block is not larger than the map it produces. Undergraduate students have used mapmisc for course assignments, and non-statistical (but R-literate) collaborators have used and modified code scripts involving mapmisc. A third goal of mapmisc is to enable and encourage the use of customised Coordinate Reference Systems (CRS’s). Spatial statisticians should not hesitate to rotate data, choose a non-standard origin, or use an axis other than a meridian line if it benefits them. mapmisc can assist in finding the CRS with the smallest bounding box of the dataset in question, or the CRS where Euclidean distance gives the closest approximation to distance on the sphere. Installing required packages The two most important packages required for using spatial data in R are the sp and raster packages, which provided tools and classes for vector data (spatial data on a continuous domain) and raster data (defined on a pixelated grid) respectively. Installing mapmisc with install.packages('mapmisc') or by using a menu item on a GUI will install sp and raster if they are not already present. The RColorBrewer and classInt not installed automatically with mapmisc , but are strongly recommended although not strictly necessary. The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE 2 The rgdal package is not straightforward to install on all systems, so although it is essential for most purposes it is not a requirement and is must be installed separately. Running install.packages('rgdal') will work on most MS Windows systems and many Mac OS computers, most Linux distributions and many Mac’s require the GDAL and PROJ.4 software to be pre-installed separately. Installing the packages libproj-dev and libgdal-dev through a package manager (such as synaptic) is usually sufficient for Linux systems. Mac systems where install.packages('rgdal') fails (currently Mavericks systems) are the most problematic. Installers for GDAL can be found on http://www.kyngchaos.com/software/frameworks the ‘GDAL Complete’ package should be sufficient. Users familiar with the Xcode command line tools may prefer to install via http://www.macports.org or http://brew.sh. Once GDAL has been installed, the R package can be installed with a command similar to the following. install.packages('rgdal', type = "source", configure.args=c('--with-proj-include=/usr/local/include', '--with-proj-lib=/usr/local/lib')) Load the packages with library("rgdal") ## Loading required package: sp ## rgdal: version: 0.9-3, (SVN revision 530) ## Geospatial Data Abstraction Library extensions to R successfully loaded ## Loaded GDAL runtime: GDAL 1.11.2, released 2015/02/10 ## Path to GDAL shared files: /usr/share/gdal/1.11 ## Loaded PROJ.4 runtime: Rel. 4.9.1, 04 March 2015, [PJ_VERSION: 491] ## Path to PROJ.4 shared files: (autodetected) ## Linking to sp version: 1.1-0 library("mapmisc") ## Loading required package: raster ## ## Attaching package: 'raster' ## ## The following objects are masked from 'package:Hmisc': ## ## mask, zoom which also makes sp and raster available. Getting started with spatial data in R The getData function provided by raster is able to download a number of useful and interesting spatial datasets. The coastline and borders of Finland can be fetched with finland = raster::getData("GADM", country = "FIN", level = 0) The object finland is a SpatialPolygonsDataFrame, and Bivand et al. (2013) contains a wealth of information on working with objects of this type. The command plot(finland, axes = TRUE) produces the plot in Figure 1a. The choice of Finland as an example is due to it’s being far from the equator, and a useful contrast on the other side of the world is New Zealand. The coastline of New Zealand, obtained with nz = raster::getData("GADM", country = "NZL", level = 0) includes a number of small outlying islands. The spatial extent of the nz object, extent(nz) ## class ## xmin : Extent : -179 The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE ## xmax ## ymin ## ymax 3 : 179 : -52.6 : -29.2 spans the entire globe since one of these islands is on the opposite side of the 180◦ meridian. Finding an appropriate axis limit through trial and error brings one to plot(nz, xlim = c(167, 178), axes = TRUE) and the map in Figure 1b. Coordinate Reference Systems The spatial coordinates in Figures 1a and 1b are in longitudes and latitudes, or, to use mathematical nomenclature, spherical coordinates. The difficulty spherical coordinates pose for spatial statisti√ cians is that Euclidean distance x2 + y2 is not always a useful measure of the distance separating two points. The shortest route between two points on a sphere follows a Great Circle which divides the sphere into two equal halves. The distance between two points along this path, Great Circle Distance (see Wikipedia, 2015a), can be computed with a trigonometric formula as implemented in the spDists function in sp. Euclidean distance will be roughly proportional to Great Circle Distance for two points near the equator and close to one another. In Finland and New Zealand, however, Euclidean distance will overemphasise the east-west direction since one degree of longitude is a much shorter distance (in kilometres) than one degree of latitude. It is for this reason that Greenland appears larger than India on many maps even though the opposite is true. Most R packages which perform spatial analyses rely on Euclidean distance, including this author’s own geostatsp (Brown, 2015), even though Great Circle Distance would be straightforward to implement. The importance of transforming spherical coordinates to a coordinate system where Euclidean distance is a reasonable approximation to Great Circle Distance should not be underestimated. A large number of Coordinate Reference Systems (CRS’s) are in use, and Munroe (2011) is a useful introduction to some of the most popular ones. The Finland Uniform Coordinate System, having European Petroleum Standards Group code 2393, is a Transverse Mercator projection (see Wikipedia, 2015b) which x and y coordinates giving positions on a cylinder containing the earth. This projection is obtained from rgdal with CRS("+init=epsg:2393") ## ## ## ## ## CRS arguments: +init=epsg:2393 +ellps=WGS84 +proj=tmerc +lat_0=0 +lon_0=27 +k=1 +x_0=3500000 +y_0=0 +towgs84=-96.062,-82.428,-121.753,4.801,0.345,-1.376,1.496 +units=m +no_defs The entry +lon_0=27 indicates that the cylinder touches 27◦ meridian line at every point. Provided two points are reasonably close to the 27◦ meridian, Euclidean distance between their Finish Uniform Coordinates will be very close to the true distance between them. The map of Finland can be converted to this coordinate system using spTransform from rgdal with finlandMerc = spTransform(finland, CRS("+init=epsg:2393")) Figure 1c is the result of plotting this object with plot(finlandMerc,axes=TRUE). Notice the projected map has a wider base and narrower top than the long-lat map in Figure 1a. The coordinates in Figure 1c refer to an origin where the 27◦ meridian intersects the equator, with the +x_0= argument above indicating that 3,500km is added to the x coordinates. Map projections can be constructed from any cylinder containing the Earth, and there is no mathematical requirement to stick to a standard projection with an epsg: identifier. The point (170◦ E, 45◦ S) is an intuitive the location of the origin for a transformed map of New Zealand, being near the centre of the south island. The cylinder can follow any Great Circle, it need not be a meridian line, and a Great Circle angled 40◦ clockwise would run the length of the two islands. Cylindrical projections with an angle are termed Oblique Mercator Projections (see Snyder, 1987, page 66), and can be constructed with mapmisc’s omerc function. This New Zealand projection is obtained by nzCrs = omerc(c(170, -45), angle = 40) nzCrs The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 (b) New Zealand, Longitude-Latitude 6e+05 2e+05 −2e+05 50°S (a) Finland, Longitude-Latitude 165°E 170°E 175°E 180° 6600000 45°S 7000000 40°S 7400000 35°S 68°N 66°N 64°N 62°N 60°N 20°E 26°E 28°E 30°E 32°E 22°E 24°E 1e+06 7800000 4 30°S 70°N C ONTRIBUTED RESEARCH ARTICLE 3200000 3500000 (c) Finland Transverse Mercator −6e+05 −2e+05 2e+05 (d) New Zealand, Oblique Mercator Figure 1: Basic maps of Finland and New Zealand in different Coordinate Reference Systems The difference between the above and the projection for Finland is the omerc in place of tmerc, with the additional argument +alpha=40 specifying an angle. The New Zealand coastline can be projected to this CRS with spTransform. nzRot = spTransform(nz, nzCrs) The small outlying islands can be cropped from the New Zealand boundary using the gIntersection function in the rgeos package (Bivand and Rundel, 2014). The main islands are located between -600000 and 50000 in the x direction and -360000 and 1500000 in the y direction, values obtained by plotting the transformed polygon with plot(nzRot) and clicking on the map after running the locator function. Using raster’s extent function is the simplest way to create a rectangular polygon from these coordinates, and the cropping is accomplished with forClip = as(extent(-6e+05, 5e+05, -360000, 1500000), "SpatialPolygons") crs(forClip) = crs(nzRot) nzClip = rgeos::gIntersection(nzRot, forClip) Figure 1d results from executing plot(nzClip,axes=TRUE), and New Zealand has been rotated 40◦ to a vertical position. Mapping with mapmisc Figure 2 contains the maps of Finland and New Zealand from Figures 1c and 1d, where mapmisc has been used to add background images, a scale bar, and an inset map. Backgrounds images are obtained from the openmap function, which downloads map tiles from http://openstreetmap.org and other sources. The images used in Figure 2 were obtained with nzBg = openmap(nzClip, path = "mapquest-sat") finlandBg = openmap(finlandMerc, path = "landscape") The first argument of openmap is used to obtain the spatial extent of the map to retrieve and the CRS the map will be projected to. Any spatial object x for which extent(x) and projection(x) are defined can be provided. The path= argument specifies the source of the map, and sample maps from the various sources are shown in Figure 9 in the Appendix. The level of detail in the map is specified by providing either a zoom level (zoom=) or a maximum number of tiles (defaults to maxTiles=9) for which a corresponding zoom level is found. The objects produced by openmap are rasters, converted from image files downloaded from map tile servers. The Finland map is a RasterLayer finlandBg ## ## ## ## ## ## class dimensions resolution extent coord. ref. data source : : : : : : RasterLayer 822, 670, 550740 (nrow, ncol, ncell) 2130, 2130 (x, y) 2505872, 3932972, 6172770, 7923630 (xmin, xmax, ymin, ymax) +init=epsg:2393 +proj=tmerc +lat_0=0 +lon_0=27 +k=1 +x_0=3500000 +y_0=0 +ellps=intl +towgs84=-96.062,-82.428,-121.753,4.801 in memory The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE ## names ## values 5 : landscape : 1, 1340 (min, max) notice the CRS is the same as the finlandMerc object. The New Zealand map is a RasterStack with red, green and blue layers. class(nzBg) ## [1] "RasterStack" ## attr(,"package") ## [1] "raster" names(nzBg) ## [1] "mapquest.satRed" "mapquest.satGreen" "mapquest.satBlue" The Finland map can be viewed with plot(finlandBg), whereas the three-layered New Zealand map is shown with plotRGB(nzBg). A third map required for the plots in Figure 2 is the inset map on the right. The code below retrieves a map where New Zealand (at 170◦ E and 45◦ S) is in the south-west corner, the northern limit of Finland (roughly 70◦ ) fits within the map, and the western extremity of the map stops just short of London. eurasia = openmap(extent(0, 170, -45, 70), path = "osm-no-labels") Figure 2a is produced with the code below. map.new(finlandMerc, 0.7) plot(finlandBg, add = TRUE) plot(finlandMerc, add = TRUE) scaleBar(finlandMerc, "bottomright") insetMap(finlandMerc, "right", map = eurasia, width = 0.45) The functions run are the following: • map.new initialises a new plot area suitable for showing finlandMerc. The second argument of 0.7 specifies that the left 70% of the plot will contain the map and the right 30% will be left for legends or inset maps. • plot(finlandBg,add=TRUE) adds the background map to the existing plot. Reprojecting the map to the Finish Uniform Coordinate System has distorted the letters and much of the text is illegible. • plot(finlandMerc,add=TRUE) adds the Finland boundary. • scaleBar produces the 200km scale and north arrow in the bottom right of the map. Although scaleBar can deduce the coordinates of the mapped region, it must be told the CRS of the plot by passing the finlandMerc object. A CRS object or character string could also have been supplied. • insetMap produces the small map to the right, and as with scaleBar it must be told the CRS of the map. Shown in red in the inset map is the area mapped in the plot. The width argument specifies the width of the inset map as a fraction of the plotting region. The New Zealand map in Figure 2b is produced with similar code. map.new(nzClip, 0.7) plotRGB(nzBg, add = TRUE) plot(nzClip, add = TRUE, border = "red") scaleBar(nzClip, "bottomright", seg.len = 6) insetMap(nzClip, "right", map = eurasia, width = 0.45) The use of plotRGB in place of plot for the background map is the only noticeable difference. The map images were retrieved from OpenStreetMap.org and Mapquest, and although they are free to use they must be attributed. The openmapAttribution function produces appropriate attribution from an object produced by openmap or the name of a tile path. The captions in Figure 2 were produced with the assistance the following. openmapAttribution(nzBg, short = TRUE) ## mapquest.sat ## "Tiles courtesy of MapQuest(http://www.mapquest.com)" The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE (a) Finland. Background ©OpenStreetMap 6 (b) New Zealand. Tiles courtesy of MapQuest and ©OpenStreetMap Figure 2: Maps produced using the mapmisc package, containing background images, inset maps, and scale bars. The Acknowledgements section of this paper has used this function without the short=TRUE argument. strwrap(openmapAttribution(finlandBg), 70) ## ## ## ## [1] [2] [3] [4] "copyright OpenStreetMap.org contributors. Data by OpenStreetMap.org" "available under the Open Database License" "(opendatacommons.org/licenses/odbl), cartography is licensed as CC" "BY-SA (see www.openstreetmap.org/copyright)." The additional argument type='latex' is used in the source code for this paper. Colour scales The colourScale and legendBreaks functions in mapmisc are able to create and display sequences of ‘z-axis’ intervals or bins with assigned colours for mapping spatial data. The RColorBrewer package gives R users access to the popular ColorBrewer collection of pallets, all of which are displayed in Table 2 in the Appendix. A number of straightforward but tedious steps are involved in creating suitable bins, assigning colours to data points based on these bins, specifying transparency when overlaying colours on background maps, and displaying the intervals and colours in a legend. The process has been simplified as much as possible in mapmisc, although at least five separate lines of R code are required to produce a single map. Points and polygons Consider, as a motivating example, the task of visualising the spatial variation in fertility rates in Europe using data available from the European Union. Eurostat divides the continent into a large number of NUTS (an acronym in no need of a definition) and makes social and demographic data on the NUTS’s available at http://ec.europa.eu/eurostat/data/database. Boundaries can be obtained from http://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/administrative-units-statistical-units . Figure 3 shows the NUTS-level fertility rate in 2010 for the 309 NUTS. Code in the Appendix for downloading and merging the boundary files and fertility rates produces a SpatialPolygonsDataFrame object euroF. Each polygon in the object is a NUTS, and fertility rates for each NUTS are provided for the years 2001 to 2012 inclusive. euroF The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE ## ## ## ## ## ## ## ## class features extent coord. ref. variables names min values max values : : : : : : : : 7 SpatialPolygonsDataFrame 309 -24.5, 44.8, 34.6, 71.2 (xmin, xmax, ymin, ymax) +proj=longlat +ellps=GRS80 +no_defs 16 NUTS_ID, STAT_LEVL_, SHAPE_Leng, SHAPE_Area, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, ... AT11, 2, 0.134, 1.00e+00, 0.88, 0.86, 0.91, 0.93, 0.96, 0.98, 1.02, 1.08, 1.08, 1.04, 1.05, ... UKN0, 2, 9.942, 9.96e-01, 2.07, 2.06, 2.09, 2.05, 2.06, 2.12, 2.09, 2.27, 3.79, 3.77, 3.78, ... The polygons are transformed to the European Terrestrial Reference System (EPSG code 3034) below. euroF = spTransform(euroF, CRS("+init=epsg:3034")) The legend in Figure 3 has eight colours, uses the ‘Spectral’ palette from RColorBrewer, and has the order of colours reversed so that blues are low values and reds are large values. This colour scale was produced with the colourScale function by using the breaks=8, col='Spectral' and rev=TRUE arguments. ecol = colourScale(euroF[["2010"]], breaks = 8, col = "Spectral", rev = TRUE, style = "jenks", dec = 1, opacity = 0.5) The first argument specifies that the 2010 column of the data frame in euroF contains the data of interest. The style argument controls how the breaks are computed, and the 'jenks' option used here results in a call to the classIntervals function in the classInt package. The break points are rounded to one decimal place because of the dec=1 argument, and opacity=0.5 option gives the plotted colours 50% transparency. The result of a call to colourScale is a list containing the break points for bins (breaks), colours associated with the bins (col), and a vector of length 309 (plot) with one colour per NUTS. names(ecol) ## [1] "col" "breaks" "colOpacity" "plot" ecol$breaks ## [1] 1.0 1.3 1.5 1.7 1.9 2.0 2.4 2.9 3.8 ecol$col ## [1] "#3288BD" "#66C2A5" "#ABDDA4" "#E6F598" "#FEE08B" "#FDAE61" ## [7] "#F46D43" "#D53E4F" length(ecol$plot) ## [1] 309 The breaks and col elements are used for displaying a legend, whereas the plot element can be passed as a col= argument when running plot on a SpatialPoints or SpatialPolygons object. A background map must be fetched before producing the plot, and the 'osm-no-labels' tiles which contain no writing or labels is obtained below. euroMap = openmap(euroF, maxTiles = 15, path = "osm-no-labels") Country names will be added to the map separately using the text function, using data downloaded with raster’s getData function. world = raster::getData("countries") Some of the special characters in the country names will cause problems when an R session has not had Unicode support enabled, and the interests of portability the names are encoded to a simpler character set. world$NAME = iconv(as.character(world$NAME), "utf8", "latin1") The world object is a SpatialPolygonsDataFrame, but the text function used to add text to a plot requires a single point rather than a polygon for positioning each country’s name. Converting world to a SpatialPointsDataFrame with the code below will convert each country’s boundary polygon to a single point at it’s centroid. The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE 8 fertility 3.8 2.9 2.4 2 1.9 1.7 1.5 1.3 1 Iceland Finland Norway Sweden Estonia Latvia Denmark Ireland Lithuania Belarus United Kingdom Netherlands Belgium Germany Poland Czech Republic Slovakia France Switzerland Kazakhstan Austria Hungary Ukraine Moldova GeorgiaAzerbaijan Romania Croatia Bosnia Serbia Italy Herzegovina Bulgaria Turkey Portugal Spain Greece Syria Iraq 1000km Jordan Figure 3: Fertility in Europe by NUTS. Background ©OpenStreetMap, ©EuroGeographics for the administrative boundaries worldP = SpatialPointsDataFrame(world, proj4string = CRS(projection(world)), data = data.frame(world)[, c("NAME", "SQKM")]) Only the country’s name and surface are required, and other data columns have been dropped. To prevent the map being crowded, all countries with a surface are less than 30,000km2 are removed and spaces in country names are converted to line breaks. Finally, the object is reprojected to the same CRS as the fertility dataset. worldP = worldP[worldP$SQKM > 30000, ] worldP$NAME = gsub("[[:space:]]", "\n", worldP$NAME) worldE = spTransform(worldP, crs(euroF)) Figure 3 is produced with the code below. Notice the col=ecol$plot argument specifying the colours with which each region is filled, and that the borders of each region were given 85% transparency by border=col2html('black',0.15). The text function has added country names to the plot. map.new(euroF) plot(euroMap, add = TRUE) plot(euroF, col = ecol$plot, add = TRUE, border = col2html("black", 0.15)) legendBreaks("topright", ecol, title = "fertility", bg = "white") text(worldE, labels = worldE$NAME, cex = 0.8) scaleBar(euroF, "bottom", bty = "n") The legend on the right is added with legendBreaks, which passes most of its arguments to the standard legend function. The role of legendBreaks is to ensure the 9 numeric break points are printed near the boundaries between the coloured squares (as opposed to aligned with their centres). Using the style='jenks' argument for colourScale triggers a clustering algorithm in classIntervals which can be time consuming for large datasets. The 'quantile' and 'equal' arguments use quantiles and equally spaced break points respectively, with the latter able to have breaks equally spaced on a transformed scale (i.e. log or square root). Some examples appear below. colourScale(euroF[["2010"]], breaks = 8, style = "quantile", dec = 1)$breaks ## [1] 1.0 1.4 1.5 1.7 1.9 2.0 3.8 colourScale(euroF[["2010"]], breaks = 8, style = "equal")$breaks The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE 9 ## [1] 1.04 1.43 1.82 2.21 2.60 2.99 3.38 3.77 colourScale(euroF[["2010"]], breaks = 8, style = "equal", transform = "log")$breaks ## [1] 1.04 1.25 1.50 1.81 2.17 2.61 3.14 3.77 colourScale(euroF[["2010"]], breaks = 8, style = "equal", dec = 0)$breaks ## [1] 1 2 3 4 The final set of break points has a dec=0 argument, and although 8 breaks have been requested only 4 unique breaks remain after rounding to the nearest integer. Rasters Figure 4 shows two elevation maps of New Zealand produced with the assistance of the colourScale and legendBreaks functions. The elevation data is obtained using the raster package’s getData function. nzAlt = raster::getData("alt", country = "NZL") The nzAlt object is a list of two elements, with elevation in two disjoint areas which comprise New Zealand. The first element includes the main island, its CRS is incomplete (at the time of writing) and should be re-specified. nzAlt = nzAlt[[1]] crs(nzAlt) = CRS("+init=epsg:4326") This raster includes a number of outlying islands, and its size becomes more manageable after the main island is extracted using the previously computed nzClip object. dim(nzAlt) ## [1] 2244 1572 1 nzAlt = raster::crop(nzAlt, extent(spTransform(nzClip, crs(nzAlt)))) dim(nzAlt) ## [1] 1548 1458 1 Note that nzClip was transformed to the same CRS as the raster before its extent was used for cropping. This smaller raster is now reprojected to the rotated Oblique Mercator projection used earlier. nzAltRot = projectRaster(nzAlt, crs = crs(nzClip)) dim(nzAltRot) ## [1] 2424 3078 1 Reprojecting the raster has made it larger, as it needs a rectangular bounding box large enough to contain the rotated rectangle which was the original raster. Many of the entries in nzAltRot are NA, however, as in this projection the main islands are contained within a small bounding box in the centre. The trim function removes the outer rows and columns of the raster which contain only missing values. nzAltRot = raster::trim(nzAltRot) nzAltRot ## ## ## ## ## ## ## ## class dimensions resolution extent coord. ref. data source names values : : : : : : : : RasterLayer 1933, 1255, 2425915 (nrow, ncol, ncell) 552, 727 (x, y) -561568, 131192, -321367, 1083924 (xmin, xmax, ymin, ymax) +proj=omerc +lat_0=-45 +lonc=170 +gamma=0 +k=1 +alpha=40 +x_0=0 +y_0=0 +ellps=WGS in memory NZL1_msk_alt -15.8, 3028 (min, max) The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE 10 Since Raster objects are typically large, with nzAltRot having over 2 million entries, it is convenient that objects in the raster package provide easy access to the maximum and minimum values (−15.85 and 3028.109 above). The use of the style='equal' argument is recommended with rasters, as is done below. nzAltCol = colourScale(nzAltRot, breaks = 7, col = terrain.colors, style = "equal", dec = -2) Here the col= argument was given the terrain.colors function, and any function accepting a single integer argument and returning a vector with the specified number of colours would suffice. Below a second colour scale is computed using the 'OrRd' colour palette and a supplied vector of breaks. nzAltTrans = colourScale(nzAltRot, breaks = c(-20, 100, 500, 1200, 2000, 3100), col = "OrRd", style = "fixed", dec = -1, opacity = c(0.2, 1)) Here the opacity argument has been given a vector of length 2, giving the opacity of the first colour and last colour respectively with intervening colours having opacities which are linear interpolations of these values. Figure 4a is produced with code below. No background map is present, though the background has been set to a sea-like colour. The important line of code here is the one beginning with plot(nzAltRot,... which adds the elevation data to the map. The col and breaks elements of the colour scale nzAltCol are passed as identically named arguments of the plot method from the raster package. The legend=FALSE argument prevents plot from adding its own legend, which would not fit well with this map as it must appear on the right. par(bg = "lightblue1") map.new(nzClip) par(bg = "white") plot(nzAltRot, col = nzAltCol$col, breaks = nzAltCol$breaks, legend = FALSE, add = TRUE) plot(nzClip, add = TRUE, border = col2html("red", 0.4)) legendBreaks("bottomleft", nzAltCol, title = "metres", bg = "white") scaleBar(nzClip, "left") The code for Figure 4b is below. Here the colOpacity element of the colour scale, which has 2-digit opacity levels appended to each specified colour, is passed as col= argument to allow the satellite photo beneath to be viewed. map.new(nzClip) plotRGB(nzBg, add = TRUE) plot(nzAltRot, col = nzAltTrans$colOpacity, breaks = nzAltTrans$breaks, legend = FALSE, add = TRUE) legendBreaks("bottomleft", nzAltTrans, title = "metres") scaleBar(nzClip, "left") Categorical data Maps of categorical data (or factor’s in R parlance) assign colours to categories rather than intervals, and the breaks= argument of colourScale governs which of the categories will have colours assigned to them. Figure 5 maps different land type categories in Africa, and although over 20 land categories are present in the dataset only 10 are displayed. The land cover data originate from the European Space Agency, and code in the Appendix downloads a convenient repackaging of these data from http://www.worldgrids.org/doku.php?id=wiki: glcesa3. Two files are retrieved, the first being a tif raster file where each cell contains an integer specifying an identifier for a land type category, and the second a text file giving a description of the land type category to which each code is assigned. The tif file containing the spatial data is imported into Ras a RasterLayer object with code similar to land=raster('myfile.tif'). This raster object spans the entire globe. To extract the central section of Africa from the raster, the countries of Liberia and Tanzania are subset from the previously downloaded world object and reprojected to the same CRS as the raster. worldSub = world[grep("Liberia|Tanzania", world$NAME), ] worldSub = spTransform(worldSub, crs(land)) A raster containing the land types for Liberia, Tanzania, and all places in between is cropped next. The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE 11 (a) terrain.colors() (b) "OrRd" colours with opacity. Tiles courtesy of MapQuest Figure 4: Elevation map of New Zealand landSub = raster::crop(land, extend(extent(worldSub), 5)) The extend function has added an additional 5 units (in this case degrees latitude and longitude) to the region to be extracted. The file describing land categories is imported into R as a data.frame named landTable. landTable[c(1:3, 20:23), ] ## ## ## ## ## ## ## ## 1 2 3 20 21 22 23 ID label 11 No data (burnt areas,\n clouds,) 14 Rainfed croplands 20 Mosaic cropland 200 Bare areas 210 Water bodies 220 Permanent snow and ice 230 No data (burnt areas,\n clouds,) This table is provided to colourScale alongside the landSub raster as a levels= argument. Here levels must be a data frame with columns named ID and label giving identifiers and descriptions respectively. Below 10 breaks have been requested, with style='unique' indicating that the data are categorical rather than continuous. The 10 most common land types will be assigned colours, although the exclude= argument specifies that several categories ( i.e. 11 no data, 210 water bodies, 200 bare areas) will not be colour-coded regardless of how prevalent they are. The 'Set3' colour scale from RColorBrewer is used. landLevels = colourScale(landSub, breaks = 10, style = "unique", col = "Set3", exclude = c(11, 210, 220, 230), labels = landTable) names(landLevels) ## [1] "col" "breaks" ## [5] "colortable" "levels" The R Journal Vol. XX/YY, AAAA 20ZZ "colOpacity" "colourtable" "legend" ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE broadleaved forest regularly flooded (semi−permanently or temporarily) Closed broadleaved deciduous forest herbaceous vegetation (grassland, savannas or lichens/mosses) broadleaved evergreen or semi−deciduous forest (broadleaved or needleleaved, evergreen or deciduous) shrubland Mosaic vegetation (grassland/shrubland/forest) Mosaic forest or shrubland Mosaic cropland Open broadleaved deciduous forest/woodland Rainfed croplands 12 Figure 5: Land categories in central Africa. Background ©Stamen Design As a result of the levels= being provided, the colourScale function has provided additional outputs of colourtable and levels to be added to the landSub raster. Being Canadian, this author is habituated to using two spellings of ’colour’ interchangeably and has provided the colourtable object twice. This object is a vector of colours associated with each numeric category, and the raster package uses American spelling to specify it’s place in a raster. landSub@legend@colortable = landLevels$colourtable The levels element of landLevels is a data frame with one row per category and including labels and colours for each. It must be enclosed in a list when being added to a raster as follows. levels(landSub) = list(landLevels$levels) Two background images are included in Figure 5, one for the main plot and another for the inset map. landMap = openmap(landSub, path = "stamen-watercolor") worldMapAf = openmap(landSub, path = "osm-no-labels", zoom = 2) The landSub raster, having had landLevels$colortable and landLevels$levels attached to it, is the only object required of the plot and legendBreaks functions. map.new(worldSub) plotRGB(landMap, add = TRUE) plot(landSub, add = TRUE) insetMap(landSub, "topright", map = worldMapAf) legendBreaks("bottomleft", landSub, ncol = 2, inset = 0, bg = "white") The ncol=2 argument to legendBreaks refers to two columns (rather than two colours) and is passed directly to legend. An alternative to including the legend on the plot is to create an “in-line” legend, possibly in the figure caption. The legendTable function assists in this task, with the code cat(legendTable(landSub, collapse = "; ")) after compilation producing: Rainfed croplands ( ); Mosaic cropland ( ); Mosaic vegetation (grassland/shrubland/forest) ( ); broadleaved evergreen or semi-deciduous forest ( ); ); Open broadleaved deciduous forest/woodland ( ); Closed broadleaved deciduous forest ( Mosaic forest or shrubland ( ); (broadleaved or needleleaved, evergreen or deciduous) shrubland ( ); herbaceous vegetation (grassland, savannas or lichens/mosses) ( ); broadleaved forest ) . Table 1 is created with the following. regularly flooded (semi-permanently or temporarily) ( Hmisc::latex(legendTable(landSub, type='latex'), file='',rowname=NULL, where='hb', caption.loc='bottom', colheads=FALSE, caption='Land categories in Figure \\ref{fig:plotLand}') The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE 13 Optimised map projections This section describes the use of the omerc function for defining Oblique Mercator projections. Two methods of optimising map projections are implemented, with the angle of rotation chosen to produce the most compact bounding box possible or to preserve distances amongst a collection of points. Smallest bounding box Figure 6 shows New Zealand rotated in a manner optimised to produce the smallest possible bounding box (blue rectangle) and consequently the bounding box with the highest proportion of its contained area made up of the land mass of New Zealand. Using this projection minimises the number of square cells a grid covering New Zealand would require in order to achieve a given spatial resolution, which would provide a computational advantage when using a statistical model which operates on a grid. The projection also minimises the number of vertical inches which Figure 6 takes up on the page, which is the reason a clockwise rotation is used in place of the anti-clockwise rotation from Figure 2b. This projection is obtained by passing a vector of rotation angles to the omerc function. The first argument below is the nzClip object, and the origin of the Oblique Mercator projection will be its centroid. A sequence of Oblique Mercator projections using Great Circles angled between 10 and 50 degrees from the north will be formulated, and the bounding box of New Zealand in each projection will be computed. The projection returned by omerc uses the angle giving the smallest such box, with the argument +alpha= showing a 22◦ angle in this example. nzRotOptCrs = omerc(nzClip, seq(10, 50, by = 0.5), post = "wide") nzRotOptCrs ## CRS arguments: ## +proj=omerc +lat_0=-41.1293043007557 +lonc=170.967146720823 ## +alpha=22 +k=1 +x_0=0 +y_0=0 +gamma=90 +ellps=WGS84 +units=m A positive angle rotates the coordinate axes clockwise, which gives the resulting map the appearance of being rotated anti-clockwise. An Oblique Mercator’s Great Circle becomes the y-axis of the coordinate system, and a projection should seek to have as much of the area of interest as possible being close to this Great Circle. An optimal projection will therefore have a tall and narrow bounding box, in this instance the spine of New Zealand should run up and down the y-axis. The limits of 10 and 50 for the sequence of angles above are rough estimates of the amount of anti-clockwise rotation required to sit New Zealand upright. The argument post='wide' in omerc specifies that a short and wide bounding box is required, and this is accomplished by rotating the planar x-y coordinates post-projection. The +gamma=90 element of the CRS is where this post-projection rotation is specified. New Zealand is reprojected and a new background map in this projection is produced below. nzRotOpt = spTransform(nzClip, nzRotOptCrs) nzBgRot = openmap(nzRotOpt, path = "cartodb", extend = 20000) Figure 6 is produced with the following. Rainfed croplands Mosaic cropland Mosaic vegetation (grassland/shrubland/forest) broadleaved evergreen or semi-deciduous forest Closed broadleaved deciduous forest Open broadleaved deciduous forest/woodland Mosaic forest or shrubland (broadleaved or needleleaved, evergreen or deciduous) shrubland herbaceous vegetation (grassland, savannas or lichens/mosses) broadleaved forest regularly flooded (semi-permanently or temporarily) Table 1: Land categories in Figure 5 The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE 14 map.new(nzRotOpt, 0.8) plot(nzBgRot, add = TRUE) plot(nzRotOpt, add = TRUE, border = "red") plot(extent(nzRotOpt), add = TRUE, col = "blue") scaleBar(nzRotOpt, "bottomright", bty = "n") insetMap(nzRotOpt, "topright", map = eurasia, width = 0.275) Figure 6: New Zealand in a 22◦ Oblique Mercator projection ( — ) with bounding box ( — ). Background ©CartoDB , ©OpenStreetMap Preservation of distances The motivation given in the Introduction for the use of map projections was to enable the use of Euclidean distance on geographic data. The appearance of Canada on maps is particularly sensitive to the CRS used, being a large country at high latitude. Figure 7 shows Canada in four different CRS’s, the first of which is an Oblique Mercator optimised to preserve distances between 12 provincial and territorial capital cities. Unlike the New Zealand projection, however, the x-y coordinates on the Mercator’s cylinder are inverse-rotated to preserve the north-south direction as up-down. Oblique Mercator projections are defined by an origin and an angle of rotation, only the latter is optimised by the omerc function. Here we will set the origin somewhat arbitrarily as the town of Flin Flon, Manitoba, as an alternative to the centroid-based origin used earlier. Residents of Toronto, this author included, affectionately refer to their city as ‘The Centre of the Universe’ 1 , but Flin Flon is a more suitable Oblique Mercator centroid for two reasons. First, Flin Flon’s latitude is a useful compromise between being as northerly as possible (in order to accommodate the arctic islands) and staying reasonably close to the populated cities in the south. Second, using Flin Flon as the location where the inverse rotation will preserve the north-south direction as vertical should produce a map with a familiar shape. Flin Flon is just westerly enough to ensure a sizeable portion of southern border following the 49th parallel will be horizontal, and moving the origin any further westward would increase the distortion of the north direction in the east. The locations of the cities required to compute this projection (Flin Flon and the provincial capitals) can be retrieved from www.geonames.org using the geonames package (Rowlingson, 2014). A wrapper for the GNsearch function in mapmisc converts the data retrieved to a SpatialPointsDataFrame. Readers wishing to run the code below will need to register for a free username at www.geonames.org and record the username in R with options(geonamesUsername='yourusername'). ## ## ## ## library("geonames") provincialCapitals = mapmisc::GNsearch(q = "canada&featureCode=PPLA") provincialCapitals$name = provincialCapitals$toponymName flinflon = mapmisc::GNsearch(q = "Manitoba&name_equals=Flin flon&featureCode=PPL") The names of the provincial capitals and the location of Flin Flon are below. provincialCapitals$name ## [1] "Toronto" ## [5] "Regina" ## [9] "Yellowknife" ## [13] "Iqaluit" 1 Many "Edmonton" "Victoria" "Whitehorse" "Winnipeg" "Fredericton" "Halifax" "Québec" "St. John's" "Charlottetown" Canadians residing outside Toronto prefer the somewhat less affectionate nickname of ‘Hogtown’. The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE 15 coordinates(flinflon) ## lng lat ## [1,] -102 54.8 The arguments given in the call to omerc below consist of: x giving flinfon as the origin of the projection; angle giving a sequence spanning the entire range of possible rotation angles; post = 'north' specifying that the planar coordinates should be inverse-rotated to preserve the north direction; and preserve supplying the locations of the provincial capitals other than Iqaluit (the northerly capital of Nunavut) for calculating the distances which the projections will seek to preserve. cproj = omerc(x = flinflon, angle = seq(-90, 90, by = 0.25), post = "north", preserve = provincialCapitals[provincialCapitals$name != "Iqaluit", ]) cproj ## CRS arguments: ## +proj=omerc +lat_0=54.768 +lonc=-101.865 +alpha=-83.75 ## +k=0.998 +x_0=0 +y_0=0 +gamma=-83.753 +ellps=WGS84 +units=m The projection above uses an origin (specified by lat_0 and longc) at the coordinates of Flin Flon and a cylinder tangent to the Earth along a Great Circle angled 83.75◦ anticlockwise (given by gamma). The k=0.998 component of the CRS specifies that the coordinates are multiplied by 0.998 thereby ensuring the Euclidean distances between the provincial capitals are, on average, correct. Although pairs of points along the Great Circle will have Euclidean distances and Great Circle distances being equal, the provincial capitals are some distance from this Great Circle and without scaling would have Euclidean distances which were on average 0.2% too large. The value of 0.2% is computed in omerc by calculating Euclidean distances in an unscaled Oblique Mercator as well as Great Circle distances obtained after reprojecting the cities to longitude-latitude. The alpha=−83.75 component results from having used the post='north' option, rotating the coordinates on the cylinder so that a vertical line passing through the origin contains points which are directly north or south of each other. Two cities on Figure 7 for whom coordinates have not yet been retreived are Kitimat, British Columbia, and the hamlet of Grise Fiord, Canada’s most northerly civilian settlement2 . These locations are obtained below. cities = rbind(provincialCapitals, mapmisc::GNsearch(q='Grise&featureCode=PPLL'), mapmisc::GNsearch(q='Kitimat&name_equals=Kitimat&featureCode=PPL'), flinflon) Also shown is the Great Circle which forms the y-axis of the Oblique Mercator. This circle is created using one of the many useful functions in the geosphere (Hijmans, 2014) package. gcircle = SpatialPoints( geosphere::greatCircleBearing(flinflon@coords, -83.75), proj4string=CRS("+init=epsg:4326")) ## Error in coordinates(coords): error in evaluating the argument 'obj' in selecting a method for ## Calls: :: ... tryCatch -> tryCatchList -> tryCatchOne -> <Anonymous> The Canadian border shown on the figure is obtained with getData. canada = raster::getData("GADM", country = "CAN", level = 0) crsList = list( 'Oblique Mercator'=cproj, 'Lambert'=CRS("+init=epsg:3347"), 'Equidistant 2pt'=tpeqd(cities[cities$name %in% c("Edmonton","Toronto"),]), 'Web projection'=CRS("+init=epsg:3857")) canT = mapply(spTransform, CRSobj = crsList, MoreArgs = list(x = canada)) ## Error in mapply(spTransform, CRSobj = crsList, MoreArgs = list(x = gcircle)): object 'gcircle' The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE 16 3 E F K F E K H F Q C V W S E F K H W R F Q C E K F S W R V F E K W R V F E F C Q S W R V F E K T H F Q S W R V F E K Halifax Char'town St. John's Fredericton Regina K T T V H R W F Q C E S F Kitimat W R V H T F C Q H F Q C V W R S E K Flin Flon H F Q C V R S T Victoria Toronto Kitimat Flin Flon T Québec H V F Q C W R S K F S Winnipeg −1 T K Edmonton pct over 1 2 0 R T T H C Q S T H F C T F E −2 T H Q F C K S E V W (c) Lambert (e) Equidistant 2pt , north Y W K F E R V G W Iqaluit T YI F S E K G C W R Q F H V T Whitehorse pct over (inverted) −1 −2 −3 −4 I W F K E G S R W C Q V F H S Q C F H T F E K Y I R W V S Q C F H T Grise Fiord Yellowknife Q C F T H St. John's T E R F W W Y G KI E F V R W 0 S Y C H F Q T E R W F C HI K F Q V Y Grise Fiord K VI S G W C H F Q E T R W F S C H G F W Q Y K T V E F R W S W 1 5 4 H W Y C K F T E F V Q R W K V I Whitehorse Grise Fiord Iqaluit Whitehorse Yellowknife St. John's (d) Oblique Mercator , north S C H F Q T F R W E Iqaluit C K H F Q E W T R V Yellowknife C K H F Q V T E R W G St. John's Y S C K F H E T V Q R W G I −1 W S C K F H T Q E V R W −1 W Y W Y S C K H F Q E W R V T pct over 2 3 I W Y S 0 I G S K F T E F V R W Q (b) Equidistant 2pt I 1 5 4 I G 1 pct over 2 3 G F K T E Q R V W H F C Q S W R V −5 (a) Oblique Mercator G C K F T E F V Q R W Halifax Q C E H F K S H F K T E C V R W Q E T F W H R Q F C S S Char'town C Q E H F T E K W H V Q F C S F St. John's K F E Q F T F K E H V R W F S C Fredericton F Q E F K T E K R F V H Q F C S Regina R W V T R H W F Q C S F K Victoria H R W F Q W R E F C K S H V Québec R W C pct over 0.5 1.0 S T V R W Winnipeg T V 0.0 T V V H Edmonton H E Q F C K V K Toronto R W S T −1.0 V Kitimat T S E Q S K F H C E H F C K Québec Edmonton Toronto −0.2 Q F H C K S E F Q F H K C S Flin Flon R W S Halifax V S S E F Q F H K C C K F H T F E Q V R W Char'town T W St. John's V S R R W Fredericton T T V Regina E Q F F S H K C T V Victoria W R R W 0 H 4 2.0 C T 1.5 V Winnipeg pct over 0.0 0.1 0.2 0.3 0.4 Figure 7: Canada in four map projections with a Great Circle ( · · · ). Background ©OpenStreetMap (f) Lambert , north Figure 8: Percentage by which Euclidean distance overestimates Great Circle distance between cities for different map projections The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE 17 ## Error in points(gcircleT[[D]], cex = 0.1, col = "blue"): object 'gcircleT' not found Three projections in addition to the Oblique Mercator are shown in Figure 7: a Lambert Conformal Conic projection used by the Atlas of Canada; a two-point Equidistant projection based on the cities of Edmonton and Toronto; and the so-called ‘web projection’ used by Openstreetmap.org and other internet sites. A list is created containing these four projections using mapmisc’s tpeqd function in part. crsList = list( 'Oblique Mercator'=cproj, 'Lambert'=CRS("+init=epsg:3347"), 'Equidistant 2pt'=tpeqd(cities[cities$name %in% c("Edmonton","Toronto"),]), 'Web projection'=CRS("+init=epsg:3857")) The cities, Great Circle, and Canadian border are transformed to each of the projections. A list of the four projected Canadian borders is created with mapply. canT = mapply(spTransform, CRSobj = crsList, MoreArgs = list(x = canada)) Code for retrieving background maps and for producing Figure 7 appears in the Appendix. Notable features in Figure 7 the Web projection’s overemphasis of the high arctic regions, the downward slope of the Lambert projection, and the upward slope of the Two Point Equidistant map. The Oblique Mercator’s Great Circle, passing through Flin Flon, Halifax and Kitimat at a nearly horizontal −83.75◦ , is necessarily a straight line in the Oblique Mercator projection and appears straight in two other projections. There are three north arrows at the bottom of each map, and only the Web projection has the propertly that North is in the vertical direction throughout the map. An Equidistant projection where the north direction is preserved could have been obtained by selecting two nodes with the same latitide (i.e. Fredericton and Victoria). The choice of mismatched nodes (Edmonton and Toronto) is intended to demonstrate that unlike the Oblique Mercator, the Equidistant projection is not self-righting. Figure 8 compares Euclidean distance to Great Circle distance for three of the map projections, with the percentage by which the former overestimates the latter being shown for each pair of cities. Distances for the 12 most southerly locations are shown in Figures 8a-c, the y-axes are different and the horizontal lines are at -0.2, 0, and 0.2 for each plot. Figures 8d-f shows the distances from the five most northerly locations to all other points, with the scale for the Lambert projection being inverted. The aim of this Figure is not to establish the superiority of one projection over the others, they each have their merits and drawbacks, but rather to demonstrate that there are efficiencies to be gained through the use of customized projections. That said, the Oblique Mercator is the most consistently accurate, with the the exception of distances involving Grise Fiord. The optimisation which produced this projection deliberately excluded the arctic islands, so this lack of accuracy at the highest latitudes is not surprising. Distances involving Edmonton and Toronto are highly accurate with the Equidistant projection, which should be expected given those two cities are the projection’s nodes, but occasionally off by as much as 2%. The Lambert projection does not appear to be a CRS which statisticians should consider using for Canada when Euclidean distance is the primary concern. Conclusions This paper and the mapmisc package aim to contribute to and advance the suite of tools available to the growing community of R users performing advanced statistical analyses of spatial data. The first contribution made is the provision of tools which simplify the creation and use of colour scales, background maps and legends. The second contribution is the removal of all barriers to using a map projection which is appropriate for the problem at hand, even when a non-standard projection optimized for a particular study region is desired. The way in which mapmisc seeks to accomplish these goals is by automating many of the tasks involved, with obtaining and reprojecting map tiles or creating colour scales with transparency being two examples. A small amount of new functionality has been provided by mapmisc to R users, these being the creation of optimised map projections and the creation of inset maps and scale bars for data in a rotated projection. The scope of mapmisc has been kept deliberately narrow. Maps are static rather than interactive, base graphics are used, all spatial data types used are from sp or raster, and functions have been kept simple with few arguments. In contrast to spplot and ggplot2(Wickham, 2009), maps made mapmisc are produced with a sequence of independent function calls with each function performing 2 Readers should be aware that Grise Fiord resulted from a government-run resettlement program in the 1950’s, and involved the deception and neglect of the indegenous people who were relocated to the hamlet. The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE 18 a very specific task. This approach was originally intended to benefit students and the non-specialist, and the package grew out of code originally provided to students in an undergraduate course. Additional motivations for the framework mapmisc employs are reproducibility and consistency when producing multiple maps. Colour scales and background maps are defined before a map is produced, and plot areas are laid out before any graphics are added. While sometimes clumsy for interactive use, mapmisc code fits tidily within Sweave or knitr documents and script files. Reproducability is an area where R has a clear advantage over many Geographical Information Systems, and spatial problems where an environment other than R is required are becoming fewer in number over time. Acknowledgements Attributions for background maps: Figures 2a, 7, 3 and all inset maps: ©OpenStreetMap contributors. Data by OpenStreetMap available under the Open Database License, cartography is licensed as CC BY-SA. Figure 2b, 4b: Tiles courtesy of MapQuest, portions courtesy NASA/JPL-Caltech and U.S. Depart. of Agriculture, Farm Service Agency. Figure 5: Map tiles by Stamen Design under CC BY 3.0. Data by OpenStreetMap available under the CC BY-SA Figure 6: Map tiles by CartoDB under CC BY 3.0. Data by OpenStreetMap available under the Open Database License Figure 9: the Appendix is ©OpenStreetMap contributors. Data by OpenStreetMap available under the Open Database License, cartography is licensed as CC BY-SA. with the exceptions below. mapquest : Tiles courtesy of MapQuest. Data by OpenStreetMap available under the Open Database License mapquest-sat : Tiles courtesy of MapQuest, portions courtesy NASA/JPL-Caltech and U.S. Depart. of Agriculture, Farm Service Agency. mapquest-labels : Tiles courtesy of MapQuest. Data by OpenStreetMap available under the Open Database License maptoolkit : ©Toursprung GmbH Data by OpenStreetMap available under the Open Database License humanitarian : ©OpenStreetMap contributors. Data by OpenStreetMap available under the Open Database License, cartography by Humanitarian OSM team is licensed as CC BYSA. cartodb : Map tiles by CartoDB under CC BY 3.0. Data by OpenStreetMap available under the Open Database License cartodb-dark : Map tiles by CartoDB under CC BY 3.0. Data by OpenStreetMap available under the Open Database License stamen-toner : Map tiles by Stamen Design under CC BY 3.0. Data by OpenStreetMap available under the Open Database License stamen-watercolor : Map tiles by Stamen Design under CC BY 3.0. Data by OpenStreetMap available under the CC BY-SA Bibliography R. Bivand. classInt: Choose Univariate Class Intervals, 2015. package=classInt. R package version 0.1-22. [p1] URL http://CRAN.R-project.org/ R. Bivand and C. Rundel. rgeos: Interface to Geometry Engine - Open Source (GEOS), 2014. URL http: //CRAN.R-project.org/package=rgeos. R package version 0.3-8. [p4] R. Bivand, T. Keitt, and B. Rowlingson. rgdal: Bindings for the Geospatial Data Abstraction Library, 2015. URL http://CRAN.R-project.org/package=rgdal. R package version 0.9-2. [p1] R. S. Bivand, E. Pebesma, and V. Gomez-Rubio. Applied spatial data analysis with R, Second edition. Springer, NY, 2013. URL http://www.asdar-book.org/. [p2] The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE 19 P. E. Brown. Model-based geostatistics the easy way. Journal of Statistical Software, 63(12):1–24, 2015. URL http://www.jstatsoft.org/v63/i12/. [p3] R. J. Hijmans. geosphere: Spherical Trigonometry, 2014. URL http://CRAN.R-project.org/package= geosphere. R package version 1.3-11. [p15] R. J. Hijmans. raster: Geographic Data Analysis and Modeling, 2015. URL http://CRAN.R-project.org/ package=raster. R package version 2.3-33. [p1] F. Leisch. Sweave: Dynamic generation of statistical reports using literate data analysis. In W. H"ardle and B. R"onz, editors, Compstat 2002 — Proceedings in Computational Statistics, pages 575–580. Physica Verlag, Heidelberg, 2002. URL http://www.stat.uni-muenchen.de/~leisch/Sweave. ISBN 3-7908-1517-9. [p1] R. Munroe. Map projections. xkcd web comic, 2011. URL http://xkcd.com/977. [Online; accessed 11-March-2015]. [p3] E. Neuwirth. RColorBrewer: ColorBrewer Palettes, 2014. URL http://CRAN.R-project.org/package= RColorBrewer. R package version 1.1-2. [p1] E. J. Pebesma and R. S. Bivand. Classes and methods for spatial data in R. R News, 5(2):9–13, November 2005. URL http://CRAN.R-project.org/doc/Rnews/. [p1] B. Rowlingson. geonames: Interface to www.geonames.org web service, 2014. URL http://CRAN.Rproject.org/package=geonames. R package version 0.998. [p14] J. P. Snyder. Map projections – a working manual. Professional Paper 1395, US Geologic Survey, Washington, DC, 1987. URL http://pubs.usgs.gov/pp/1395/report.pdf. [p3] H. Wickham. ggplot2: elegant graphics for data analysis. Springer New York, 2009. ISBN 978-0-38798140-6. URL http://had.co.nz/ggplot2/book. [p17] Wikipedia. Great-circle distance — Wikipedia, the free encyclopedia, 2015a. URL http:// en.wikipedia.org/w/index.php?title=Great-circle_distance&oldid=654813081. [Online; accessed 20-April-2015]. [p3] Wikipedia. Transverse mercator projection — Wikipedia, the free encyclopedia, 2015b. URL http://en.wikipedia.org/w/index.php?title=Transverse_Mercator_projection&oldid= 644679889. [Online; accessed 20-April-2015]. [p3] Y. Xie. knitr: A comprehensive tool for reproducible research in R. In V. Stodden, F. Leisch, and R. D. Peng, editors, Implementing Reproducible Computational Research. Chapman and Hall/CRC, 2014. URL http://www.crcpress.com/product/isbn/9781466561595. ISBN 978-1466561595. [p1] Patrick Brown Cancer Care Ontario 620 University Ave Toronto, ON M5G 2L7 Canada [email protected] Additional code European fertility URL’s for web sites where data are obtained: nutsUrl = "http://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/NUTS_2010_60M_SH.zip" Download and read in boundary file: nutsFile = "euroNuts.zip" if (!file.exists(nutsFile)) { download.file(nutsUrl, destfile = nustFile) } unzip(nutsFile) euroNuts = shapefile(grep("RG_60M_2010.shp$", unzip("euroNuts.zip", list = TRUE)$Name, value = TRUE)) The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE 20 (a) osm, ©OpenStreetMap (b) osm-no-labels, ©OpenStreetMap (c) osm-de, ©OpenStreetMap (d) osm-transport, ©OpenStreetMap (e) bw-mapnik, ©OpenStreetMap (f) mapquest, Tiles courtesy of MapQuest (g) mapquest-sat, Tiles courtesy of MapQuest (h) mapquest-labels, Tiles courtesy of MapQuest (i) landscape, ©OpenStreetMap (j) opentopomap, ©OpenStreetMap (k) maptoolkit, ©Toursprung GmbH (l) humanitarian, ©OpenStreetMap (m) cartodb, ©CartoDB (n) cartodb-dark, ©CartoDB (o) stamen-toner, ©Stamen Design (p) stamen-watercolor, ©Stamen Design Figure 9: openmap tile sets The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE col BrBG max 11 PiYG 11 PRGn 11 PuOr 11 RdBu 11 RdGy 11 RdYlBu 11 RdYlGn 11 Spectral 11 Accent 8 Dark2 8 Paired 12 Pastel1 9 Pastel2 8 Set1 9 Set2 8 Set3 12 Blues 9 BuGn 9 BuPu 9 GnBu 9 Greens 9 Greys 9 Oranges 9 OrRd 9 PuBu 9 PuBuGn 9 PuRd 9 Purples 9 RdPu 9 Reds 9 YlGn 9 YlGnBu 9 YlOrBr 9 YlOrRd 9 21 palette Table 2: Colour palettes from RColorBrewer, with col showing the character string to provide colourScale or brewer.pal and max giving the maximum number of colours for each palette. The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE 22 The fertility data are retrieved as a gzipped tab-separated text file, which the R.utils package is able to decompress. The code below will download a copy of this file from the author’s web server if the Eurostat download fails. fertUrl = paste("http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing", "?file=data/demo_r_frate2.tsv.gz", sep = "") fertFile = "euro.tsv" fertFileGz = paste(fertFile, ".gz", sep = "") if (!file.exists(fertFile)) { tryDownload = try(download.file(fertUrl, destfile = fertFileGz)) if (class(tryDownload) == "try-error") { tryDownload = try(download.file("http://pbrown.ca/mapmisc/euro.tsv.gz", destfile = fertFileGz)) } if (class(tryDownload) == "try-error") { warning("cant find the fertility file") } R.utils::gunzip(fertFileGz, overwrite = file.exists(fertFile)) } euroDat = read.table(fertFile, header = TRUE, stringsAsFactors = FALSE, sep = "\t") euroDat = euroDat[grep("^TOTAL", euroDat[, 1]), ] euroDat$timegeo = gsub("^TOTAL,", "", euroDat[, 1]) Merge the fertility and polygon data. Warning messages that not all rows of the fertility table can be matched to polygons can be ignored. euroF = sp::merge(euroNuts, euroDat, all.x = FALSE, by.x = "NUTS_ID", by.y = "timegeo") Exclude some of the outlying parts of the EU: euroF = euroF[grep("^FR9[1-4]$|^PT[23]0$|^ES70$", euroF$NUTS_ID, invert = TRUE), ] Land categories Download and read in the land raster file: landUrl = "http://www.worldgrids.org/lib/exe/fetch.php?media=glcesa1a.tif.gz" gzfile = gsub("^http[[:print:]]+=", "", landUrl) tiffile = gsub("\\.gz$", "", gzfile) if (!file.exists(tiffile)) { download.file(landUrl, destfile = gzfile) R.utils::gunzip(gzfile, overwrite = file.exists(tiffile)) } land = raster(tiffile) Download and read in the table describing land categories: levelsUrl = "http://www.worldgrids.org/lib/exe/fetch.php?media=glcesa.txt" levelsFile = gsub("^http[[:print:]]+=", "", levelsUrl) if (!file.exists(levelsFile)) { download.file(levelsUrl, levelsFile) } landTable = read.table(levelsFile, header = TRUE, sep = "\t", stringsAsFactors = FALSE) Create a shorter labels and a table which colourScale can accept: landTable$shortLabel = gsub( "^Closed to open |\\(([[:digit:]]|\\-|\\%|>|<|m)+\\)| (-|/) [[:print:]]+$", "", landTable$NAME) landTable$shortLabel = trim( gsub( '(.{1,25})(\\s|$)', '\\1\n ', landTable$shortLabel) ) The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859 C ONTRIBUTED RESEARCH ARTICLE 23 landTable = data.frame( ID= floor(landTable$MAXIMUM), label = landTable$shortLabel) Canada Transform the Great Circle and cities: gcircleT = mapply(spTransform, CRSobj = crsList, MoreArgs = list(x = gcircle)) citiesT = mapply(spTransform, CRSobj = crsList, MoreArgs = list(x = cities)) Obtain background maps: mapT = mapply(openmap, x = canT[c("Oblique Mercator", "Lambert")], MoreArgs = list(maxTiles = 12)) mapT[["Equidistant 2pt"]] = openmap(canT[["Equidistant 2pt"]], zoom = 2, extend = -5e+05) # add 1000km extra so the map goes as far as Greenland mapT[["Web projection"]] = openmap(canT[["Web projection"]], maxTiles = 18, extend = c(10^6, 0)) Create maps: for (D in names(canT)) { map.new(citiesT[[D]]) plot(mapT[[D]], add = TRUE) plot(canT[[D]], add = TRUE, border = col2html("black", 0.4)) points(gcircleT[[D]], cex = 0.1, col = "blue") points(citiesT[[D]], col = "red", pch = "+", cex = 1.5) text(citiesT[[D]], labels = citiesT[[D]]$name, col = "red", pos = citiesT[[D]]$pos, offset = 0.6, cex = 1.2) scaleBar(canT[[D]], "bottomleft", seg.len = 0, bty = "n") s1 = scaleBar(canT[[D]], "bottomright", seg.len = 0, bty = "n") scaleBar(canT[[D]], c(citiesT[[D]]@coords[citiesT[[D]]$name == "Flin Flon", 1], s1$centre[2]), seg.len = 0, bty = "n") scaleBar(canT[[D]], "topleft", seg.len = 1.3) } The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859