Creating an R Package

Introduction
Create package
Submitting to CRAN
Conclusion
M. Quartagno
Department of Medical Statistics
London School of Hygiene and Tropical Medicine
EMERGE Group meeting, 2015
Matteo Quartagno
Emerge
Terminology.
Package: an extension of the R base system with code, data and
documentation;
Source: the original version, with human-readable text and
code;
Binary: the compiled version, with computer-readable text and
code; it may work only on a specific platform;
Library: a directory containing installed packages;
Repository: a website providing packages for installation
(GitHub, CRAN...).
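To make the terms "library" and "repository" concrete, here is a minimal sketch using base R only. It shows where packages are installed on the current machine and which repository install.packages() would contact. The output depends on the local installation.

```r
# Library: directories where installed packages live on this machine.
lib_dirs <- .libPaths()
print(lib_dirs)

# Names of the packages installed in those libraries.
installed <- rownames(installed.packages())
print(head(installed))

# Repository: the URL(s) install.packages() would download from.
print(getOption("repos"))
```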
Choice of the Repository
Assume we already have the functions and data we want
to share;
Which repository should we use?
GitHub:
hosts various types of software, not R-specific;
good for sharing with specific people, colleagues...
less strict policies;
CRAN:
the "official" way to publish an R package;
makes the package available to everyone;
requires compliance with the CRAN policies.
Create package.
Suppose we want to create a package called LSHTM.
Two ways:
1. Use the package.skeleton() function:
Load all functions and data into a clean R session;
Run: package.skeleton("LSHTM");
Some files and folders are created automatically;
2. Create the source (folders and files) manually.
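The first route can be sketched as follows. The function bmi() and the dataset patients are hypothetical placeholders, not objects from the talk. They are kept in a separate environment and the skeleton is written to a temporary directory, so the sketch does not touch the workspace or the working directory.

```r
# Hypothetical objects to put in the package (not from the original slides).
pkg_env <- new.env()
pkg_env$bmi <- function(weight_kg, height_m) weight_kg / height_m^2
pkg_env$patients <- data.frame(weight_kg = c(70, 82),
                               height_m  = c(1.75, 1.80))

# Write the skeleton into a temporary directory.
out_dir <- tempfile("skeleton_")
dir.create(out_dir)
package.skeleton(name = "LSHTM",
                 list = c("bmi", "patients"),
                 environment = pkg_env,
                 path = out_dir)

# Inspect the files and folders that were created.
created <- list.files(file.path(out_dir, "LSHTM"), recursive = TRUE)
print(created)
```

When everything already sits in a clean session, the shorter call package.skeleton("LSHTM") from the slide does the same for all objects in the global environment.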
Source folders
data: contains all the datasets we want to include in the
package;
R: contains all R functions;
man: contains all help files for both datasets and R
functions (the manual);
src: contains C/C++/Fortran source code to be compiled. Optional;
inst: contains miscellaneous other files, e.g. the citation
format for the package.
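For the manual route, the folder layout above can be created by hand. The following is a small sketch under assumed names (bmi, patients and the temporary location are hypothetical). The src folder is left out because it is only needed when the package contains compiled code.

```r
# Hypothetical location for a hand-made package source tree.
pkg_dir <- file.path(tempfile("manual_"), "LSHTM")

# Source folders described above (src omitted: no compiled code here).
for (folder in c("R", "man", "data", "inst")) {
  dir.create(file.path(pkg_dir, folder), recursive = TRUE)
}

# One R function saved as source code in R/.
writeLines("bmi <- function(weight_kg, height_m) weight_kg / height_m^2",
           file.path(pkg_dir, "R", "bmi.R"))

# One dataset saved in data/.
patients <- data.frame(weight_kg = c(70, 82), height_m = c(1.75, 1.80))
save(patients, file = file.path(pkg_dir, "data", "patients.rda"))

print(list.files(pkg_dir, recursive = TRUE, include.dirs = TRUE))
```

A package built this way still needs the DESCRIPTION and NAMESPACE files and the help pages in man/ before it can be built and checked.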
man folder
Help files are in R documentation format (.Rd);
Latex-Style format:
name: the name of the function or dataset being documented;
title: a short title, should be only one line;
description: this could be one or two paragraphs;
arguments: a short description of all the inputs of a
function;
details: a longer description of the algorithm used in the
function;
value: the output(s) returned by the function;
references;
examples: a few examples. These should not run for more than
4 or 5 seconds;
package.skeleton() creates a skeleton file with all of these
fields for each function.
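As an illustration of the fields above, here is a sketch of a complete help page for a hypothetical function bmi(). It is written from R so that the example is self-contained, and parsed with tools::parse_Rd() to confirm that the syntax is valid. The \alias and \usage sections are not in the list above; they are included on the assumption that function help pages normally carry them.

```r
# Hypothetical help page; backslashes are doubled inside R strings.
rd_lines <- c(
  "\\name{bmi}",
  "\\alias{bmi}",
  "\\title{Body Mass Index}",
  "\\description{Computes the body mass index from weight and height.}",
  "\\usage{bmi(weight_kg, height_m)}",
  "\\arguments{",
  "  \\item{weight_kg}{numeric vector of weights in kilograms.}",
  "  \\item{height_m}{numeric vector of heights in metres.}",
  "}",
  "\\details{The index is weight divided by squared height.}",
  "\\value{A numeric vector of the same length as the inputs.}",
  "\\examples{",
  "bmi(70, 1.75)",
  "}"
)
rd_file <- file.path(tempdir(), "bmi.Rd")
writeLines(rd_lines, rd_file)

# Parse the file, then list the sections that were found.
rd <- tools::parse_Rd(rd_file)
sections <- vapply(rd, function(x) attr(x, "Rd_tag"), character(1))
print(setdiff(sections, "TEXT"))
```

In a real package the file would be saved as man/bmi.Rd, and the text written by writeLines() is what the file itself contains, with single backslashes.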
DESCRIPTION file
DESCRIPTION: a mandatory file with a brief description of the package;
Package: jomo
Type: Package
Title: Multilevel Joint Modelling Multiple Imputation
Version: 0.1-2
Date: 2014-01-15
Author: Matteo Quartagno, James Carpenter
Maintainer: Matteo Quartagno <[email protected]>
Description: Building on Schafer’s package pan and on the standalone program REALCOM,
    jomo is a package for multilevel joint modelling multiple imputation.
    Binary and categorical variables are handled through latent normal variables and
    algorithms for cluster-specific covariance matrices are introduced.
License: GPL-2
Suggests: BaBooN
Version: the first number marks major changes, the second
minor changes, the third bug fixes;
Maintainer: use a valid email address; this is the only place
where you write it;
Description: one or two paragraphs;
Depends: other R packages that are necessary for your
package to work;
Suggests: other R packages that are suggested but not strictly
necessary.
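The fields above can be written and read back with base R's write.dcf() and read.dcf(), which handle the "Field: value" format of the DESCRIPTION file. This is a sketch for the LSHTM example: the title, the description text and the maintainer address are placeholders, not real values.

```r
# Hypothetical metadata; the maintainer address is a placeholder.
desc <- data.frame(
  Package     = "LSHTM",
  Type        = "Package",
  Title       = "Example Package for the EMERGE Meeting",
  Version     = "0.1-0",
  Maintainer  = "A. Maintainer <maintainer@example.org>",
  Description = "Illustrates the mandatory fields of a DESCRIPTION file.",
  License     = "GPL-2",
  stringsAsFactors = FALSE
)
desc_file <- file.path(tempdir(), "DESCRIPTION")
write.dcf(desc, desc_file)

# Read the file back and look at a single field.
fields <- read.dcf(desc_file)
print(fields[1, "Version"])

# Version numbers compare component by component, not as text.
print(package_version("0.1-10") > package_version("0.1-2"))
```

The last line shows why the three-part scheme matters: as plain text "0.1-10" would sort before "0.1-2", but as a version it is the later release.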
NAMESPACE file
NAMESPACE: a mandatory file listing the objects you want to
import/export;
exportPattern(".")
useDynLib(jomo)
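The directive exportPattern(".") exports every object in the package. A more selective alternative, sketched here for a hypothetical function bmi(), names the exported and imported objects explicitly. The file is written to a temporary directory and read back with base R's parseNamespaceFile(), which only parses the directives and does not load anything.

```r
# Throwaway directory holding only a NAMESPACE file for a package "LSHTM".
lib_dir <- tempfile("nslib_")
dir.create(file.path(lib_dir, "LSHTM"), recursive = TRUE)
writeLines(c(
  "export(bmi)",            # export one hypothetical function by name
  "importFrom(stats, sd)"   # import a single function from another package
), file.path(lib_dir, "LSHTM", "NAMESPACE"))

# Parse the directives and show what would be exported and imported.
ns <- parseNamespaceFile("LSHTM", lib_dir)
print(ns$exports)
print(ns$imports)
```

Explicit exports keep internal helper functions out of the user's view, at the cost of updating the file whenever a new public function is added.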
R CMD build
When you have finished preparing the package, make sure you have
the latest version of R installed;
Download the latest version of R-devel as well;
Open a command prompt and run R CMD build LSHTM;
This will create a tarball file that you can install locally,
send to someone else or upload to GitHub;
If you want to submit to CRAN, you still need to check that
you meet all the guidelines.
R CMD check
Run R CMD check --as-cran LSHTM (ideally on the tarball produced by R CMD build);
This will carefully check the whole package.
Help files are created automatically;
Examples are run and the output is printed to a separate file;
A file with all Errors, Warnings and Notes is created;
If you want to submit to CRAN, not only all of the errors but
also all of the warnings and notes need to be fixed;
Remember to run it with the latest version of R;
When everything is fine, take your time to read the
whole CRAN Policies page again...
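The build and check steps can also be launched from within R. This is a sketch that assumes the package source sits in a folder named LSHTM in the working directory. It calls the R executable of the running session, so the same R version builds and checks the package, and it does nothing if the folder is missing.

```r
# R executable of the running session.
r_bin <- file.path(R.home("bin"), "R")

pkg_dir <- "LSHTM"   # package source directory, as in the slides
build_args <- c("CMD", "build", pkg_dir)

if (dir.exists(pkg_dir)) {
  # Build the source tarball (LSHTM_<version>.tar.gz) in the working directory.
  system2(r_bin, build_args)

  # Check the most recently built tarball with the CRAN settings.
  tarballs <- list.files(pattern = "^LSHTM_.*\\.tar\\.gz$")
  if (length(tarballs) > 0) {
    tarball <- tarballs[which.max(file.mtime(tarballs))]
    system2(r_bin, c("CMD", "check", "--as-cran", tarball))
  }
} else {
  message("No '", pkg_dir, "' directory here; nothing was built or checked.")
}
```

The check writes its results to a folder named LSHTM.Rcheck, which contains the log with all Errors, Warnings and Notes.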
Uploading the package
Upload the package on the CRAN repository;
The package is then checked by the CRAN maintainers.
Be advised that they receive tens of packages every day
and they work for free... They might not be in the best
mood when they reply...
>
> The maintainer confirms that he or she
> has read and agrees to the CRAN policies.
>
Please do also follow them.
Publication
You will have to resubmit your package through the same
procedure until everything is OK;
When your package is accepted, you will receive an email
confirming the publication of the source on CRAN;
Binary versions of the package are then published within a
couple of days.
Aftermath
Start advertising your package;
If there are bugs in the code, be ready to receive
thousands of emails...
In theory, you can submit a new version of the package
each month;
Before the package is established, which may take several
rounds, more frequent submissions are accepted;
When a new version of R is released, all packages are
tested;
The author is solely responsible for updating the package
if such updates cause trouble to his/her package.
R Journal
When your package is online and you are confident
enough, you can write a paper for the R Journal;
Impact factor 1, but continuously increasing in recent years;
Remember that without outreach activity, your package will
slowly die...
Example: the orcutt package for Cochrane-Orcutt estimation...
After 4 years, apparently no citations, almost never used...
Appendix
Acknowledgements
This work was supported by funding from the European Community’s
Seventh Framework Programme FP7/2011: Marie Curie Initial Training
Network MEDIASRES ("Novel Statistical Methodology for
Diagnostic/Prognostic and Therapeutic Studies and Systematic Reviews";
www.mediasres-itn.eu) with the Grant Agreement Number 290025.