Software Developer's Quarterly

Editor's Note
Recent Releases
Aerodynamic CFD Solver Using ParaView
ITK Bar Camp: Growing the ITK Community
High-Performance Computing Made Easy with MoleQueue
New Infrastructure for Easy Multi-threading in ITKv4
A VTK Selection Algorithm Based on Digital Topology
High Quality Software Practices Build an Extensible Medical Image Analysis Platform
Kitware News
The Kitware Source contains articles related to the development of Kitware products, software updates on recent
releases, Kitware news, and other content relevant to the
open-source community.
Issue 24 • Jan 2013
PARAVIEW 3.98
The ParaView team has released version 3.98, the final
major release in the 3 series. These notes give a summary of
the major changes in this release, which address more than
300 issues.
Exporting to PS/PDF: One of the most requested features on
ParaView User Voice [1] was support for vector graphics.
It is now possible to export scenes from 3D views and charts
as PostScript, EPS, PDF, or SVG vector graphics. All text and
annotations are exported as vector graphics, ensuring crisp
reproduction for publications. However, 3D surfaces and
volumes must remain embedded as a rasterized image in the
resulting output file.
Math-text in annotations: The second most requested feature
was the ability to add mathematical markup to annotation
text. ParaView can now use the equation rendering support
in the matplotlib package to generate mathematical equations; only categorical scalar bar annotations support this
feature at the moment, but future releases will extend this
capability to all text fields.
Transparent surfaces: Users also wanted the ability to specify
opacity mapping when coloring surfaces with scalars, which
is included in this release. The 'Color Scale Editor' dialog
adds the ability to specify opacity functions for surfaces.
Improved AMR support: AMR data structures were redesigned for improved performance and memory efficiency.
It is now possible to volume render AMR datasets. We
also added support for multi-resolution streaming of AMR
datasets for adaptive volume rendering.
In this issue, Giles Richardson describes a computational
fluid dynamics solver using ParaView; Luis Ibáñez, Matthew
McCormick, and Xiaoxiao Liu introduce ITK Bar Camp;
Marcus Hanwell, David Lonie, and Chris Harris describe HPC
computing using MoleQueue; Matthew McCormick, Brian
Avants, Michael Stauffer, Baohua Wu, Nicholas Tustison, and
Arnaud Gelas describe a new infrastructure for multithreading in ITKv4; Philippe Pébay describes a new VTK selection
algorithm based on digital topology; and Stephen Aylward,
Jean-Christophe Fillion-Robin, Julien Finet, and Zach
Galbreath share how high-quality software practices led to
the advanced Slicer image analysis platform.
The Source is part of a suite of products and services offered
to assist developers in getting the most out of Kitware’s open-source tools. Project websites include links to free resources
such as mailing lists, documentation, tutorials, FAQs, and
Wikis. For more information or to learn more about Kitware,
please visit our website at www.kitware.com.
Volume rendering of AMR datasets in ParaView 3.98.0
Redesigned 'Properties' panels: The method for specifying
filter properties and display parameters has been revamped.
Instead of using separate panels, the new combined
'Properties' panel shows a subset of commonly-used filter
properties by default; users can easily switch to an advanced
view. The updated panel also adds the ability to search for
properties by name.
Simplified 'Find Data': The 'Find Data' dialog was simplified
while keeping the ability to specify complex queries for advanced
users. Commonly used operations such as min, max, and
mean are now accessible from the simplified interface.
New slice-based views: Two new views have been added to
simplify data exploration using orthogonal slices. The Slice
view allows users to quickly create, delete, and move an
arbitrary slice along a given axis using an interactive user
interface. The Quad view enables users to explore a dataset
using three slices that are independently rendered in their
own 2D view while a 3D view lets the user see slices’ positions in 3D space.
Modularization of VTK/ParaView: VTK was restructured to
use a modularized approach for building various components, and ParaView's build infrastructure was revamped
to make use of this. Users may now build variants of the
ParaView library while choosing only modules of interest
from VTK. Thus, developers are able to build small, lightweight VTK and ParaView packages ideal for co-processing.
Improved co-processing: Catalyst [2] leverages the VTK and
ParaView modularization by letting users select only the
parts of VTK and ParaView that their simulation codes need
to link against for in-situ processing. Note that Catalyst
should be considered beta functionality. In addition, the
co-processing library now includes extra Python methods that
make the GUI-generated scripts cleaner.
Due to changes in ParaView, scripts created with ParaView
3.14.1 that generate screenshots will not work with 3.98.
Co-processing scripts generated by ParaView 3.14.1 that
only output extracts (i.e., data files) should still work with
ParaView 3.98.
This version of ParaView also includes an experimental
interface to GPGPU processing in Los Alamos National
Lab's PISTON [3] library. The interface takes the form of a
plugin (which must be compiled from source) that exposes
PISTON's on-GPU implementations of the slice, threshold, and
iso-contour algorithms while minimizing transfers between
the CPU and GPU.
Every release of ParaView includes excellent contributions
from the community. The 3.98.0 release includes a completely redesigned 'Memory Inspector' panel for keeping
track of memory usage across all ParaView processes, including remote processes (thanks to Burlen Loring at Lawrence
Berkeley National Laboratory). A new ParaView reader
plugin for LANL's GMV file format has also been added
(thanks to Sven Buijssen at TU Dortmund University); it
supports a wider set of keywords than the existing reader
from VisItBridge.
References
[1] http://paraview.uservoice.com
[2] http://catalyst.kitware.com
[3] http://viz.lanl.gov/projects/PISTON.html
ITK 4.3.0
The Insight Software Consortium is pleased to announce the
release of version 4.3.0, a major milestone that marks the
hard work of many community members.
This release includes the addition of experimental DICOM
image reading via the DCMTK library as a backend, in
addition to the GDCM library. By default, DCMTK ImageIO
support is not enabled. To try DCMTK ImageIO support, turn
the CMake option Module_ITKIODCMTK ON. For Unix platforms, the supporting DCMTK library will automatically be
built as a CMake ExternalProject. On Windows, the DCMTK
library must be built independently of the ITK build system.
Then, specify the location of the external build after setting
the CMake option ITK_USE_SYSTEM_DCMTK ON.
A number of registration-related feature enhancements
were added. The image registration methods have been
updated to accommodate multiple image metrics within a
single optimization scheme. This permits, for example, registration of a T1/T2 fixed image pair with a T1/T2 moving
image pair using a single metric for both T1 and T2 gradients, or even using two different metrics. New exponential
transforms are available along with automated B-Spline
transform scale estimation. A new physics-based non-rigid
registration class is also available.
ITK 4.3 also contains a number of important bug fixes,
including improvements to ObjectFactoryBase, Patch-Based
Denoising, support for a system libtiff, mesh processing, and
FFTW use. Support for instances of itk::VectorImage has been
added to many algorithms, and a single templated Adaptor
class can now be applied without modification to both
itk::VectorImage instances and itk::Image instances of type
itk::Vector, itk::RGBPixel, etc.
In addition, an extensive amount of code was removed and
coding style was made more consistent. Performance optimizations were achieved by removing GetInput/GetOutput
calls within filter inner loops, improving memory alignment,
optimizing memory access patterns, and parallelizing
through threading. It is also important to note that, as
previously scheduled, Visual Studio 2005 will no longer be
supported after this release.
CMAKE 2.8.10
CMake 2.8.10 was released in late October and brought
several notable changes, including support for the
latest versions of Visual Studio and Xcode, target property
improvements, and updates to generator expressions.
In this 2.8.10 release, users now have a new way to arrange
exported targets that depend on other targets into "export
sets." In addition, there are new target properties for
PDB_OUTPUT_DIRECTORY and PDB_NAME implemented for
Visual Studio 7 and later.
Generator expressions, which are used to introduce conditional statements at generate time rather than at CMake
configure time, are now available in more contexts, notably
in the INCLUDE_DIRECTORIES and COMPILE_DEFINITIONS
target properties. There are also new generator expressions
available in the 2.8.10 release.
The file(DOWNLOAD) command can now accommodate https
URLs; the pre-built binaries available from Kitware link to
OpenSSL to support this. The team has also added uniform
compiler "id" and version number variables, available for
nearly all known compilers and platforms.
AERODYNAMIC CFD SOLVER USING PARAVIEW
During the last year, a new computational fluid dynamics (CFD) package has been made available as freeware for
public use. The software, named "ufo-cfd," is available from
https://sites.google.com/site/ufocfdsolver, and is loosely
described as a 'freeware aerodynamics CFD solver' (but is not
open-source). The software uses (and relies upon) ParaView
for pre- and post-processing of the solver inputs and outputs.
The geometry (in STL format) is the main input to the CFD
solver, and the solver output is mainly a ParaView VTK file
and some convergence data. Like many, I first came to use
ParaView whilst using OpenFOAM, and after that came to
use ParaView for other tasks such as STL file viewing and
manipulation.
AIMS
The aims of this new CFD freeware were driven by problems
with some of the existing CFD methods, namely cost, excessive time spent meshing, over-complexity of solvers, and
accessibility (for a wide range of users). Having used snappyHexMesh, I wanted something that could also robustly
handle complex STL geometry, but in a much simpler framework. By simpler framework, I mean 1) get
geometry and solver, 2) edit input text file, 3) run solver,
and 4) visualise the solution using ParaView. When CFD is
made this easy to use, it becomes both more accessible
and more applicable. One of the driving forces behind the new
software is impatience, meaning that I should be
able to obtain a geometry, set up the test case, and run the
solver within an hour or two; this holds whatever
the geometry (whether it be a simple cylinder/sphere or an
F1 car). Obviously the solver time depends on other factors
such as mesh size and available brute force (number of CPUs
and amount of RAM). That said, for simple test cases it is feasible to
complete the CFD cycle (geometry to flow solution) within
an hour, once familiar with the format of the simple input
text file.
RAM / 32/64BIT / PARALLEL ISSUES / PLATFORM
Another of the driving forces was to reduce the RAM dependency of the solver. The solver is streamlined in terms of
the arrays which are stored; no additional arrays are stored
unless they have to be, and all arrays are allocated.
This allows the user to specify small arrays for small problems, and vice versa. I initially found that the limit of the
32-bit solver was about 10 million cells. Once the 64-bit
executable was available, the maximum mesh size depended on how
much RAM was available on the machine/cluster. I
then ran a 20-million-cell mesh of an F1 car and found
that this used only 3GB of RAM. So the limiting factor
becomes the number of processors that you have available.
The version of gfortran that I eventually used was from
http://www.equation.com, which fortunately allowed
64-bit compilation using the OpenMP flag. This was a leap
forward for the CFD software since (with minor modifications to the source code) the 64-bit parallel solver became
easily achievable. The F1 car demo that I started ran at
approx. 3 iterations per minute on 8 processors, which means
a weekend is required for 10,000 iterations of a 20 million
cell mesh. But that's not at all bad for freeware on a laptop
at home. Now the software compiles in a 32-bit Linux mode
also, although that version is largely untested.
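The "weekend" figure follows directly from the quoted iteration rate. A minimal sketch of that arithmetic, assuming the rate of 3 iterations per minute stays constant for the whole run (the function name is illustrative and not part of the solver):

```python
def estimated_runtime_hours(iterations: int, iterations_per_minute: float) -> float:
    """Return the wall-clock hours needed at a steady iteration rate."""
    return iterations / iterations_per_minute / 60.0


# Figures quoted in the article: 10,000 iterations at about 3 iterations per minute.
hours = estimated_runtime_hours(10_000, 3.0)
print(f"{hours:.1f} hours, about {hours / 24:.1f} days")
```

At roughly 56 hours, the run fits between a Friday evening and a Monday morning.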
METHOD DEVELOPMENT
The solver code was developed over about a year, using
gfortran throughout. The code was written 'back-to-front'
in some ways. The first part of the code to be completed
was the writing of the VTK solution; a simple 100x100x100
(1 million) cell mesh was written out and then read into
ParaView, and that was the starting point. The next step
of reading in an STL file was straightforward, and then I
used the STL data to mark the grid with 'cut-cells' where
the STL triangles intersect the mesh cells. The next part was
to write the fill algorithm to mark the inside cells as 'dead,'
the outer ones as 'live,' and the cut-cells as 'cut'. Now the
fill routine marks millions of cells in a few seconds, and that is
the only computational effort required for meshing. I then
had the foundation on which to write a solver which loops
through the arrays in conventional i,j,k loops, but skipping
the dead cells. The solver was then written based on my PhD
notes (from 1995) on the Navier Stokes equations, but using
zero velocity at the walls (cut-cells). This highlighted issues
of solution stability, and some smoothing parameters were
introduced to smooth the solution. The other tricky part of
the development was to apply a non-zero tangential velocity
at the wall, which then allows the solver to apply an Euler or
wall-function type approach at the wall. All smoothing and
wall treatment parameters are adjusted via the input data
file so to give the user maximum control over the method.
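The fill step described above can be pictured as a flood fill on the Cartesian grid. The sketch below is an illustration only, not the solver's Fortran code: it assumes the cut-cells are already known, that the seed cell at one corner of the grid lies outside the geometry, and that cells are connected through their six faces. The names `DEAD`, `LIVE`, `CUT`, and `fill_cells` are hypothetical.

```python
from collections import Counter, deque

DEAD, LIVE, CUT = 0, 1, 2  # hypothetical flag values; the solver's own codes are not given


def fill_cells(dims, cut_cells, seed=(0, 0, 0)):
    """Flag every cell as CUT, LIVE (reachable from the seed) or DEAD (enclosed)."""
    nx, ny, nz = dims
    # Start by assuming every cell is dead, then stamp in the cut-cells.
    state = {(i, j, k): DEAD for i in range(nx) for j in range(ny) for k in range(nz)}
    for cell in cut_cells:
        state[cell] = CUT
    if state.get(seed) != DEAD:
        raise ValueError("seed must be a non-cut cell inside the grid")
    # Breadth-first flood fill from the seed; cut-cells block the fill.
    state[seed] = LIVE
    queue = deque([seed])
    while queue:
        i, j, k = queue.popleft()
        for di, dj, dk in ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)):
            neighbour = (i + di, j + dj, k + dk)
            if state.get(neighbour) == DEAD:  # out-of-grid neighbours return None
                state[neighbour] = LIVE
                queue.append(neighbour)
    return state


# A 5x5x5 grid whose cut-cells form a closed shell around the single centre cell.
shell = {(i, j, k) for i in range(1, 4) for j in range(1, 4) for k in range(1, 4)} - {(2, 2, 2)}
state = fill_cells((5, 5, 5), shell)
counts = Counter(state.values())
print(counts[LIVE], "live,", counts[CUT], "cut,", counts[DEAD], "dead")

# A solver sweep would then visit only the live cells.
live_cells = [cell for cell, flag in state.items() if flag == LIVE]
```

If the shell of cut-cells had a gap, the fill would leak inside and no dead region would remain, which is the condition the article checks for visually in ParaView.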
ParaView is an integral part of the CFD solver. The only
part of the process which doesn't use ParaView is
the solver execution, where Gnuplot is used to monitor the
solver convergence.
Geometry: Having obtained an STL file of interest, the user
reads this into ParaView and changes the scale / orientation
of the geometry to suit the solver defaults (xmin: inlet, xmax:
outlet etc). After this, the STL file is saved in ParaView (after
surface extraction) in ASCII format. At this stage, ParaView
is also used to note down the STL geometry limits in x/y/z
directions.
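The geometry limits can also be read straight from the ASCII STL file. The following is a minimal sketch, assuming a well-formed ASCII STL in which every vertex appears on a line of the form `vertex x y z`; the function name `stl_bounds` and the sample triangle are illustrative only.

```python
def stl_bounds(stl_text):
    """Return ((xmin, ymin, zmin), (xmax, ymax, zmax)) for an ASCII STL string."""
    mins = [float("inf")] * 3
    maxs = [float("-inf")] * 3
    found = False
    for line in stl_text.splitlines():
        parts = line.split()
        # Vertex lines look like: "vertex 1.0 2.0 3.0"
        if len(parts) == 4 and parts[0] == "vertex":
            found = True
            for axis, value in enumerate(map(float, parts[1:])):
                mins[axis] = min(mins[axis], value)
                maxs[axis] = max(maxs[axis], value)
    if not found:
        raise ValueError("no vertex lines found; is this an ASCII STL file?")
    return tuple(mins), tuple(maxs)


sample = """solid demo
  facet normal 0 0 1
    outer loop
      vertex 0 0 0
      vertex 1 0 0
      vertex 0 2 0.5
    endloop
  endfacet
endsolid demo
"""
lower, upper = stl_bounds(sample)
print("min:", lower, "max:", upper)
```

These limits are what the user needs when placing the mesh around the geometry in the input file.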
Meshing: The mesh setup requires minimal effort, but
ParaView is essential because it allows users to load in and
visualise both the STL file and initial guessed mesh at the
same time. The user may then go through several iterations
of adjusting the mesh size and location relative to the STL
geometry. Then, ParaView is used to check that there is a
'dead' region inside the geometry, meaning that the cut-cells form a continuous 'air-tight' surface. This is done by
viewing the cell_lcd property within the VTK file.
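For readers unfamiliar with VTK files, the sketch below shows one way a per-cell flag array such as cell_lcd could be stored so that ParaView can display it. It is an assumption-laden illustration: the article does not say which VTK flavour the solver writes, so the legacy ASCII STRUCTURED_POINTS layout is used here, and the flag values and the function name are hypothetical.

```python
def vtk_structured_points(dims, spacing, cell_values, name="cell_lcd"):
    """Return legacy ASCII VTK text for a uniform grid with one integer per cell.

    dims is the number of cells along x, y, z; cell_values must be ordered
    with the x index varying fastest, then y, then z.
    """
    nx, ny, nz = dims
    if len(cell_values) != nx * ny * nz:
        raise ValueError("expected one value per cell")
    lines = [
        "# vtk DataFile Version 3.0",
        "cell flags on a uniform Cartesian mesh",
        "ASCII",
        "DATASET STRUCTURED_POINTS",
        # DIMENSIONS counts points, which is one more than the cell count per axis.
        f"DIMENSIONS {nx + 1} {ny + 1} {nz + 1}",
        "ORIGIN 0 0 0",
        f"SPACING {spacing[0]} {spacing[1]} {spacing[2]}",
        f"CELL_DATA {nx * ny * nz}",
        f"SCALARS {name} int 1",
        "LOOKUP_TABLE default",
    ]
    lines.extend(str(value) for value in cell_values)
    return "\n".join(lines) + "\n"


# A 2x2x2 mesh of 1 cm cells; the flag values (0, 1, 2) are placeholders.
flags = [1, 1, 2, 2, 1, 2, 0, 2]
text = vtk_structured_points((2, 2, 2), (0.01, 0.01, 0.01), flags)
print(text)
```

Writing `text` to a file with a `.vtk` extension gives a dataset that can be opened in ParaView and coloured by the cell array.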
Visualisation: Once the flow solution is obtained, the VTK
file is loaded into ParaView for post-processing. The cell-to-point-data filter is used. A variety of ParaView methods are
typically used to view the flow-field, including contour plots,
velocity vector plots, and streamlines. The ParaView software
allows for fast manipulation of solutions containing millions
of cells, thus enabling fast flow visualisation.
Shrink-wrap: The ParaView method can also be used to
obtain a shrink-wrapped representation of the original STL
geometry. Once the VTK file is loaded from the solver, you
can create a contour surface of the 'cut-cells,' which looks
like a brick-work representation of the real geometry. Then,
by applying some smoothing to this surface, you get something similar to the original STL geometry, which can be used
to display a surface property such as pressure or velocity. This
is very useful for the CFD solver since it allows you to display
the flow solution on the body surface.
APPLICATIONS
These days a large collection of geometry is available
to anyone with internet access. For example, 'Sketchup
Warehouse'
(http://sketchup.google.com/3dwarehouse)
allows users to select from a massive collection of vehicles
(planes / helicopters / cars). The geometry may not be clean
or 'watertight,' but it does not need to be for this solver. It
is preferable to have a dead region inside your geometry,
but in some cases (with no dead region) the interior region
of the model is also live, and the solver runs just as well, if a
little slower. The CFD solver can be applied to a wide variety
of applications (aerospace, automotive, etc.), but admittedly
there are some limitations to the approach: 1) It is a compressible solver, which strictly means that it is not applicable to low
speed flows; 2) The Cartesian mesh approach means that the
mesh size is uniform within the geometry region (expanding
away from geometry). This means that the method is best
suited to external flows around cars or aircraft where the
entire body can be captured by a uniform mesh size. For
example, for the F1 car, the entire geometry was modelled
using 1cm cuboid cells; but for other 'long' geometries it
would be advisable to use long cells such as 2x1x1 or 3x1x1
cells, which is a simple modification of the input file.
SUMMARY
This CFD solver has been developed as a personal interest
and is not associated with any current or previous employers. It is hoped that the software will see significant
use in academia (for under-grad or post-grad coursework)
as well as for commercial or industrial applications. The
direction of the software development is uncertain for
now, and will mainly depend on any future interest or
enquiries which are received via the website contact email:
[email protected]
ITK BAR CAMP: GROWING THE ITK COMMUNITY
The Linux Foundation released its annual report on “Who
Writes Linux” in April of this year. It reveals interesting facts
about how such a large-scale open-source community operates.
Some of the Linux Kernel facts that jump out of the report
include:
• Releases are done every 80 days on average.
• Every release includes an average of 10,000 patches; that’s
an average of 5.6 patches per hour.
• About 1,200 developers contribute to any given release.
• The number of lines of code increases by about 1.1 million per
year, which averages out to 136 lines of code per hour.
• There have been about 10,000 developers involved over the
full 20-year history of the project.
• The most active contributor made only 1.2% of the changes,
and the 20th most active contributor made 0.6% of the
changes; this shows how flat the distribution of contributions is.
• Out of all the changes made, 50% were distributed
across the top four organizations.
• The largest number of contributions came from non-affiliated
developers (18%).
• Very experienced developers are focused on merging changes,
such as Greg Kroah-Hartman, who has signed off on 5.8% of line
changes, and Linus Torvalds, who has merged 2.4% of line
changes (being the 4th-ranked signer).
What all this illustrates is the behavior of a fully grown and
properly provisioned open source project. In particular:
• A large number of contributors.
• A high ratio of contributors to number of lines of code.
• A rather flat distribution of labor.
• No large dependency on any given developer.
• All of them are loved and appreciated, but none of them
is indispensable.
The Linux Kernel is not the only project to have flourished to
that level. Similar scale projects include, for example:
Project      Number of Commits    Number of Developers
Gnome        51,813               1,053
Chromium     47,607               911
KDE          62,323               776
WebKit       31,873               304
ACKNOWLEDGEMENTS
http://www.ParaView.org (ParaView)
http://sketchup.google.com/3dwarehouse/ (STL geometry)
http://www.equation.com (gfortran)
http://www.gnuplot.info (Gnuplot)
Giles Richardson is a graduate of Leicester
University (BEng Hons) and Cranfield
University College of Aeronautics (PhD). He
has worked at UK aerospace companies
including Westland Helicopters Ltd (Yeovil,
UK) and Rolls-Royce (Bristol and Derby, UK).
He is currently employed as a Cooling
Systems Analyst at Perkins Engines Ltd,
(Peterborough, UK) which is part of Caterpillar Inc.
When looking back at the Linux Kernel, a rule of thumb that
emerges is that it has about:
1 developer per 1,000 lines of code
These are, of course, not full-time developers on average,
but rather those whose commitments follow the typical
power-law distribution of communities where participation
is open and volume is only regulated by level of interest and
the availability of contributor time. Such a distribution is
explained by Yochai Benkler in his article “Coase’s Penguin”.
With this context in mind, we have looked back at the Insight
Toolkit (ITK) project and realized that the community is vastly
underpowered. ITK comprises about 1.1 million lines of
code (LOC), of which 655 KLOC come from third-party
libraries such as PNG, TIFF, JPEG, GDCM, and HDF5. This
leaves us with 468 KLOC of native ITK code. If we apply
the rule of thumb on the ratio of the number of developers
compared to number of LOC from the Linux Kernel to ITK,
we find that our community should have about 468 active
contributors. The statistics of the Git repository, however,
reveal that in the full history of the project, we have 162
contributors; out of that number, only 74 have contributed
during the two years of the ITKv4 refactoring effort.
From this realization, we have launched a new initiative to
grow the ITK community to a size that matches the complexity of the toolkit. Our estimate above indicates that we
should grow to about 500 developers, following, as always, the
power-law distribution in which 20% of developers do 80% of
the work, and a long tail of many developers
takes care of the remaining 20% of the work.
To get there, we are pursuing two major initiatives: Intensive
Training, and Engagement and Retention.
The natural place to start the training and recruiting process
is the large community of what we used to call “users,”
but that now we more respectfully refer to as “community
members.”
Based on the 2,200 subscribers to the mailing list and the
3,200 monthly downloads of ITK release packages from
SourceForge, we estimate that about 5,000 people are using
ITK worldwide. We cannot establish an exact number of ITK
adopters because we adhere to the practice of allowing the free flow of software downloads without requiring
registration. In other words, we refrain from tracking
downloads and from getting in your way when you are
downloading the toolkit.
From these numbers, our mission is to engage 10% of these
community members, and to bring them to become active
participants in the development and maintenance of the
toolkit. That 10% will correspond to the 500 maintainers
who can take care of all the needs of the toolkit, from bug
reporting, bug fixing, documentation, training, and support
on the mailing list to the development of new features and
improvements.
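The sizing argument above is simple arithmetic and can be reproduced in a few lines. This is only a sketch of the article's own rule of thumb; all inputs are the figures quoted in the text, and the variable names are illustrative.

```python
# Figures quoted in the article.
native_kloc = 468            # native ITK code, in thousands of lines
lines_per_developer = 1_000  # rule of thumb taken from the Linux Kernel
contributors_all_time = 162  # contributors in the full Git history
contributors_itkv4 = 74      # contributors during the two-year ITKv4 effort
estimated_users = 5_000      # estimated ITK users worldwide

# One developer per 1,000 lines of native code.
target_contributors = native_kloc * 1_000 // lines_per_developer
# Engaging 10% of the estimated user community.
engaged_members = estimated_users * 10 // 100
# Gap between the rule-of-thumb target and recent activity.
shortfall = target_contributors - contributors_itkv4

print("rule-of-thumb target:", target_contributors)
print("10% of community members:", engaged_members)
print("shortfall versus ITKv4 contributors:", shortfall)
```

The two estimates (468 from code size, 500 from the user base) land close together, which is why the article rounds the goal to 500.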
To provide the grounds for training we’ve launched the
ITK BarCamp initiative. A BarCamp is an open space for
collaboration, with particular emphasis on education and
improvement of skills.
Being born in the Information Age, the ITK BarCamp is taking
advantage of the most popular online sites to facilitate
outreach to ITK community members, wherever
they are.
The ITK BarCamp has a Google+ page [1] and regular hangouts are organized to bring community members to work
together on training and development activities. These
hangouts are publicly open, and are recorded for future
viewing by those who may have missed the occasion. The
ITK BarCamp is also one of the first organizations to have a
G+ Community page [2].
ITK BarCamp also maintains a YouTube channel [3] where we
post short video tutorials and the recordings of hangout
activities. The collection of short video tutorials follows
the approach of Khan Academy: building up a body of
five- to fifteen-minute videos that can be watched in any order.
They cover a large variety of topics related to the software
development skills needed to become a master ITK contributor.
These short videos are accompanied by text instructions, and
are linked from the documentation page [4], which is generated from RST files that are processed by Sphinx to generate
the final HTML pages. The sources of these RST files and
associated Sphinx configuration are publicly available in the
GitHub repository [5].
Please join us in this initiative to grow the ITK community
in the number of contributors, level of programming skills,
and spirit of collaboration. Your suggestions are greatly
appreciated.
REFERENCES
[1] https://plus.google.com/u/0/106512397331641956186/posts
[2] https://plus.google.com/u/0/communities/111375098792764998322
[3] http://www.youtube.com/user/ITKBarCamp
[4] http://insightsoftwareconsortium.github.com/ITKBarCamp-doc/host.html
[5] https://github.com/InsightSoftwareConsortium/ITKBarCamp
Luis Ibáñez is a Technical Leader at Kitware,
Inc. He is one of the main developers of the
Insight Toolkit (ITK). Luis is a strong supporter of Open Access publishing and the
verification of reproducibility in scientific
publications.
Matt McCormick is a medical imaging
researcher working at Kitware, Inc. His
research interests include medical image
registration and ultrasound imaging. Matt is
an active member of scientific open source
software efforts such as the Insight Toolkit,
TubeTK, and scientific Python communities.
Xiaoxiao Liu is an R&D Engineer at Kitware.
Her research interests are in medical image
analysis and applications, including statistical shape analysis for anatomical structures,
deformable shape modeling and segmentation, diffeomorphic image registration
techniques and image-guided radiotherapy.
HIGH-PERFORMANCE COMPUTING MADE EASY WITH MOLEQUEUE
One of the goals of the Open Chemistry project [1,2] at
Kitware is to provide a simple, easy-to-use interface for
submitting chemical simulations to be executed on high-performance computing (HPC) resources. To this end, we
have developed MoleQueue -- a system-tray resident server
application that uses standard inter-process communication channels to interact with programs that generate and
analyze simulation data. It enables applications such as the
Avogadro molecular editor to easily interact with remote
computing resources to perform calculations that are not
feasible on a typical workstation. MoleQueue’s functionality
is not limited to chemical simulations; it provides a generic
interface to a variety of HPC resources suitable for use by any
number of application domains. MoleQueue’s functionality
is currently being applied to diverse fields at Kitware, such
as large scale climate[3] and fluid dynamics[4] simulations.
In order to appreciate how an application integrated
with MoleQueue can simplify an HPC user’s workflow, the
figure below shows the typical workflow when using an HPC
resource. Here the simulation component leads to job submissions, with a possible informatics/data storage step, and
then results are generated on the nodes of the resource and
brought back into a package for further analysis.
[Figure: typical HPC workflow. Labels: Simulation, Input File, Job Submission, HPC integration (Local, Cloud, Supercomputer), Log File, Results, Informatics.]
TRADITIONAL HPC WORKFLOW
Initially, the researcher generates program-specific input files
on their local machine, either using a dedicated application
or by formatting and configuring the files manually with a
text editor. The input files are then copied to the remote
HPC cluster using a file transfer protocol, such as SFTP or
SCP. Next, the user runs an SSH client to log into the remote
system and obtain a command prompt, which is typically a
shell on the head node of a Linux cluster. The user navigates
to the simulation’s working directory using shell commands,
and creates a submission script to configure the execution
environment for the calculation. This submission script is
sent to the cluster’s job scheduler, and the calculation begins
as soon as the requested resources become available.
If one wants to monitor the job then it is necessary to log
back into the cluster, execute commands to obtain a listing
of all jobs in the scheduler, and locate their job in the output;
or to configure the scheduler to email status updates to the
submitter. Once a job has completed, the output files would
normally be manually copied back to the local workstation
using SFTP/SCP for further analysis. There is also the added
complication of figuring out how to accomplish these tasks
locally for small test runs, or on cloud resources for larger,
on-demand submissions.
This workflow can be quite daunting for new researchers
due to the steep learning curve associated with many of the
technologies. A user must learn how to use and configure
SSH and SFTP clients, interact with the cluster’s scheduler, set
up the execution environment for each queue and program,
and navigate and manipulate files from a command prompt.
As we’ll see in the next section, MoleQueue automates most
of this process, allowing the user to concentrate on their
domain-specific research, rather than worrying about the
details of job scheduling.
THE MOLEQUEUE WORKFLOW
Using MoleQueue to perform a simulation consists of two
stages: a one-time set up of the queues and programs they
can access, and the submission of specific calculations.
ONE-TIME SET UP
Import Preset Configurations
The one-time set up of MoleQueue consists of configuring
cluster login details, scheduler interactions, and program
execution environments. Fortunately for non-technical users,
MoleQueue provides a method for importing preset configurations. This feature enables site maintainers and research
groups to provide users with an appropriate configuration
file. In this case, setup will consist of simply importing the
file through the MoleQueue user interface.
User-specific settings such as login names and working directories would still need to be set, but the bulk of the technical
details concerning scheduler interaction and program execution will be configured by the importer.
Manual Configuration
More advanced users (or those with less generous system
administrators) can configure resources themselves using the
MoleQueue application as detailed in the following sections.
Adding a Local Queue
A local queue for performing calculations on a user’s workstation can be created by opening the Queue Manager in
MoleQueue, clicking “Add,” and selecting the “Local”
queue type.
Configuring a local queue is simple -- all it needs to know
is the number of processor cores the user wishes to use
for calculations. MoleQueue will automatically detect the
number of available cores and use this as the default value.
Adding Programs to a Queue
Program execution environments are fully configurable.
Several presets for common execution syntaxes are available
for simple programs, or the entire batch script template can
be customized for more complex simulations. This allows
programs to make use of advanced resources such as a
specific MPI implementation for multi-node parallelism, configuration of environment variables, etc.
Adding a Remote Queue
Queues on remote HPC clusters are added by selecting the
type of scheduler running on the cluster. The Portable Batch
System (PBS) and Sun Grid Engine (SGE) schedulers are currently supported, along with their descendants (e.g., Torque
(PBS-like) and OpenGrid (SGE-like)). The setup for each of
these is similar, so we’ll use the PBS/Torque configuration as
an example.
The connection to the remote host is configured by setting
the hostname or IP address of the cluster’s head node
(somehost.facility.edu in this example) and the
username that will be used during login (“user” in the
above example). The “Test Connection” button will attempt
an SSH login to the configured host, allowing for connection
troubleshooting if necessary.
Submitted jobs will be copied to and submitted from the
“Remote Working Directory” (/work/user above). “Submit
Test Job” can be used to send a trivial job to the configured
queue, enabling users to test their configuration.
JOB SUBMISSION WITH A MOLEQUEUE CLIENT
Once the initial setup of queues and programs is finished,
performing a simulation using software that is integrated
with MoleQueue is simple: prepare the simulation, select
the target queue and program, and click submit in the client
application. MoleQueue takes over at this point by copying
the input files to the server, scheduling the job, and monitoring the remote queue until the job is complete. When the
calculation finishes, MoleQueue will copy the output files
back from server and notify the client that the job is complete. The user doesn’t need to use SSH directly, learn shell
commands, or interact with the scheduler.
The remote queue’s configuration is initially set to reasonable default values. The status of running and queued jobs
will be queried every three minutes; the standard qsub,
qdel, and qstat commands will be used to interact with the
scheduler, and the batch script will be written to job.pbs. A
fully customizable batch script template is provided, using
keywords such as $$numberOfCores$$ and $$maxWallTime$$ which will be replaced with job-specific options, and
the $$programExecution$$ keyword which is replaced by
program-specific execution details.
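When MoleQueue schedules a remote job, it fills in the queue's batch script template by replacing keywords such as $$numberOfCores$$, $$maxWallTime$$, and $$programExecution$$. The substitution step can be sketched as follows. This is not MoleQueue's own code: the function fill_template, the contents of the example template, the wall time format, and the rungms command line are all assumptions made for illustration.

```python
import re

# Illustrative PBS/Torque-style template; only the $$...$$ keywords are
# taken from the article, the directives are assumed.
TEMPLATE = """#!/bin/sh
#PBS -l procs=$$numberOfCores$$
#PBS -l walltime=$$maxWallTime$$
cd $PBS_O_WORKDIR
$$programExecution$$
"""


def fill_template(template, values):
    """Replace each $$keyword$$ with its value; unknown keywords are kept."""
    def substitute(match):
        keyword = match.group(1)
        if keyword in values:
            return str(values[keyword])
        return match.group(0)  # leave unrecognized keywords untouched
    return re.sub(r"\$\$(\w+)\$\$", substitute, template)


script = fill_template(TEMPLATE, {
    "numberOfCores": 8,
    "maxWallTime": "01:00:00",                # assumed format
    "programExecution": "rungms job.inp",     # hypothetical command line
})
print(script)
```

Single-dollar shell variables such as $PBS_O_WORKDIR are left alone because only text enclosed in double dollar signs is treated as a keyword.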
If the calculation finishes before the client software has
been closed, the output file can be opened and analyzed, or
used as a starting point for a new calculation. This approach
enables client software to implement fast, efficient workflows for performing complex simulations. Alternatively,
more complex calculations that require weeks or months to
complete can be started from the client software and will
continue to be monitored by the MoleQueue server application until complete. The application maintains state between
sessions, so stopping and restarting the server program will
not affect job monitoring. When a job completes, the output
can be opened in an appropriate application directly from
the job listing in MoleQueue.
IMPLEMENTATION
MoleQueue is an open-source, cross-platform C++ Qt application developed to provide an abstraction to local and
remote computational resources. It consists of two primary
components: a system-tray resident application that acts as
a job dispatch server, and a small client library that provides
an interface to the remote procedure call (RPC) API, which
interacts with the server component. In addition to handling client requests, the server manages a local job queue
where calculations can be scheduled for execution on the
local workstation, and also directs communication and data
exchange with remote HPC resources. Two client libraries are
provided with MoleQueue: a C++ library extending Qt, and
a Python module.
MOLEQUEUE CLIENT-SERVER COMMUNICATION
The messages transmitted between the client and server
are formatted using JavaScript Object Notation (JSON) and
adhere to the JSON-RPC 2.0 specification. The JSON format
was chosen due to the vast array of implementations in virtually every programming language; JSON-RPC 2.0 builds upon
the JSON data format in order to provide a cross-platform,
device-independent RPC API that can easily be implemented
in any language desired. Exchange of messages occurs
over standard inter-process communication channels: local
sockets on Unix-like platforms and named pipes on Windows.
The ZeroMQ message passing interface
is optionally supported (where available).
The JSON-RPC API is documented online[5], and the
exchange below illustrates the simplicity of the format. A
client may query the available programs and queues on the
server by sending a message such as:
{
  "jsonrpc" : "2.0",
  "method" : "listQueues",
  "id" : 42
}
The server’s response to this request will contain a key-value
map, with the names of the available queues (Gold, Tritium,
and Local in this case) as the keys, and lists of available programs as the values:
{
  "jsonrpc" : "2.0",
  "result" : {
    "Gold" :
      [ "GAMESS", "MOPAC", "Gaussian", "NWChem" ],
    "Tritium" :
      [ "GAMESS", "MOPAC", "Gaussian", "NWChem" ],
    "Local" :
      [ "GAMESS", "MOPAC", "Gaussian", "NWChem" ]
  },
  "id" : 42
}
A slightly more complex RPC call to submit a job using
MoleQueue would look as follows:
{
  "jsonrpc" : "2.0",
  "method" : "submitJob",
  "params" : {
    "queue" : "Tritium",
    "program" : "GAMESS",
    "description" : "B3LYP H2O optimization",
    "inputFile" : {
      "filename" : "job.inp",
      "contents" : "Full contents of input file.\nWill be created in the working tree."
    }
  },
  "id" : 23
}
This submits a job to the remote queue named Tritium, with
the program named GAMESS. The description is the string
that will show up in the MoleQueue user interface, and the
input file is specified by either a full path to an existing file
or filename and content strings. The response for a successful submission looks something like the following:
{
  "jsonrpc": "2.0",
  "result": {
    "moleQueueId": 17,
    "workingDirectory":
      "/home/user/.molequeue/jobs/17/"
  },
  "id": 23
}
This response object gives a long-lived identifier for the job,
moleQueueId, and the working directory where all of the
files will be staged. Once a job is submitted, notifications
are sent when the job state changes; for example, from
submitted to running, error, completed, etc. Each of the
notifications carries the moleQueueId of the job and the
previous and current states. It is then possible for the client
to act upon these changes to, for example, open output files
once the job has finished. There are also RPC methods to
query job status, or to cancel an already submitted job.
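For a client written without either of the provided libraries, the message handling can be sketched with nothing more than a JSON module. The sketch below builds a submitJob request, reads the result of a reply, and tells notifications apart from replies (under JSON-RPC 2.0, a notification carries a method but no id). The helper names are hypothetical, the transport (local socket or named pipe) is left out, and the notification method name jobStateChanged and its oldState/newState parameters are assumptions for illustration; the online specification[5] is the authority on the actual names.

```python
import json


def make_request(method, params=None, request_id=0):
    """Serialize a JSON-RPC 2.0 request."""
    message = {"jsonrpc": "2.0", "method": method, "id": request_id}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def read_result(text, request_id):
    """Return the result of the reply matching request_id, or raise."""
    reply = json.loads(text)
    if reply.get("jsonrpc") != "2.0" or reply.get("id") != request_id:
        raise ValueError("reply does not match request %r" % request_id)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]


def is_notification(message):
    """JSON-RPC 2.0 notifications have a method but no id."""
    return "method" in message and "id" not in message


submit = make_request("submitJob", {
    "queue": "Tritium",
    "program": "GAMESS",
    "description": "B3LYP H2O optimization",
    "inputFile": {"filename": "job.inp",
                  "contents": "Full contents of input file.\n"},
}, request_id=23)

# Reply text as shown in the article.
reply_text = ('{"jsonrpc": "2.0", "result": {"moleQueueId": 17, '
              '"workingDirectory": "/home/user/.molequeue/jobs/17/"}, '
              '"id": 23}')
result = read_result(reply_text, 23)

# Hypothetical state-change notification (names are assumed).
notification = {"jsonrpc": "2.0", "method": "jobStateChanged",
                "params": {"moleQueueId": 17,
                           "oldState": "submitted",
                           "newState": "running"}}
if is_notification(notification):
    params = notification["params"]
    print("Job %d: %s -> %s" % (params["moleQueueId"],
                                params["oldState"], params["newState"]))
```

Matching replies to requests by id is what allows a client to keep several requests in flight at once.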
SUBMITTING JOBS USING THE C++ CLIENT
When considering adding MoleQueue functionality to an
existing Qt-based C++ application, the C++ MoleQueue
client library is a great choice. It takes care of generating
the JSON-RPC 2.0 calls for you, and will emit signals when
responses or notifications are received from the server.
The following code shows the basic process of submitting a
job from a C++ application.
#include <molequeue/client/client.h>
#include <molequeue/client/job.h>

// Create the client.
MoleQueue::Client client;

// Create a job to submit.
MoleQueue::JobObject job;
job.setQueue("Tritium");
job.setProgram("GAMESS");
job.setInputFile("/path/to/job.inp");

// Connect to the correct signal; the slot is called jobResponse.
// This call must be made inside a QObject subclass that declares the slot.
connect(&client,
        SIGNAL(submitJobResponse(int, uint)),
        this, SLOT(jobResponse(int, uint)));

// Submit the job to the queue. The job submission returns the local ID;
// the signal gives the local ID (int) and the moleQueueId (uint).
int localId = client.submitJob(job);
The slot can match up the local ID to the returned
MoleQueue ID, and all future queries or notifications will
use the MoleQueue ID. Other actions have corresponding
signals that can be used, and the C++ API is asynchronous.
SUBMITTING JOBS USING THE PYTHON CLIENT
In order to submit a job with MoleQueue, application code
must be written that connects to MoleQueue and makes the
appropriate remote calls. One of the easiest ways to do this
is with the Python API, which allows job submissions from
simple Python scripts.
The following code snippet shows the basic process of submitting a job in Python:
import molequeue

# Create the MoleQueue client.
client = molequeue.Client()
# Connect to the server.
client.connect_to_server('MoleQueue')
# Create the job.
job = molequeue.Job()
# Set the queue that the job will be submitted to.
job.queue = 'Tritium'
# Set the program to run.
job.program = 'GAMESS'
# Set the input file path.
file_path = molequeue.FilePath()
file_path.path = "/path/to/job.inp"
job.input_file = file_path
# Now submit the job to the server. This method
# returns the MoleQueue ID, which can be used
# in other methods, for example to cancel the job.
molequeue_id = client.submit_job(job)
print("MoleQueue ID: %s" % molequeue_id)
# Finally, disconnect from the server so resources can
# be cleaned up.
client.disconnect()
CONCLUSIONS
The MoleQueue application can be used to interact with
local and remote computational resources and provides a
user-friendly configuration interface. Client libraries allow
application developers to integrate HPC functionality with
their software. This provides a rich and compelling user
experience by lowering the barriers faced by new researchers and saving time previously spent configuring simulation
environments. Support for several common job schedulers
is available, and the application framework is available on
Windows, Mac, and Linux. Client applications may interact
with the MoleQueue server from any language using standard communication protocols and data exchange formats.
REFERENCES
[1] http://www.openchemistry.org
[2] http://www.kitware.com/source/home/post/39
[3] http://uv-cdat.llnl.gov
[4] "Computational Model Builder (CMB): A Cross-Platform
Suite of Tools for Model Creation and Setup," Hines, A.,
et al. DoD High Performance Computing Modernization
Program Users Group Conference, 2009.
[5] http://wiki.openchemistry.org/MoleQueue_JSON-RPC_Specification
Marcus Hanwell is Technical Leader in the
scientific computing team at Kitware, where
he leads the Open Chemistry effort. He has
a background in open source, open science,
Physics, and Chemistry. In addition to leading
the Chemistry team at Kitware, he is also the
lead developer of Avogadro.
David Lonie is an R&D Engineer on the scientific computing team at Kitware. He has
been active in the open-source chemistry
community since 2009, developing for
various projects such as the Avogadro editor
and the Open Babel toolkit.
Chris Harris is an R&D Engineer at Kitware.
His background includes middleware
development at IBM, and working on highly specialized, high-performance, mission-critical
systems.
NEW INFRASTRUCTURE FOR EASY
MULTI-THREADING IN ITKV4
PREVIOUS INFRASTRUCTURE
In the context of multi-threading, image processing is often
considered an “embarrassingly parallel” problem. That is,
since the operations to create one output pixel are often
independent of those for other output pixels, data parallelism can
be achieved by partitioning the output pixels into sub-domains, each processed in its own thread.
Infrastructure to easily write multi-threaded
itk::ImageToImageFilters has been available for a long
time, and the majority of image processing code in ITK can
take advantage of parallel computing architectures. While
embarrassingly parallel problems may make the design of a
parallel algorithm obvious, multi-threaded code is often not
realized until there exist sufficiently high-level abstractions
that remove the need to learn low-level details, write verbose
boilerplate code, and debug the results. More recently, APIs such
as OpenMP have provided these abstractions in a cross-platform
way. ITK also has a cross-platform API for the creation of
multi-threaded code, but its design is more appropriate than
OpenMP for image analysis in a C++, generic programming context.
To write multi-threaded code with the traditional
itk::ImageSource API, a filter author simply needs to overload select virtual protected methods:
• BeforeThreadedGenerateData()
• ThreadedGenerateData (const OutputImageRegionType
&outputRegionForThread, ThreadIdType threadId)
• AfterThreadedGenerateData()
The ImageSource methods BeforeThreadedGenerateData
and AfterThreadedGenerateData are optional, single-threaded methods to prepare for and respond to the
threaded operation. These methods are used to pre-process
inputs or handle thread-local storage, for example. After
the ITK multi-threading infrastructure spawns threads in a
cross-platform way, the ThreadedGenerateData method will
be called in each thread.
In the ITKv4 effort, additional infrastructure was added that
improves the multi-threaded operation flexibility within
image filters, and also makes it possible to easily write
multi-threaded code outside of itk::ImageToImageFilters.
This infrastructure is used extensively in the v4 registration
metrics and v4 level set evolution code.
IMPROVED METHOD FLEXIBILITY
While the traditional BeforeThreadedGenerateData/
ThreadedGenerateData/AfterThreadedGenerateData infrastructure covers the majority of image filtering algorithms,
not all filters conform to this model. For example, an image
filtering algorithm that is organized into a single unit should
be logically organized into a single C++ class in its source
code, but it may contain multiple parallelizable operations
that must be separated by serial operations. In the past,
this situation required the use of thread synchronization
classes or resorting to the use of verbose, low-level threading code. In practice, this meant that multi-threading was
not achieved in these cases. This is evident in the table
below, where the number of classes in the ITK 4.3 Filtering
Group that use the low-level SingleMethodExecute call is
compared to the number of classes that use the high-level
ThreadedGenerateData.
Method                  ITK Filtering classes using given method
SingleMethodExecute     5
ThreadedGenerateData    112
Code for reproduction:
cd Modules/Filtering
git grep -l SingleMethodExecute\(\) | wc -l
git grep -l ::ThreadedGenerateData | wc -l
Another limitation is that ITK is a registration and segmentation library in addition to being an image filtering library,
and most registration and segmentation classes do not
inherit from itk::ImageSource, so ThreadedGenerateData is
not available to them.
In the new ITKv4 infrastructure, high-level multi-threaded
operations are written not by overloading a particular
virtual method (ThreadedGenerateData), but by adding a
new member class that inherits from itk::DomainThreader.
This has a number of advantages. High-level, multi-threaded operations can be applied in any class, not just
those that inherit from itk::ImageSource. An algorithm that
has multiple, multi-threaded operations can be organized
into a single C++ class by simply adding more than one
itk::DomainThreader member. Within a GenerateData call,
the class members can be called in any order, repeatedly or
conditionally, as necessary. All the state variables related to
the threading operation will be appropriately encapsulated
as members of the itk::DomainThreader derived class.
The itk::DomainThreader class is templated over the type of
data domain it is going to partition (described below) and
the type of the class that it will be performing the threaded
operation for, known as the “Associate” class. When an
Associate class declares an itk::DomainThreader member,
it typically uses the C++ “friend” keyword so that the threader
has access to the Associate's protected and private members.
Similar to the traditional BeforeThreadedGenerateData/
ThreadedGenerateData/AfterThreadedGenerateData
methods, implementation of a threaded
operation with itk::DomainThreader consists of implementing its BeforeThreadedExecution, ThreadedExecution, and
AfterThreadedExecution methods.
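The relationship between an Associate and its threader member can be sketched in a few lines of Python. This is a language-neutral illustration of the pattern, not the ITK C++ API: the class names DomainThreaderSketch, SumThreader, and Statistics are hypothetical, the domain is assumed to be an inclusive index range, and Python's standard thread pool stands in for the ITK threading backend.

```python
from concurrent.futures import ThreadPoolExecutor


class DomainThreaderSketch(object):
    """Hypothetical stand-in for a domain threader owned by an Associate."""

    def __init__(self, associate, number_of_threads=4):
        self.associate = associate
        self.number_of_threads = number_of_threads
        self.number_of_threads_used = 0

    def partition(self, domain, requested):
        """Split an inclusive (first, last) index range into sub-ranges."""
        first, last = domain
        count = last - first + 1
        if count <= 0:
            return []
        pieces = min(requested, count)
        base, extra = divmod(count, pieces)
        sub_domains, start = [], first
        for i in range(pieces):
            size = base + (1 if i < extra else 0)
            sub_domains.append((start, start + size - 1))
            start += size
        return sub_domains

    def before_threaded_execution(self):
        pass  # optional, single-threaded

    def threaded_execution(self, sub_domain, thread_id):
        raise NotImplementedError  # called once per sub-domain

    def after_threaded_execution(self):
        pass  # optional, single-threaded

    def execute(self, domain):
        sub_domains = self.partition(domain, self.number_of_threads)
        self.number_of_threads_used = len(sub_domains)
        self.before_threaded_execution()
        workers = max(1, len(sub_domains))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [pool.submit(self.threaded_execution, sub, tid)
                       for tid, sub in enumerate(sub_domains)]
            for future in futures:
                future.result()  # re-raise any worker exception
        self.after_threaded_execution()


class SumThreader(DomainThreaderSketch):
    """Sums the Associate's values; per-thread state lives in the threader."""

    def before_threaded_execution(self):
        self.partial_sums = [0] * self.number_of_threads_used

    def threaded_execution(self, sub_domain, thread_id):
        first, last = sub_domain
        self.partial_sums[thread_id] = sum(
            self.associate.values[first:last + 1])

    def after_threaded_execution(self):
        self.associate.total = sum(self.partial_sums)


class Statistics(object):
    """The Associate: owns the data and a threader member."""

    def __init__(self, values):
        self.values = values
        self.total = None
        self.sum_threader = SumThreader(self)

    def compute(self):
        self.sum_threader.execute((0, len(self.values) - 1))
        return self.total


total = Statistics(list(range(101))).compute()
```

Because the threaded operation is an object rather than a single overridden method, an Associate could hold several such members and execute them in any order.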
IMPROVED DOMAIN SUPPORT
In the traditional multi-threading infrastructure, the
itk::Image data was partitioned by splitting the output
itk::ImageRegion into contiguous, non-overlapping sub-domain itk::ImageRegions. In the new infrastructure, a
new class, itk::ThreadedDomainPartitioner, provides the
abstraction so that many different types of domains can
be partitioned into sub-domains to be processed in each
thread. Currently, an itk::ThreadedImageRegionPartitioner
is implemented along with an itk::ThreadedIndexedContainerPartitioner
and an itk::ThreadedIteratorRangePartitioner.
The itk::ThreadedIndexedContainerPartitioner will split up a
container that is indexed by integers, such as an itk::Array.
The itk::ThreadedIteratorRangePartitioner will partition itk
or STL iterator ranges into sub-ranges to be processed in
each thread.
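The idea of partitioning an integer-indexed container can be illustrated with a short sketch. This is an assumption-laden illustration of the concept, not the algorithm used by itk::ThreadedIndexedContainerPartitioner: the function name partition_index_range is hypothetical, and the choice to spread the remainder over the first sub-ranges is one of several reasonable policies.

```python
def partition_index_range(first, last, requested):
    """Split the inclusive index range [first, last] into at most
    `requested` contiguous, non-overlapping sub-ranges.

    Fewer sub-ranges than requested are returned when the range has
    fewer elements than requested threads.
    """
    count = last - first + 1
    if count <= 0 or requested <= 0:
        return []
    pieces = min(requested, count)
    base, extra = divmod(count, pieces)
    sub_ranges = []
    start = first
    for i in range(pieces):
        # The first `extra` sub-ranges receive one additional element.
        size = base + (1 if i < extra else 0)
        sub_ranges.append((start, start + size - 1))
        start += size
    return sub_ranges


print(partition_index_range(0, 9, 4))
print(partition_index_range(0, 2, 8))
```

A partitioner may return fewer sub-domains than the number of threads requested, so a caller should use the returned count rather than the requested one.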
[Figure: class hierarchy. itk::LightObject, itk::Object, and itk::ThreadedDomainPartitioner<TDomain>, with the subclasses itk::ThreadedImageRegionPartitioner<VDimension>, itk::ThreadedIndexedContainerPartitioner, and itk::ThreadedIteratorRangePartitioner<TIterator>]
Further information can be found on ITKBarCamp [1] and
ITKExamples [2].
[1] http://insightsoftwareconsortium.github.com/ITKBarCamp-doc/ITK/WriteMultiThreadedCode/index.html
[2] http://itk.org/ITKExamples/Examples/Core/Common/DoDataParallelThreading/DoDataParallelThreading.html
FUTURE IMPROVEMENTS
With the current ITK threading backends, parallelization will
not necessarily improve algorithmic performance. In practice, overhead associated with spawning and joining threads
to perform threaded operations can easily overshadow the
time required to perform the operation itself. Therefore,
adoption of a thread pool infrastructure in ITK could
greatly improve multi-threading performance.
ACKNOWLEDGEMENTS
The ITKv4 effort was made possible by funding from the
American Recovery and Reinvestment Act (ARRA) via the
US National Institutes of Health (NIH) National Library of
Medicine (NLM).
Additional authors who contributed to this article include
Dr. Brian B. Avants, Michael Stauffer, and Baohua Wu at
the University of Pennsylvania; Nicholas Tustison at the
University of Virginia; and Dr. Arnaud Gelas of Crisalix SaaS.
Matthew McCormick is a medical imaging
researcher working at Kitware, Inc. His
research interests include medical image
registration and ultrasound imaging. Matt is
an active member of scientific open source
software efforts such as the InsightToolkit,
TubeTK, and scientific Python communities.
A VTK SELECTION ALGORITHM BASED
ON DIGITAL TOPOLOGY
At the Direction des Applications Militaires Île-de-France
(DIF) of the Commissariat à l'Energie Atomique (CEA), France,
a domain-specific visualization tool based on VTK and a
ParaView server has been developed. The goal of this article
is to report on a novel, topology-based selection algorithm
which was added to VTK 6 as part of this collaboration.
DIGITAL TOPOLOGY
Digital topology is a part of combinatorial topology which
focuses on the neighborhood relation within a grid structure;
specifically by “grid” we understand a conforming mesh in
2D or 3D. This is a slight extension of the standard definition
of grid cell topology, which by definition studies the case
of n-dimensional topological cubes, i.e., quadrilaterals in 2D
and hexahedra in 3D. In our case, however, we accept hybrid
meshes; i.e., meshes containing different types of elements
(e.g., triangles and/or quadrangles in 2D, hexahedra and/or
pyramids, tetrahedra, wedges, and knives in 3D).
It is beyond the scope of this short article to provide a full
exposition of grid cell topology. An intuitive understanding
will suffice: for the rest of this article, it is
based on the notions of 8-adjacency in 2D and 26-adjacency in 3D. Two distinct grid cells
are said to be separated by a Distance of 1 if and only if they
share at least one topological entity (vertex, edge, or face).
The Digital Distance between two grid cells is then
defined as the smallest number of successive such
adjacencies needed to reach one cell from the other. The
example below illustrates this intuitive definition, which is
sufficient for our purpose of defining selection based on cell
distance within a mesh.
The figure above shows a hybrid mesh (containing both
triangles and quadrilaterals) together with a number of
"seed" cells, marked with black glyphs. Those cells that lie
at Distance 1, in the sense of 8-adjacency, from any of these
seed cells (excluding the latter when at Distance 1 from
another seed) are marked with a blue cross. The same is done
for a Distance of 2, with violet disk markings. This illustrates
the notion of topology generated by 8-adjacency distance,
which can readily be extended in the three-dimensional
case, with the 26-adjacency distance.
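The Digital Distance defined above amounts to a breadth-first search over cells that share at least one vertex (cells sharing an edge or a face necessarily share a vertex). The following sketch computes it for every cell reachable from a set of seeds. It is an illustration under simple assumptions, not the VTK implementation: cells are given as tuples of vertex IDs, and the names digital_distances and quad_grid, as well as the small test mesh, are hypothetical.

```python
from collections import defaultdict, deque


def digital_distances(cells, seeds):
    """Return {cell_id: distance to the nearest seed} by breadth-first search.

    Two cells are adjacent when they share at least one vertex, which
    corresponds to 8-adjacency in 2D and 26-adjacency in 3D.
    """
    vertex_to_cells = defaultdict(list)
    for cell_id, vertices in enumerate(cells):
        for vertex in vertices:
            vertex_to_cells[vertex].append(cell_id)

    distances = dict((seed, 0) for seed in seeds)
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        for vertex in cells[current]:
            for neighbor in vertex_to_cells[vertex]:
                if neighbor not in distances:
                    distances[neighbor] = distances[current] + 1
                    queue.append(neighbor)
    return distances


def quad_grid(nx, ny):
    """Quadrilateral cells of an nx-by-ny grid, numbered row by row."""
    cells = []
    for j in range(ny):
        for i in range(nx):
            v0 = j * (nx + 1) + i
            cells.append((v0, v0 + 1, v0 + nx + 2, v0 + nx + 1))
    return cells


# Hybrid mesh: a 3x3 grid of quadrilaterals (cells 0 to 8) plus one
# triangle (cell 9) attached to the right-hand boundary.
mesh = quad_grid(3, 3) + [(3, 16, 7)]
from_corner = digital_distances(mesh, [0])
from_center = digital_distances(mesh, [4])
```

Starting the search from all seeds at once gives, for each cell, the distance to its nearest seed, which is the quantity a distance-based selection needs.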
CELL DISTANCE SELECTOR
A class for the selection of cells at a given 8- or 26-adjacency
distance D from a given set of seed cells, within a composite
data set, had been developed at CEA/DIF. In the context of
this collaboration, we developed a new subclass of vtkSelectionAlgorithm, called vtkCellDistanceSelector, in order to
port this functionality to the upcoming release of VTK 6. We
preserved almost all of the original API, except for the choice
of input ports, which we modified as follows to comply with
typical VTK usage.
Input port 0: The input mesh, in the form of a vtkCompositeDataSet. In practice, the permitted leaf block types are
vtkPolyData, vtkStructuredGrid, and vtkUnstructuredGrid.
Input port 1: The input set of seed cells, in the form of a
vtkSelection instance defined on that mesh.
This change therefore does not preserve backwards compatibility for the CEA/DIF visualization application, but we
mitigated this by providing input port connection convenience methods that make the indexing of ports transparent
to the user. With these, it will thus be straightforward to
modify the aforementioned application so that it will set the
inputs correctly.
In addition, we modified the original method by allowing for
the inclusion or exclusion of cells located at an intermediate
digital distance strictly between 0 and D, so that a topological ball, rather
than only its outermost layer, can be selected. This is done by means
of the AddIntermediate instance variable, which is set to 1 by
default. Note that the implementation already allowed
for the inclusion or exclusion of the seed cells themselves,
with the IncludeSeed instance variable, which is also set to 1
by default.
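The combined effect of the two flags can be sketched as a filter over precomputed digital distances. The sketch reflects one reading of the description above (cells at exactly distance D are always kept, seeds are governed by IncludeSeed, and cells strictly between 0 and D by AddIntermediate); it is not the vtkCellDistanceSelector code, and the function name and sample distances are hypothetical.

```python
def select_by_distance(distances, d, include_seed=True, add_intermediate=True):
    """Return the set of cell IDs selected for distance d.

    distances maps each cell ID to its digital distance from the seeds.
    """
    selected = set()
    for cell_id, distance in distances.items():
        if distance == 0:
            keep = include_seed          # seed cells
        elif distance == d:
            keep = True                  # outermost layer
        elif distance < d:
            keep = add_intermediate      # layers strictly inside
        else:
            keep = False                 # beyond the requested distance
        if keep:
            selected.add(cell_id)
    return selected


# Hypothetical distances for five cells; cell 10 is the only seed.
distances = {10: 0, 11: 1, 12: 1, 13: 2, 14: 3}
ball = select_by_distance(distances, 2)
seed_and_shell = select_by_distance(distances, 2, add_intermediate=False)
layer = select_by_distance(distances, 1, include_seed=False)
```

With both flags at their defaults the result is the full topological ball; switching off AddIntermediate keeps only the seeds and the outermost layer, and switching off IncludeSeed with D equal to 1 keeps the single layer around the seeds.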
RESULTS 2D
We illustrate the method first with a set of 2D test cases,
which allow for convenient representation of grid cell topology relationships within a mesh. For this purpose, we make
use of the same mesh and seed cells as in the example above,
from which we define four test cases by grouping the seed
cells in four sets: an isolated interior cell, a corner cell, a set of four
ridge cells, and a set of three interior cells, two of which are
neighbors while the third is isolated. For instance, assuming
that mesh points to an instance of vtkMultiBlockDataSet
with a single leaf containing a vtkUnstructuredGrid meshing
the domain of interest, a selection of the seed cell with ID
972 is created as follows:
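#include <vtkIdTypeArray.h>
#include <vtkInformation.h>
#include <vtkSelection.h>
#include <vtkSelectionNode.h>
#include <vtkSmartPointer.h>

// List of seed cell IDs: the single cell 972.
vtkSmartPointer<vtkIdTypeArray> arr =
  vtkSmartPointer<vtkIdTypeArray>::New();
arr->InsertNextValue( 972 );

// Selection node: cell indices within the block of composite index 1.
vtkSmartPointer<vtkSelectionNode> selNode =
  vtkSmartPointer<vtkSelectionNode>::New();
selNode->SetContentType( vtkSelectionNode::INDICES );
selNode->SetFieldType( vtkSelectionNode::CELL );
selNode->GetProperties()->Set(
  vtkSelectionNode::COMPOSITE_INDEX(), 1 );
selNode->SetSelectionList( arr );

// Selection holding the single node.
vtkSmartPointer<vtkSelection> sel =
  vtkSmartPointer<vtkSelection>::New();
sel->AddNode( selNode );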
With this selection, sel, defining the singleton to be used as
the first seed cell set, we now initialize a cell distance selector with up to a Distance of 2 from the cell, simply as follows:
vtkSmartPointer<vtkCellDistanceSelector> cds =
vtkSmartPointer<vtkCellDistanceSelector>::New();
cds->SetInputMesh( mesh );
cds->SetInputSelection( sel );
cds->SetDistance( 2 );
Executing the filter cds results in the creation of a new
selection, which can further be extracted with the
vtkExtractSelection filter. Setting up the three other examples can be done in a similar fashion and is left to the reader
as an exercise.
The picture above shows the four chosen sets of seed cells.
The following four sub-mesh extractions are then performed
from the output of the cell distance selector instances:
Red: The disk with Radius 2, centered at a unique seed cell
deep within the interior of the mesh.
Green: The layer of all cells at Distance 1 (exactly) of the set
of four contiguous ridge seed cells.
Orange: The layer of all cells at Distance 0 or 2 (but not 1) of
the isolated corner seed cell.
Yellow: The set comprising all cells within Distance at most 1
from the set of three non-contiguous seed cells.
The picture also shows the resulting sub-meshes extracted
by the vtkExtractSelection filter from the output of vtkCellDistanceSelector for each of these four selections, using
the same color encoding. Here we can readily see how this
topological selection mechanism facilitates the extraction of
sub-regions of interest with an intuitive and fast method of
interaction.
These four test cases are used for non-regression testing, by
means of the TestCellDistanceSelector2D test program
distributed with VTK 6.
RESULTS IN 3D
The test cases above are now extrapolated in 3D, with identical seed cell settings (isolated, connected, or non-connected
sets thereof) and the same distance requirements: up to 2,
exactly equal to 1, exactly equal to 0 or 2, or up to 1. The figure below
illustrates these selections when applied to a 3D mesh.
Specifically, the figure above shows a wireframe (light grey)
representation of a 3D mesh and the following extracted
selections, where the distance is now taken in the 26-adjacency sense:
Red: The ball with Radius 2, centered at a unique seed cell
deep within the interior of the mesh.
Green: The layer of all cells at Distance 1 (exactly) of a set of
four contiguous seed cells along a mesh ridge.
Orange: The layer of all cells at Distance 0 or 2 (but not 1) of
a unique corner seed cell.
Yellow: The set comprising all cells within Distance at most 1
from a set of three non-contiguous seed cells.
These four test cases are also used for non-regression testing,
by means of the TestCellDistanceSelector3D test program
distributed with VTK 6.
CONCLUSION AND FUTURE WORK
A new, topology-based selection mechanism was added to
VTK. This continues the expansion of the capabilities of VTK
in terms of mesh manipulation and data analysis. Several
continuation tasks can already be considered and proposed:
Similar to what has been done for the linear selector, it would
be useful to have a widget allowing for convenient seed cell
selection within a mesh. Coupled with the cell distance selector, such a widget would allow for interactive extraction of
subsets of interest within a mesh that are in the vicinity or
at a given distance of cells known to be of particular interest
(e.g., corresponding to particular conditions within a simulation or with respect to the topology of the mesh).
Another useful addition would be to allow for different types
of grid cell topologies; in addition to the current topologies
based on 8- and 26-adjacency, in 2D and 3D respectively, the
range of selection possibilities would be drastically improved
by the addition of the topologies based on 4- and 6-adjacency.
Finally, it would be convenient to add other distance specification mechanisms, so that an arbitrary list of distances (akin
to a binary mask) could be passed as input parameters to the
selection algorithm. This would make the cell distance selector one step closer to a generic, topology-based sub-mesh
extraction tool.
ACKNOWLEDGMENTS
This work was made possible thanks to a contract with
CEA, Direction des Applications Militaires Île-de-France
(DIF), Bruyères-le-Châtel, 91297 Arpajon, France. We extend
special thanks to Guénolé Harel, Thierry Carrard, and Claire
Guilbaud, for this fruitful collaboration. We are looking
forward to continued collaboration with CEA/DIF in this and
other areas of scientific visualization.
Philippe Pébay is Technical Expert in
Visualization and HPC at Kitware SAS, the
European subsidiary of the Kitware group.
Pébay is currently one of the most active
developers of VTK, an open-source, freely
available software system for 3D computer
graphics, image processing, visualization,
and data analysis. He is in particular the main architect
of the statistical analysis module of VTK and ParaView.
HIGH QUALITY SOFTWARE PRACTICES
BUILD AN EXTENSIBLE MEDICAL
IMAGE ANALYSIS PLATFORM
With funding from the NIH via the Neuroimaging Analysis
Center (NAC) and the National Alliance for Medical Image
Computing (NA-MIC), Kitware has been collaborating on
the refactoring and enhancement of 3D Slicer. Slicer is a
tool for visualizing and quantifying medical images and
related biomedical data. It provides advanced visualization,
segmentation (e.g., boundary delineation), and registration
algorithms that work with a wide variety of medical images:
MRI, CT, PET, ultrasound, microscopy, and more. Slicer is a
bridge between laboratory research, clinical studies, and
patient care.
As re-affirmed with the release of version 4.2, Slicer has
become a stable, extensible, and powerful platform for
medical imaging analysis. It is now a shining example of
the high-quality software practices and community support
that are enabled by Kitware technologies, e.g., VTK, CMake,
CTest, CDash, and Midas.
Slicer’s success and impact are indicated by many factors.
It is being distributed on Windows, Linux, and MacOS. It
has been downloaded over 48,000 times since the release
of Slicer 4.0 in November 2011, and has been featured in
tutorials given to thousands of researchers at workshops and
conferences. It is being distributed with modular extensions
created by academic and industry partners from around the
world to solve challenging medical problems.
SLICER APPLICATIONS
Slicer is most famous for the variety and significance of the
medical image analysis findings that it has helped generate.
A collection of over 125 research projects that are using Slicer
is posted at http://slicer.org/pages/Slicer_Community. There
are many more such projects that are not listed, because the
Slicer community is a very large and very open. The following list of Slicer-based projects is meant to highlight some of
the diversity and impact of Slicer.
Longitudinal MRI Study of Early Brain Development in
Neuropsychiatric Disorder-Autism:
The primary goal of this UNC research project is to learn
more about autism by examining cortical thickness patterns
in the early developing brain. Increasing evidence indicates
that brain volume in children with autism is enlarged relative to normal controls. Whether these differences are due
to increased cortical thickness or increased cortical surface
area, however, is less clear and is the focus of this project.
Figure 2. UNC developed the Automated Region Cortical
ThiCkness (ARTIC) plug-ins in Slicer to produce cortical
thickness measures for both individual and group analysis.
(Image and caption provided by the Neuro Image Research
and Analysis Laboratories and the Neurodevelopmental
Disorders Research Center at UNC).
Modeling the Mechanics of Atrial Fibrillation:
The Comprehensive Arrhythmia Research and MAnagement
(CARMA) Center at the University of Utah is a world leader in
the rapidly emerging field of MRI-managed evaluation and
ablation of atrial fibrillation. They are using Slicer to study
the tissue remodeling of the atrial wall that is a hallmark of
atrial fibrillation.
Figure 1. Kitware’s software development processes
includes a continuous development, build, and test
cycle (upper left). For Slicer, it has been augmented with
automated package generation and distribution for
multiple platforms (upper right). This has led to rapid
community growth (bottom) which feeds the software
development process.
This article provides a brief overview of the capabilities of
Slicer and then lists several of the key technologies used and
developed by the Slicer team to foster high-quality software
practices within the team.
Figure 3. The CARMA Center's Utah classification for Atrial
Fibrillation staging involves segmentation of the left atrial
wall from MRI, followed by quantification of enhanced
vs. non-enhanced voxels in the wall. (Image and caption
provided by CARMA).
Brain Tumor Resection Guidance:
Neurosurgical navigation systems have reduced the risk
of complications from surgery and have allowed surgeons
to remove tumors that were once considered inoperable.
However, many techniques used by neurosurgical navigation
systems to align pre- and intra-operative images are inaccurate when tissue deformations occur as the tumor is resected.
This project is developing algorithms and infrastructure for
deformable intra-operative image registration for neurosurgical guidance. The new methods employ ultrasound
physics and geometry from the pre-operative images in the
registration metrics. This NIH R01 is a collaboration among
Brigham and Women’s Hospital, Kitware, InnerOptics,
and Duke University.
SLICER’S HIGH QUALITY SOFTWARE PROCESSES
AND TECHNOLOGIES
Slicer has achieved its capabilities and recognition by building upon and extending complementary open-source efforts.
Many of these have been significantly enhanced by the Slicer
development effort, and those enhancements have been
contributed back to their originating projects. Slicer has
also fostered the creation of new toolkits and processes. The
list below highlights some of Slicer’s most notable enhancements and creations:
Python Integrations: Based on capabilities developed for
ParaView, we have integrated a Python interface into Slicer.
An interactive Python session can be started from within
Slicer and has full access to the GUI, data, and algorithmic
plug-ins of Slicer. Python can be used to define new algorithms and interfaces in Slicer, and scripts can be shared via
the Slicer Catalog (below).
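As a rough illustration of the kind of script such a session allows, the sketch below counts the nodes in a scene by class name. It assumes the scene object provides `GetNumberOfNodes()` and `GetNthNode()`, as Slicer's MRML scene does, and that each node provides `GetClassName()`; the helper name `summarize_scene` is hypothetical and not part of Slicer.

```python
from collections import Counter


def summarize_scene(scene):
    """Count the nodes in a MRML-style scene, grouped by class name.

    `scene` is assumed to provide GetNumberOfNodes() and GetNthNode(i),
    and each node GetClassName(); this helper is illustrative only.
    """
    counts = Counter()
    for index in range(scene.GetNumberOfNodes()):
        node = scene.GetNthNode(index)
        counts[node.GetClassName()] += 1
    return dict(counts)


# Inside Slicer's Python console, one would presumably call:
#     summarize_scene(slicer.mrmlScene)
```

Because the function only relies on the scene's accessor methods, it can be exercised outside Slicer with a small stand-in object.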
CTK: Slicer is one of the foundational toolkits that motivated
and contributed to the development of the Common Toolkit
(CTK). This new toolkit features a variety of Qt, DCMTK, and
VTK-specific GUI elements and utilities for DICOM object I/O,
DICOM query and retrieve, run-time loadable plug-ins, and
the control of medical image displays.
Figure 4. Slicer is used to aggregate tracked intra-operative
ultrasound, pre-operative MRI, and novel registration
algorithms to display fused MRI and ultrasound images
during procedures.
Assessment of Traumatic Brain Injury:
Nearly 1.7 million Americans suffer traumatic brain injury
(TBI) annually, e.g., from car accidents, contact sports,
gunshots, and improvised explosive devices. This project
is investigating new methods for studying longitudinal
changes in patient images to assist in predicting outcomes
and prescribing treatments.
Atlas-based segmentation methods and deformable registration in the presence of
changing pathologies are being developed. This effort is
led by UCLA with collaborators at The University of Utah,
UNC, and Kitware.
Figure 6. A complete DICOM query and retrieve system
is available as a Qt widget in CTK. Image data can be
searched and downloaded from a clinical or research PACS.
CMake’s Superbuild: Slicer has exploited CMake’s ability to
integrate external projects during compilation. Slicer downloads and/or builds over 35 external project dependencies
during its compilation. The CMake team has worked with
Slicer developers to ensure error reporting, diverse repositories, and build requirements are handled smoothly by
Superbuild for Slicer.
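The core bookkeeping behind such a build, making sure every dependency is built before the projects that need it, can be sketched in a few lines. This is not CMake code: it is a Python illustration of dependency ordering, and the dependency edges in `EXTERNAL_PROJECTS` are assumptions made up for the example rather than Slicer's actual dependency graph.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency table: each project maps to the projects it needs.
# The names are real toolkits, but these edges are illustrative only.
EXTERNAL_PROJECTS = {
    "VTK": set(),
    "ITK": set(),
    "DCMTK": set(),
    "CTK": {"VTK", "DCMTK"},
    "Slicer": {"VTK", "ITK", "CTK"},
}


def build_order(projects):
    """Return a build order in which every dependency precedes its dependents."""
    return list(TopologicalSorter(projects).static_order())


if __name__ == "__main__":
    for step, name in enumerate(build_order(EXTERNAL_PROJECTS), start=1):
        print(f"{step}. configure and build {name}")
```

`graphlib` requires Python 3.9 or later and raises `CycleError` if two projects depend on each other, which is the kind of configuration error a superbuild needs to report clearly.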
Figure 5. Example acute vs. chronic registration of TBI using
geometric metamorphosis. Left: Schematic of registration
framework. Right: Acute and chronic TBI images;
background flow overlaid on chronic scan; pathology flow
(recession areas shown in blue) overlaid on deformed acute
pathology.
To achieve these diverse and cutting-edge algorithmic and
visualization capabilities on a community supported software development project requires the establishment of a
strong yet unobtrusive set of high-quality software practices
and technologies. Those are explained next.
Semi-Automated Wiki Documentation: The plug-ins of Slicer
have integrated documentation that describes their inputs,
parameters, operation, and outputs. The Slicer team has
devised a method for automatically posting those descriptions to a wiki page for each plug-in. Those wiki pages can
then be directly edited to include additional details. By
automating the initial generation of these wiki pages, a
base level of documentation is assured.
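The idea can be sketched as follows. The XML layout below is a simplified, hypothetical plug-in description invented for this example, not Slicer's actual schema, and the function only produces wiki markup as text; posting it to a wiki is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified plug-in description; not Slicer's actual schema.
EXAMPLE_DESCRIPTION = """
<plugin>
  <title>Gaussian Blur</title>
  <description>Smooths an image with a Gaussian kernel.</description>
  <parameter name="sigma" type="float">Kernel width in millimeters.</parameter>
  <parameter name="inputVolume" type="image">Volume to smooth.</parameter>
</plugin>
"""


def description_to_wiki(xml_text):
    """Render a plug-in description as MediaWiki-style markup."""
    root = ET.fromstring(xml_text.strip())
    lines = [
        "= {} =".format(root.findtext("title", default="Untitled")),
        root.findtext("description", default="").strip(),
        "== Parameters ==",
    ]
    for param in root.findall("parameter"):
        text = (param.text or "").strip()
        lines.append(
            "* '''{}''' ({}): {}".format(param.get("name"), param.get("type"), text)
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(description_to_wiki(EXAMPLE_DESCRIPTION))
```

Generating the page from the same description the plug-in already carries keeps the initial documentation in step with the code, while later hand edits add the detail that cannot be derived automatically.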
GUI Regression Testing: Building upon Qt technology developed by the ParaView team, the numerous tutorials of Slicer
(http://www.slicer.org/pages/UserOrientation) have been
converted to automated tests of the Slicer GUI. These tests
can be run every night, on a multitude of machines, as part
of the CTest / CDash build-test process used by Slicer.
Midas: The data required to test, demonstrate, and apply
Slicer is massive and diverse. Kitware’s Midas is being used
and improved to host and distribute these data via a variety
of APIs: web interfaces, Python libraries, desktop applications, and C++ libraries.
This technology was originally
developed for ITK, so that testing data can be downloaded
as needed from Midas, instead of requiring it to be packaged with each ITK (or Slicer) source code download. Testing
data is versioned so that it is tied to specific code revisions.
The linkages between CTest, CDash, run-time applications,
and Midas continue to be expanded and refined.
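One common way to tie data to code revisions is to keep only a checksum in the source tree and fetch the matching file on demand. The sketch below assumes that scheme; it does not use the Midas API. The `download` callable is a hypothetical stand-in for a request to the data server, and SHA-256 is an illustrative choice of hash.

```python
import hashlib
from pathlib import Path


def fetch_test_data(checksum, cache_dir, download):
    """Return the path of the data file whose SHA-256 matches `checksum`.

    `download` is a hypothetical callable taking the checksum and returning
    bytes; a real setup would replace it with a request to the data server.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / checksum
    if target.exists():
        return target  # already fetched by an earlier call

    payload = download(checksum)
    actual = hashlib.sha256(payload).hexdigest()
    if actual != checksum:
        raise ValueError(f"checksum mismatch: expected {checksum}, got {actual}")
    target.write_bytes(payload)
    return target
```

Since the checksum is committed alongside the code, checking out an older revision automatically selects the data that revision was tested with, and the source download itself stays small.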
Collect and distribute the products of the nightly dashboards: Every night, tens of machines build and test Slicer
and submit summary reports to the Slicer dashboard via
CDash. The Slicer team worked with CDash developers to
extend that process, so that the compiled binaries and installation packages created as part of the testing process are
automatically made available from the Slicer dashboard. A
new package icon will appear next to a build-test report on a
Slicer dashboard when its associated build-test products are
available. That icon’s link will lead to a list of the associated
binaries, test results, and packages that can be downloaded
from that build-test run. This capability is now available on
any new CDash installation.
Installation package management: An installation package
manager module was created for Midas using CDash’s ability
to collect the products of nightly dashboards. Via that Midas
module, http://slicer.kitware.com now provides user-friendly
access to the Slicer installation packages created by the
nightly dashboard machines. From that website, installation
packages for Slicer are available for a multitude of systems,
e.g., various versions of Windows, MacOS, Ubuntu, and
Debian Linux. A variety of package management tools have
been added to this Midas module, e.g., packages can be
preserved as stable or experimental releases. This technology is now being ported to other Kitware projects via http://
download.kitware.com.
Slicer Catalog, aka Slicer Extension Manager: Building upon
the Midas installation package management system, the
Slicer, CDash, and Midas teams created the Slicer catalog
which functions as an “app store” for Slicer plug-ins. It can
be accessed over the web and from within Slicer. This infrastructure enables researchers to publish the code of their
Slicer plug-ins so that their code is then compiled during the
nightly build-test cycle of Slicer, and the resulting plug-ins
are then stored by Midas and made available when the corresponding version of Slicer is run.
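The matching step at the end of that pipeline, offering only the plug-in packages built for the running version of Slicer, might look like the sketch below. The `ExtensionPackage` fields, the plug-in names, the revision numbers, and the URLs are all hypothetical; the real catalog is served by Midas and is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtensionPackage:
    """One compiled plug-in package; all fields here are hypothetical."""
    name: str
    slicer_revision: int
    platform: str
    url: str


def packages_for(catalog, slicer_revision, platform):
    """Keep only packages built against this Slicer revision and platform."""
    return sorted(
        (p for p in catalog
         if p.slicer_revision == slicer_revision and p.platform == platform),
        key=lambda p: p.name,
    )


if __name__ == "__main__":
    catalog = [
        ExtensionPackage("ExampleViewer", 21000, "linux-amd64", "https://example.org/a"),
        ExtensionPackage("ExampleFilter", 21000, "linux-amd64", "https://example.org/b"),
        ExtensionPackage("ExampleFilter", 20500, "linux-amd64", "https://example.org/c"),
    ]
    for package in packages_for(catalog, 21000, "linux-amd64"):
        print(package.name, package.url)
```

Filtering on both revision and platform reflects the point made above: a plug-in binary is only offered when it was compiled during the build-test cycle of the corresponding Slicer version.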
CONCLUSION
The combination of open source, high quality software processes, and advanced visualization and analysis algorithms
from world-class collaborators has been the hallmark of
Slicer. This article provides a glimpse into the diversity of
applications enabled by Slicer and the high-quality software
processes and technologies used to build and maintain it.
It highlights that by building upon and contributing back
to existing open source efforts (e.g., ParaView and CDash),
by integrating those efforts to build technologies that can
be re-used by other projects (e.g., Midas for installation
package management), and by creating new toolkits that
conform to existing standards and fill gaps in existing practices (e.g., CTK), a thriving and broad community is fostered.
Kitware continues to expand its contributions to the Slicer
community and to transition those developments to its consulting practices. With the blessing of the Slicer community
we now offer a variety of Slicer-related consulting services.
These services include the development of custom plug-ins
for advanced data analysis and visualization problems, the
re-packaging of Slicer for streamlined applications to specific
problems, the integration of Slicer into existing workflows
and infrastructures, and the development of entirely new
applications based on Slicer and its high-quality software
practices and technologies. None of the resulting products
carry recurring licensing fees. If you are interested in exploring ways in which Slicer can help your commercial efforts,
please contact [email protected].
Stephen Aylward is the Senior Director of
Operations at Kitware's North Carolina
Office. Dr. Aylward is also an Associate Editor
of IEEE Transactions on Medical Imaging, is a
member of the Pattern Recognition Society, and
serves on the SPIE Medical Imaging: Image
Processing program committee.
Jean-Christophe Fillion-Robin is an R&D
Engineer at Kitware. His research interests
include swarm intelligent systems, bioinspired systems, and cognitive psychology.
He is an active developer of the Slicer toolkit.
Julien Finet is an R&D Engineer at Kitware.
He is involved in numerous projects in the
medical team. He is notably a lead developer
for the Slicer, CTK, and MSVTK projects.
Zach Mullen is an R&D Engineer at Kitware.
He is a developer for the CMake, CDash, and
Midas open source quality software process
tools.
Figure 7. This Slicer Catalog page is available from within
Slicer or over the web for simplifying the distribution and
installation of plug-ins for Slicer. This catalog page and
underlying data are built upon Midas and can be re-used
in other applications having plug-ins.
KITWARE NEWS
KITWARE TO DEVELOP AUTONOMOUS ROBOT
NAVIGATION SYSTEM
Kitware was awarded $100,000 in Phase I SBIR funding from
the U.S. Army to develop a novel robot navigation system
that is based on high-level landmarks for use in military and
search-and-rescue applications. Autonomous robots can
improve safety and situational awareness in a wide range of
military and commercial intelligence applications.
This Phase I effort will be led by Dr. Amitha Perera, Technical
Leader at Kitware, and is a collaboration of Kitware’s computer vision expertise and Texas A&M University’s renowned
robotics capabilities. The team will develop a new robot navigation paradigm that incorporates a multilayered feature
graph (MFG) based on high-level visual landmarks into the
supervisory control system. The MFG-derived results will be
combined with segmentation techniques to extract salient
landmarks and build a 3D scene model. Operators will then
issue high-level commands such as “follow this wall” or
“go around that object” that the robot will autonomously
execute in real-time. This new navigation paradigm will
provide the operator with an increased sense of telepresence and situational awareness, leading to more accurate
completion of tasks.
PARAVIEW AWARDED HPCWIRE EDITORS’
CHOICE AWARD AT SC12
As part of the 2012 International Conference for High
Performance Computing, Networking, Storage, and Analysis
(SC12), ParaView received HPCwire’s Editors’ Choice
Award for Best HPC Visualization Technology. This marked
the third consecutive year that Kitware has been recognized by HPCwire, winning an Editors’ Choice award for the
Visualization Toolkit (VTK) in 2011, and both Editors’ and
Readers’ Choice awards for ParaView in 2010.
The HPCwire Editors’ Choice Awards are determined through
a rigorous selection process, where winners are chosen by a
panel of editorial and executive staff, recognized HPC dignitaries, and contributing editors from across the industry.
Dr. Berk Geveci, Director of Scientific Computing at Kitware,
accepted the award on behalf of the ParaView community,
whose key sponsors include Sandia National Laboratories, Los
Alamos National Laboratory, the Army Research Laboratory,
and the National Nuclear Security Administration’s Advanced
Simulation and Computing (ASC) Program.
This material is based upon work supported by the United
States Army under Contract No. W56HZV-12-C-0408. Any
opinions, findings and conclusions or recommendations
expressed in this material are those of the author(s) and do
not necessarily reflect the views of the United States Army.
DARPA FUNDS A VISUALIZATION DESIGN
ENVIRONMENT
DARPA has awarded Kitware $3,981,353 in XDATA
funding to develop an innovative Visualization Design
Environment (VDE), an open-source library of tools for
enabling the rapid development of large-scale data visualization interfaces by novice programmers.
Dr. Jeff Baumes, Technical Leader at Kitware, will lead
VDE’s development in collaboration with world-class
visual interface and design experts from KnowledgeVis,
Stanford University, Harvard University, Georgia Institute of
Technology, and the University of Utah.
The goal of the VDE is to support rapid development of
visual interfaces. Current tools support numerous visualization types, but are typically programming-intensive and
challenging for novice programmers to use. This project will
investigate interactive graphics techniques that reduce the
programming burden on the user and even automatically
suggest effective parameters for visual representations.
The result will be a state-of-the-art, open-source library of
aggregation, querying, and visualization tools, which will
allow analysts to move smoothly between very high-level
summaries and individual documents. VDE will integrate
with the larger XDATA effort to produce an extensive open-source platform for analysis and visualization of big data.
The Visualization Design Environment is sponsored by the
U.S. Air Force Research Laboratory (AFRL) under contract
number FA8750-12-C-0300.
KITWARE ATTENDS MIL-OSS WG4
Kitwareans Luis Ibáñez and Chuck Atkins attended the Mil-OSS
WG4 meeting in Washington, D.C., where they presented
the talk “Strategic Open Source: From Healthcare to
Intelligence.” The talk focused on the difficulties and challenges within the healthcare and intelligence communities,
and how leveraging open-source solutions can address these
and benefit the community.
In addition to the talk, the event provided a great venue
for discussions on how to facilitate open-source software
within the government. In particular, discussions focused on
aligning the agile culture of open-source communities with
the more structured culture of government agencies, such
as when dealing with Federal Acquisition Regulations (FAR),
and satisfying Export Control (EC) requirements.
KITWARE MAKES AN IMPACT AT MICCAI 2012
Kitware had a very active presence at the 15th International
Conference on Medical Image Computing and Computer
Assisted Intervention (MICCAI) in October. As part of the
event, Dr. Stephen Aylward, Kitware’s Senior Director of
Operations in North Carolina, was elected to the MICCAI
Society’s Board of Directors. Dr. Aylward has been an active
conference contributor of workshops and tutorials, including the first workshop, “Open-Source Software for MICCAI”
in 2005.
Last year, Dr. Aylward worked with conference organizers
to establish the Young Scientist Publication Impact Award,
a new tradition sponsored by Kitware that recognizes a
researcher whose MICCAI work had an impact on the field
in terms of citations, secondary citations, subsequent publications, and h-index. This year the award went to Caroline
Brun from the University of Pennsylvania for her paper “A
Tensor-Based Morphometry Study of Genetic Influences on
Brain Structure using a New Fluid Registration Method,”
authored by C. Brun, N. Lepore, X. Pennec, Y.-Y. Chou, K.
McMahon, G.I. de Zubicaray, M. Meredith, M.J. Wright, A.D.
Lee, M. Barysheva, A.W. Toga, and P.M. Thompson.
MICCAI also awarded the prestigious Young Scientist Award
to the first authors of the top five papers at the conference,
as voted on by an award committee from among the papers
with the highest weighted review scores. Roland Kwitt, an
R&D Engineer at Kitware, was awarded one of these Young
Scientist Awards for his paper "Recognition in Ultrasound
Videos: Where am I?" written in collaboration with Nuno
Vasconcelos, Sharif Razzaque, and Stephen Aylward. The
paper explores the use of ultrasound imaging as an inexpensive alternative to MRI and CT imaging for rural and
developing parts of the world.
KEN MARTIN NAMED “CFO OF THE YEAR”
Kitware co-founder, Chairman, and CFO Dr. Ken Martin was
named CFO of the Year by the Albany Business Review in
October. Dr. Martin was selected from among 100 nominations in the small business category. During the awards
ceremony, the Business Review noted Dr. Martin’s unique
background in physics, computer science, and electrical and
computer systems.
As CFO, Dr. Martin is responsible for the overall financial
management of the company, its financial reporting and
transparency, and for multiple corporate functions including contracting, compliance audit, legal, and long-range
planning. Ken has monitored and steered the growth of
Kitware without incurring debt or taking outside investments through careful management of the company’s assets
and planning for future revenue and expenditures. In addition to these responsibilities, he remains an active software
developer on projects such as CMake, on which he is one of
the lead architects and core developers.
EXPLORING THE “TREE OF LIFE”
Kitware is one of the collaborators on “Arbor: Comparative
Analysis Workflows for the Tree of Life,” a $2M project funded
by the National Science Foundation. The project will provide
researchers and scientists with workflow-based visualization
and analysis tools that will allow them to explore their vast
quantities of data to quickly understand how organisms are
interrelated and how they interact in geographical space
and geological time. This new evolutionary-based research
may help fuel future discoveries in the fields of medicine,
public health, agriculture, ecology, and genetics.
The project will be developed as part of a unique collaboration between researchers at universities including the
University of Idaho, UC Berkeley, the University of Alabama,
the University of Kansas, and the University of Central
Florida; and private industry researchers at Kitware and
KnowledgeVis.
Dr. Wesley Turner will be the principal liaison for Kitware,
and will lead the integration of Arbor algorithms into an
accessible, modular application; the development of effective data visualization methods; and the support necessary
to nurture and grow an open-source community around
the platform. The first version of Arbor is scheduled to be
released later this year, with updates and expanded operations to follow over the course of the three years of funding.
KITWARE PARTICIPATES IN AVSS
In October, Kitwareans Anthony Hoogs and Sangmin Oh from
the Computer Vision team attended the IEEE Conference
on Advanced Video and Signal-Based Surveillance (AVSS) in
Beijing, China.
In collaboration with Michael Ryoo from the Jet Propulsion
Laboratory at CIT, Anthony, Sangmin, and Arslan Basharat
organized and presented the tutorial "Activity Recognition
for Visual Surveillance," which provided an overview of
human activity recognition approaches and their applications to surveillance systems.
Kitware also presented two papers at the conference.
Anthony presented the paper "Human Action Recognition
in Large-Scale Datasets Using Histogram of Spatiotemporal
Gradients," by Kishore K. Reddy (University of Central
Florida), Naresh Cuntoor, Amitha Perera, and Anthony Hoogs.
Sangmin and Anthony also presented "Robust Orientation
and Appearance Adaptation for Wide-area Large Format
Video Object Tracking," written by Rengarajan Pelapur,
Kannappan Palaniappan, and Gunasekaran Seetharaman,
collaborators at the University of Missouri and Air Force
Research Laboratory.
In addition to presenting tutorials and papers, Anthony
was a session chair for the oral session on Action/Activity
Recognition, as well as a panelist on the Industrial Panel
with seven other senior members of the video surveillance
research community.
KITWARE EUROPE CELEBRATES 2ND
ANNIVERSARY
November 8th marked the second anniversary of Kitware
Europe in Lyon, France. Since its inception, the Lyon office
has been actively providing professional training courses
throughout Europe. They’ve also made important collaborations with organizations such as the French Nuclear Agency
(CEA) to develop new tools for VTK and ParaView, and with
IRCAD to develop mobile and online visualization tools that
use VES and Midas.
Kitware Europe is now a member of the European pole of
competence in high-performance simulation (Ter@tec), and
the Lyon-Biopole French pole of competence, for which
they’ve been awarded a grant for performing magnetic resonance image simulation with several other French institutes.
Their success has led to the addition of new team members,
upping the count to five. Looking to the future, Kitware
Europe plans to expand their course offerings, and continue
seeking consortium opportunities to contribute technical
and open-source expertise for the advancement of science.
KITWARE ATTENDS SUPERCOMPUTING 2012
Kitware attended, exhibited, and presented at SC12 in Salt
Lake City, Utah in November. This was Kitware's first year
with an exhibit booth at the conference, and our team
was able to interact with many of our HPC customers and
collaborators.
In addition to exhibiting, we presented two collaborative
tutorials. The first, "Large Scale Visualization with ParaView"
was presented by Kenneth Moreland, W. Alan Scott, and
Nathan Fabian from Sandia National Laboratories, and
Utkarsh Ayachit and Robert Maynard from Kitware; this
tutorial featured guidance on visualizing the massive simulations run on today's supercomputers, and introduced the
audience to scripting and extending ParaView.
The second tutorial, "In-Situ Visualization with Catalyst"
was led by Nathan D. Fabian and Ron A. Oldfield from
Sandia National Laboratories; Andrew Bauer and Utkarsh
Ayachit from Kitware; and Norbert Podhorszki from Oak
Ridge National Laboratory. This tutorial demonstrated two
methods of performing in-situ visualization using Catalyst,
ParaView's co-processing library. Attendees learned various
methods of applying Catalyst in their projects including how
to build pipelines for Catalyst; how the API is structured;
how to bind it to C, C++, Fortran, and Python; and how to
build Catalyst for HPC architectures.
We're looking forward to attending SC13 in Denver,
Colorado, where we will be exhibiting in booth #4207.
PARAVIEW FOR CLIMATE SCIENTISTS PRESENTED
AT AGU FALL MEETING
Aashish Chaudhary attended the 45th annual meeting of
the American Geophysical Union in San Francisco, CA this
December, where he presented “ParaView for Climate
Scientists”. His talk described the integration of ParaView
with the Ultrascale Visualization Climate Data Analysis Tools
(UV-CDAT) framework, a powerful and complete front-end
to a rich set of visual-data exploration and analysis capabilities well suited for climate-data analysis problems. The
integration leverages ParaView’s Python interface, and
includes various new features that have been added to both
tools to target climate data, including new readers, filters,
and parallel spatiotemporal capabilities.
MARCUS HANWELL FEATURED ON NATURE
SOAPBOX SCIENCE SERIES
Marcus Hanwell, Technical Leader at Kitware, was featured in
a Soapbox Science series on Nature.com this fall. His article,
"PhDelta: The Road Less Travelled - From PhD to Software
Development" discusses how his journey from studying
Experimental Physics in academia to becoming a software
developer in industry was largely influenced by the ideals of
Open Access publishing and open dissemination of scientific
knowledge.
Dr. Hanwell leads the Open Chemistry project at Kitware,
which includes the development of a suite of open-source
tools to tackle big problems in chemistry, biochemistry,
materials sciences, and other related areas.
KITWARE ATTENDS GEOINT 2012
Kitwareans Anthony Hoogs, Matthew Turek, Lisa Avila,
Katie Sharkey, and Katie Osterdahl attended the GEOINT
Symposium in Orlando, Florida in October. This conference is
put on by the U.S. Geospatial Alliance Foundation each year,
and brings together the nation's intelligence community.
This year was Kitware's first time attending and exhibiting
at the event, and marked the debut of the Kitware booth.
Over the course of the three-day event, our team met with
numerous industry officials and military personnel and discussed the Computer Vision team's expertise in full motion
video, wide area motion imagery, and activity and event
detection.
KITWARE CELEBRATES THE NEW YEAR
At the end of every year, each Kitware location has a grand
celebration. From our teams and families to yours, we wish
you a very happy and productive New Year!
The Kitware celebration in Clifton Park, NY.
The Kitware celebration in Carrboro, NC.
The Kitware celebration in Lyon, France.
The Kitware celebration in Santa Fe, NM.
UPCOMING CONFERENCES AND EVENTS
ParaView Course at the Finnish Supercomputing Center
January 15-17th in Espoo, Finland
Philippe Pébay will teach this two-day course at the CSC-IT
Center for Science. The course provides a hands-on overview of the ParaView visualization application. The basic
interactive visual exploration process is demonstrated,
including data loading, data processing, adjusting parameters, and data interaction. Key concepts such as cutting,
clipping, contouring, probing, and glyphing will be
discussed, and examples of generating output in the form
of processed data, rendered images, and animations will be
demonstrated.
International Symposium on Biomedical Imaging (ISBI) 2013
April 7-11th in San Francisco, CA
Stephen Aylward and Brad Davis are attending. Stephen is
chairing the Grand Challenges with Bram van Ginneken from
Radboud University Nijmegen. The ISBI Grand Challenges will
be held on Thursday, April 11th, the last day of the conference. Grand Challenges for this year include:
1. HARDI Reconstruction Challenge
2. 3D Deconvolution Microscopy Challenge
3. Computer Aided Detection of Pulmonary Embolism
4. 3D Segmentation of Neurites in EM Images
5. Automated Segmentation of Prostate Structures
6. Localization Microscopy Challenge
7. Cell Tracking Challenge
NEW HIRES
Amir Sadoughi
Amir Sadoughi joined the Clifton Park, NY office as an intern
on the medical team in December. He recently completed his
Ph.D. in mechanical engineering at Rensselaer Polytechnic
Institute in Troy, NY, for which he developed a new
technique for studying monolayer behavior through acquisition and analysis of microscopy images.
Claudine Hagen
Claudine Hagen joined the Clifton Park, NY office as the
Director of Finance in January. Claudine has extensive
experience in providing accounting and auditing services
for a wide range of industries. Before joining Kitware, she
worked as an Audit Senior Manager for KPMG in Albany, NY.
She holds a B.S. in Accounting from the State University of
New York at Albany, where she graduated summa cum laude.
Jake Stookey
Jake Stookey joined the Clifton Park, NY office as a Systems
Administrator. He previously worked in a
similar role at Rensselaer Polytechnic Institute. At RPI, Jake
administered Linux servers for faculty, staff, and students;
maintained the network; and ran automated 24/7
monitoring, alerting, and backup systems for all services. He
holds an M.S. in Computer Systems Engineering.
KITWARE INTERNSHIPS
Kitware Internships provide current college students with
the opportunity to gain hands-on experience working with
leaders in their fields on cutting-edge problems. Our business model is based on open-source software, and it creates an exciting,
rewarding work environment.
Our interns assist in developing foundational research and
leading-edge technology across six business areas: supercomputing visualization, computer vision, medical computing,
data management, informatics, and quality software process.
We offer our interns a challenging work environment and
the opportunity to attend advanced software training. To
apply for an internship, please visit our employment site
at jobs.kitware.com and submit a resume and cover letter
through our online portal.
In addition to providing readers with updates on Kitware
product development and news pertinent to the open
source community, the Kitware Source delivers basic information on recent releases, upcoming changes and detailed
technical articles related to Kitware’s open-source projects.
For an up-to-date list of Kitware's projects and to learn
about areas the company is expanding into, please visit the
open source pages on the website at http://www.kitware.com/opensource/provensolutions.html.
A digital version of the Source is available in a blog format
at http://www.kitware.com/source.
Kitware would like to encourage our active developer
community to contribute to the Source. Contributions
may include a technical article describing an enhancement
you’ve made to a Kitware open-source project or successes/lessons
learned via developing a product built upon one or
more of Kitware’s open-source projects. Kitware’s Software
Developer’s Quarterly is published by Kitware, Inc., Clifton
Park, New York.
EMPLOYMENT OPPORTUNITIES
Kitware is seeking talented, motivated and creative individuals to fill open positions in all of our offices. As one of the fastest
growing companies in the country, we have an immediate need
for software developers and researchers, especially those with
experience in computer vision, scientific computing and
medical imaging.
At Kitware, you will work on cutting-edge research alongside experts in the field, and our open source business model
means that your impact goes far beyond Kitware as you
become part of the worldwide communities surrounding
our projects.
Kitware employees are passionate and dedicated to innovative open-source solutions. They enjoy a collaborative
work environment that empowers them to pursue new
opportunities and challenge the status quo with new ideas.
In addition to providing an excellent workplace, we offer
comprehensive benefits including: flexible hours; six weeks
paid time off; a computer hardware budget; 401(k); health,
vision, dental and life insurance; short- and long-term disability; visa processing; a generous compensation plan;
yearly bonus; and free drinks and snacks. For more details,
visit our employment site at jobs.kitware.com.
Interested applicants are encouraged to visit our employment site at jobs.kitware.com and submit a resume and
cover letter through our online portal.
Contributors: Lisa Avila, Giles Richardson, Luis Ibáñez,
Matthew McCormick, Xiaoxiao Liu, Marcus Hanwell, David
Lonie, Chris Harris, Brian Avants, Michael Stauffer, Baohua
Wu, Nicholas Tustison, Arnaud Gelas, Philippe Pébay, Stephen
Aylward, Jean-Christophe Fillion-Robin, Julien Finet, Zach
Galbreath, Julien Jomier, and Aashish Chaudhary.
Graphic Design: Steve Jordan
Editors: Katie Sharkey, Katie Osterdahl
To contribute to Kitware’s open-source dialogue in future editions, or for more information on contributing to specific
projects, please contact us at [email protected].
This work is licensed under a Creative Commons
Attribution 3.0 Unported License.
Kitware, ParaView, CMake and VolView are registered trademarks of Kitware, Inc. All other trademarks are property of
their respective owners.