Visualization as I see it.

Open Source Scientific Visualization (vs. Information Visualization)

A colleague recently sent me a link to ParaView, an open source scientific visualization tool.  In my day job I usually work with information visualization techniques but that wasn’t always the case;  my visualization career began with scientific visualization (a post-doc working with 3D confocal microscope data).

So what’s the difference between scientific and information visualization?  According to Wikipedia, scientific visualization is:

… primarily concerned with the visualization of three-dimensional phenomena (architectural, meteorological, medical, biological, etc.), where the emphasis is on realistic renderings of volumes, surfaces, illumination sources, and so forth, perhaps with a dynamic (time) component

and information visualization is:

… the visual representation of large-scale collections of non-numerical information, such as files and lines of code in software systems, library and bibliographic databases, networks of relations on the internet, and so forth

A couple of examples are shown at the end of this post.

Personally, I think the main difference lies in the domains from which the data to be visualized is drawn.  Scientific visualization tends to deal with data from the physical sciences (astronomy, meteorology, physics, engineering, geology, biomedicine etc.) whereas information visualization works with data sets from other (non-physical) science domains and from non-scientific domains.  As a result the nature of the data and the questions being asked also differ, and they lend themselves to different visualization techniques.  So, for example, scientific visualization makes use of volume rendering and vector field diagrams, which provide a direct view of physical data, whereas information visualization uses parallel coordinates and network diagrams, which are abstract representations of non-physical data.  These are generalisations of course; information visualizations can be applied to physical data and scientific visualizations can be used with non-physical data, but as a general rule the difference is the data domains.

So, returning to ParaView, upon seeing it I wondered what other FOSS scientific visualization tools were available these days.  Here’s what I found:

  • ParaView: built atop the venerable VTK visualization toolkit
  • VisIt: developed by the Department of Energy to visualize the results of terascale simulations
  • OpenDX: IBM open-sourced their Data Explorer product

If you know of any others then please let me know.

Scientific Visualization: Star formation

Information Visualization: Internet map


Infographic Introspection

Infographics cop a lot of flak from the data visualization community for such things as their superficiality and contravention of good design principles.

Several designers have taken such criticism on board and responded to it in the only way they know how … as infographics about infographics.

Phil Gyford got the ball rolling with the infographic shown below:

Think Brilliant’s Dave Fields then broke every rule in the book with his Intimate Look at Infographics:

Then Tanner Ringerud donned his tinfoil hat and spilled the dirt on The Truth About Infographics with an Infographic Backlash Infographic.  Apparently, the plethora of infographics flooding the Internet were all part of a grand conspiracy to game Digg!

E Factor Media decided to take the matter more seriously with their Infographic.

Susannah Bandish explored the numbers behind 2010: the Year of the Infographic.

And most recently, Ivan Cash tried to cram every trick in the infographic playbook into his Infographic of Infographics.

If you come across any other infographics on infographics then please point them out to me.

Pervasive Visualization

Today I came across an interesting blog post listing 21 Everyday Visualizations.  As the title suggests the article lists 21 objects we encounter in everyday life that graphically encode information.  The article reminded me that visualization is pervasive; it’s not confined to the digital realm.

The objects identified include calendars, weather maps, traffic lights and scales.  The objects represent a fairly small set of visualization primitives:

  • maps
  • icons
  • gauges
  • dials
  • histograms
  • pie charts

For the full list visit the Epic. Graph blog.  If you can think of any other commonplace visualizations then please suggest them to Mark (the Epic. Graph blogger).  I’ve suggested the financial indicators used in business news (TV, print, web) to indicate daily movements in stocks, indices, exchange rates, commodity prices etc.

Data Wrangler & Google Refine

The productive folks over at the Stanford Visualization Group (also responsible for the wonderful Protovis toolkit) have released a new on-line tool called Data Wrangler:

Wrangler is an interactive tool for data cleaning and transformation.
Spend less time formatting and more time analyzing your data.

The video clip above shows Data Wrangler in action.

The tool is in alpha but looks very promising.

Google offer a similar tool, Google Refine, which takes a different approach to the problem of data preparation – see the video clip below for example.

So, we’re spoiled for choice when it comes to free, high quality tools for preparing data for visualization.  Next time you have a data set that needs scrubbing before presentation give Data Wrangler or Google Refine a go…

Number Picture: A New Kind of Social Visualization Tool

Number Picture is a new web-site whose aim is to provide a tool to enable people to visualize their data.  In this regard it is similar to previous social visualization efforts such as IBM’s Many Eyes, Tableau Public and the (now defunct) Swivel.  Where Number Picture differs considerably from these other tools is that the crowd-sourcing emphasis is on the visualization designs rather than the data – that is, users of the site are encouraged to contribute new visualization designs, called templates, with which to visualize data.

Currently, there are more than 30 templates.  Hopefully, this number will grow as designers contribute new templates to the site.  Templates are written using the Processing.js toolkit.  Instructions are provided to get you started.

The visualization shown above is one I created using Number Picture with data sourced from ONE The Data Report 2011.  The same data visualized using a different template is shown below.

Great Linux World Map

As a visualization and Linux enthusiast I couldn’t resist drawing attention to The Great Linux World Map published on the Dedoimedo blog.  All the main distros are represented with lots of Linux puns and in-jokes.

It reminded me of the maps of Online Communities Randall Munroe has published over the years on his XKCD web comic.  Just as Munroe has had to update his maps as the Internet landscape has evolved so too we can look forward to future renditions of the Linux World Map in keeping with the tectonic changes afoot.

Osama bin Laden’s Death: Mixed Emotions

Tweets per second.

Upon news of Osama bin Laden’s death Twitter recorded the highest sustained rate of tweets in its history.  They provided the annotated chart shown above in which the number of tweets per second exceeds 3000 for more than one and a half hours.
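To put that rate in perspective, here is a rough back-of-the-envelope sketch.  It assumes only the two figures quoted above (3000 tweets per second, sustained for an hour and a half) and so gives a lower bound, not Twitter's actual total; the function name is my own.

```python
TWEETS_PER_SECOND = 3000  # rate the chart shows being exceeded
DURATION_HOURS = 1.5      # "more than one and a half hours"


def minimum_tweets(rate_per_second, hours):
    """Lower bound on tweets sent if the rate never drops below rate_per_second."""
    return int(rate_per_second * hours * 3600)


print(minimum_tweets(TWEETS_PER_SECOND, DURATION_HOURS))  # 16200000
```

In other words, at least 16.2 million tweets were sent during that window.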

In the days that followed the public response has run the full gamut of emotions from wild rejoicing to deep sorrow, and I’ve found myself wondering what the prevailing mood of the population has been.  The New York Times has attempted to answer this question at least with regard to its readership.  They asked their readers:

Was his death significant in our war against terror? And do you have a negative or positive view of this event?

and 13,864 of them responded by plotting their answers on a graph and leaving a comment.  The graph is shown below; each dot represents a comment.  I couldn’t find any information regarding what the colour of each dot represents but I suspect it’s the number of respondents who clicked in that spot – the darker the colour the larger the number of clicks.

The chart is interactive allowing you to browse the respondents’ comments.  You can see that the positive & significant quadrant received the most responses.  There’s also a strong correlation between responses to the two questions (see the x = y cluster).  I can also see clusters of responses that represent strongly held positions: these lie in the corners of the chart and at the extremes of each axis.

Looking at the Mobile Patent Mess

The web of lawsuits filed for infringement of mobile technology patents grows more tangled with each passing month.  Several infographics have been published to try to help make sense of precisely who is suing whom.

1. Nick Bilton published the graphic below in the New York Times summarising the situation as it stood in March 2010.  The chart shows the various protagonists involved but some of the links correspond to multiple lawsuits, which isn’t represented by this graphic.

2. This was followed in October 2010 by a chart published in The Guardian.  It’s similar to Nick Bilton’s but distinguishes between lawsuits that are in progress vs. concluded.  The choice of colours is garish and kind of unnecessary as they don’t represent anything.  The layout could also be improved to reduce the number of crossed links.

3. Dave McCandless reworked The Guardian’s infographic to produce the following infographic for his Information is Beautiful blog.  Dave uses company logos which improves recognition of the warring parties, scales them according to their revenues and colours them to represent growing or shrinking incomes.  He also annotates the links with information about the lawsuits and their dollar values.

4. Design Language News tried a circular layout to improve clarity.

5. Florian Mueller focussed on the growing number of patent suits ranged against device-makers using the Android operating system.

6. Most recently Harry McCracken published the following “cheat sheet” in Technologizer.

I think Dave McCandless’ infographic is the best of the bunch as it manages to present the most information about the mobile patent wars.  Harry McCracken’s cheat sheet is a good compact representation of the situation – being a symmetric sparse matrix it could probably be compressed even further.
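To illustrate the compression idea: if the matrix simply records whether two companies are in litigation with one another then it is symmetric, so only the non-empty cells on one side of the diagonal need to be kept.  Below is a minimal sketch; the company labels and cell values are hypothetical placeholders, not the actual parties or lawsuit counts.

```python
def compress_symmetric(matrix, labels):
    """Keep only the non-zero cells above the diagonal of a symmetric matrix."""
    pairs = {}
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only; the lower half mirrors it
            if matrix[i][j]:
                pairs[(labels[i], labels[j])] = matrix[i][j]
    return pairs


# Hypothetical 4 x 4 matrix (placeholder labels, not real companies).
labels = ["W", "X", "Y", "Z"]
matrix = [
    [0, 1, 0, 0],
    [1, 0, 0, 2],
    [0, 0, 0, 0],
    [0, 2, 0, 0],
]

pairs = compress_symmetric(matrix, labels)
print(pairs)  # {('W', 'X'): 1, ('X', 'Z'): 2}
```

Here sixteen cells reduce to two entries, which is the sort of saving a sparse, symmetric chart allows.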

What’s your opinion?  Please leave a comment below.

If you spot any more such charts then please bring them to my attention.

Of Stack Flows and Lap Charts

Stack Flow of 590 Cities

I’m an avid follower of Formula 1 motor sport, so when I saw the “stack flow” visualization shown above on the Impure blog I was intrigued. The stack flow shows the 590 most populated cities sorted column-by-column according to their populations every five years between 1950 and 2010, and projected to 2025. This kind of data is similar to the lap-by-lap placings of drivers in a (F1) motor race. Shown below is the lap chart for the 2011 Chinese F1 Grand Prix.

As static displays of data both the stack flow and lap chart can be difficult to comprehend.  Did you notice Mark Webber’s amazing drive from 18th on the grid to 3rd at the chequered flag?  How about Dhaka’s rise from near the bottom of the rankings in 1950 to the world’s fourth largest city by 2025?

I didn’t think so.  I knew about the former so could look for it, but I was able to find the latter fairly easily only because the stack flow is interactive.  Brushing a city highlights it in the visualization, making it easier to see how its population rank changes over time (see examples below).  This interactive element is just what is needed to make lap charts more comprehensible.
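Both charts are built from the same kind of data: at each step (a lap or a census year) the entries are ordered and each is given a rank.  Highlighting one entry amounts to pulling out its rank at every step.  Here is a minimal sketch of that idea, using made-up placeholder values in place of real race or population data; the function names are my own.

```python
def rank_by_step(values_by_step, descending=True):
    """For each step, map every name to its rank (1 = first place)."""
    ranks = []
    for values in values_by_step:
        ordered = sorted(values, key=values.get, reverse=descending)
        ranks.append({name: pos for pos, name in enumerate(ordered, start=1)})
    return ranks


def trajectory(ranks, name):
    """Rank of one entry at every step: the line that brushing highlights."""
    return [step[name] for step in ranks]


# Hypothetical values for three entries over three steps (not real data).
steps = [
    {"A": 30, "B": 20, "C": 10},
    {"A": 30, "B": 20, "C": 25},
    {"A": 30, "B": 20, "C": 40},
]

ranks = rank_by_step(steps)
print(trajectory(ranks, "C"))  # [3, 2, 1]
```

For a lap chart the race positions are already ranks, so only the trajectory step is needed.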

Stack Flow for 590 Cities (Dhaka highlighted)


The Choreography of Sorting Algorithms

As a computer science undergraduate I spent many hours learning various sorting algorithms. Pseudo-code and static diagrams were used to illustrate the implementation and processing of these algorithms. Now (almost a quarter of a century later!) it seems algorithms are still an important part of computer science education. What’s changed is that Web 2.0 technologies are being used to aid understanding.

The video clip shown above represents the Bubble Sort algorithm realized in the style of a Hungarian folk dance. Whether this represents the best visualization of sorting algorithms to date, I’m not so sure, but it’s certainly the most bizarre I’ve seen and is strangely compelling. The clips were created at Sapientia University, Romania and choreographed by Füzesi Albert.
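For anyone who hasn't met Bubble Sort, here is a minimal Python sketch of the textbook algorithm.  It is not a transcription of the dance, and the function name is my own.

```python
def bubble_sort(values):
    """Return a sorted copy of values using the textbook Bubble Sort."""
    items = list(values)  # work on a copy so the input is left untouched
    n = len(items)
    for end in range(n - 1, 0, -1):
        swapped = False
        # Compare each adjacent pair up to the last unsorted position.
        for i in range(end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:
            break  # a full pass with no swaps means the list is sorted
    return items


print(bubble_sort([3, 0, 1, 8, 7, 2, 5, 4, 6, 9]))
```

Each pass carries the largest remaining value to the end of the unsorted part, one adjacent swap at a time.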

Below are several other folk dances choreographed to implement various sorting algorithms.

Insert Sort

Shell Sort

Select Sort
