Speed Up Workshop in Milan | Visualising Ageing

DensityDesign is a member of the European research project ‘Electronic Maps to Assist Public Science’ (EMAPS). From the 22nd to the 24th of May we organized a workshop in which nine of our best students worked with our staff to create several visualizations on the topic of population ageing.

The workshop was coordinated by Professor Paolo Ciuccarelli (DensityDesign’s Scientific Director), together with PhD students Michele Mauri and Azzurra Pini. The participants were: Benedetta Signaroldi, Carlo De Gaetano, Federica Bardelli, Francesco Faggiano, Gabriele Colombo, Giulia De Amicis, Stefania Guerra, Stefano Agabio and Valerio Pellegrini.

The DensityDesign workshop followed the one held at Science Po, which was intended to collect data about the ageing phenomenon and provided the datasets for our visualizations. The Science Po team identified the most useful data sources and created a specific protocol to gather data from each of them. The data format depended on the nature of the data: each topic was described by several kinds of data, such as tables, networks and unstructured text.

The aim of our workshop was to provide a prompt answer to some specific questions about the changing nature of ageing. We tried to overcome the limits of traditional visual models by experimenting with different solutions in order to find the most suitable one. Our project focused on four main topics, and to achieve the best results we asked the students to work in small groups.

You can find here all the produced visualizations, as a PDF file.

Preparatory phase

The workshop setup was different from our typical workflow. When we work on visualizations, we usually define several visual levels. These are the result of a preliminary phase of frequent exchanges with the client or “domain expert”. The most difficult part is to make them aware of the real potential of data visualization, without letting them underestimate or (worse) overestimate it. Then we start working on drafts and basic visualizations in order to identify the core of the communication, the most important things to convey to people. Only at this point do we start working on the final visualization, identifying the visual levels (mood, colours, metaphors, typography) that fit the specific topic.

Setting up this kind of process would have been impossible within the time available. As the conditions were quite unique, we worked in a different way.

Michele Mauri and Azzurra Pini were responsible for the first part. In collaboration with Donato Ricci and Tommaso Venturini from Science Po, we identified a specific question for each topic. Questions need to be focused, meaning that they should have a specific but not forced answer (there is more than one possible solution). For example, for the analysis of the words used by websites related to ageing, one of the questions was “How do official and unofficial websites talk about ageing?”.

Then we created a visual format, defining standard sizes, a colour palette and a set of typefaces. A shared format is very important when working with a large number of people, because it forces everyone to “speak” the same visual language. Obviously it can’t be too strict, otherwise it becomes impossible to express different concepts.

We prepared formats of different sizes for different purposes. The small ones, A4 and A3, are intended to be read as a booklet, whereas the bigger one (70x70cm) is intended for group interaction, allowing more than one person to look at it and discuss the topic.

The workshop

Participants were divided into four teams, each of which worked on a specific topic and dataset. The selected topics were:

- Wikipedia pages network. The network of links starting from the “Ageing” Wikipedia page.

- Google AdWords. A list of keywords suggested by Google for ageing-related products, and their effectiveness.

- Meetic. A list of 2,000 profiles of people over 65: a dataset on how they describe themselves and how they describe their ideal partner.

- Semantic analysis of websites. The most used words on websites related to ageing, both official and unofficial.

Ageing on Wikipedia

This topic had already been visualised with a network graph by Science Po. Our aim was to redesign it while avoiding the “hairball” problem: the high density of connections that makes a network hard to read.

TEAM:

Benedetta Signaroldi, Gabriele Colombo.

DATA:

Starting from the Wikipedia page on Ageing, Science Po researchers crawled all the linked pages, creating a network. This is a directed network, that is, each link has a specific direction from one node to another, so two values were taken into account for each node: in-degree (the number of other pages linking to it) and out-degree (the number of other pages it links to). Nodes were divided into 8 custom categories, which do not come from Wikipedia but were identified by Science Po researchers.
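As a rough illustration of the two values, the following Python sketch counts in-degree and out-degree from a directed edge list. The page names and the edge list are invented for the example and are not taken from the workshop dataset.

```python
from collections import Counter

# Hypothetical edge list (source page, target page); names are made up.
edges = [
    ("Ageing", "Gerontology"),
    ("Ageing", "Senescence"),
    ("Gerontology", "Ageing"),
    ("Senescence", "Gerontology"),
]


def degrees(edge_list):
    """Return (in_degree, out_degree) counters for a directed edge list."""
    in_degree = Counter()
    out_degree = Counter()
    for source, target in edge_list:
        out_degree[source] += 1  # the source page links out once more
        in_degree[target] += 1   # the target page is linked once more
    return in_degree, out_degree


in_degree, out_degree = degrees(edges)
for page in sorted(set(in_degree) | set(out_degree)):
    print(page, "in:", in_degree[page], "out:", out_degree[page])
```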

QUESTION:

Does a Wikipedia page refer only to other pages in the same category? Are there differences between in-links and out-links?

VISUALIZATION:

Three different visualizations were developed for this topic. The first two are variants of the same concept: avoiding the overload of connections in the network graph.

Science Po produced a first visualization of this data using a network graph, and asked us to create different visualizations to make it more readable. In the original graph the information is conveyed only by the position of the nodes, computed with a spatialization algorithm. The coloured edges, instead, convey a lot of noise and little information. Due to the complexity of the relationships, most of the edges were impossible to read. For this reason we decided to aggregate them to convey more effective information. For each node we computed the total number of incoming links (in-degree) and the total number of outgoing ones (out-degree). This first step is very useful to gather more information about each node: which one is the most connected? Is there a node that acts as a reference? Is the balance between in-degree and out-degree similar for every node?

Both in-degree and out-degree are subdivided by the categories of the incoming and outgoing links. This information is useful to understand how each node is related to other categories: do nodes from other categories link to it? Are the categories of its in-links the same as those of its out-links?
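The subdivision by category can be sketched in Python as follows. This is only a minimal illustration: the pages, the category names and the `degree_by_category` helper are hypothetical and do not reproduce the 8 categories defined by Science Po.

```python
from collections import Counter, defaultdict

# Hypothetical page -> category mapping and directed edge list.
category = {
    "Ageing": "Biology",
    "Senescence": "Biology",
    "Gerontology": "Medicine",
    "Pension": "Society",
}
edges = [
    ("Ageing", "Gerontology"),
    ("Ageing", "Senescence"),
    ("Gerontology", "Ageing"),
    ("Pension", "Ageing"),
]


def degree_by_category(edge_list, category_of):
    """For each node, count in-links and out-links per category of the other end."""
    in_by_cat = defaultdict(Counter)
    out_by_cat = defaultdict(Counter)
    for source, target in edge_list:
        out_by_cat[source][category_of[target]] += 1
        in_by_cat[target][category_of[source]] += 1
    return in_by_cat, out_by_cat


in_by_cat, out_by_cat = degree_by_category(edges, category)
print("in-links of Ageing by category:", dict(in_by_cat["Ageing"]))
print("out-links of Ageing by category:", dict(out_by_cat["Ageing"]))
```

Each pair of counters holds the values that one glyph would need: the slices of the in-degree part and the slices of the out-degree part.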

We decided that the best way to encode this information was through glyphs. Glyphs are graphical entities whose parts are visual encodings of data. A simple example of a glyph is a “pie chart”: a graphical entity (the full pie) made up of several parts (the slices). With glyphs it is also possible to encode data through their positions, for example on a geographical map or (in our case) using a layout algorithm for graphs.

We experimented with two different kinds of solutions.

In the first map (table 01/A) the total degree is mapped as a ring around each node. The ring is divided into two parts, the in-degree and the out-degree through a separation line. Each part is then divided, like a pie chart, by its categories. In the second map (table 01/B) each node is represented as a double histogram. The left part represents the in-degree, divided by category, while the right side visualises the out-degree.

The third visualization (table 01/C) is an aggregated view of the network, and it considers categories instead of single nodes. A radar plot for each category shows the number of links to or from all the other categories (considering the general degree). This visualization gives an overall view of the graph. It is useful to understand how each category behaves in relation to the others: whether it links only to itself, or whether it is mainly linked to another one.
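The aggregation behind such a radar plot could look like the Python sketch below. It assumes that the "general degree" is the sum of links in both directions between two categories, and that links inside a category are counted once; the data and function names are hypothetical.

```python
from collections import Counter

# Hypothetical page -> category mapping and directed edge list.
category = {
    "Ageing": "Biology",
    "Senescence": "Biology",
    "Gerontology": "Medicine",
    "Pension": "Society",
}
edges = [
    ("Ageing", "Gerontology"),
    ("Ageing", "Senescence"),
    ("Gerontology", "Ageing"),
    ("Pension", "Ageing"),
]


def category_links(edge_list, category_of):
    """Count links for each (source category, target category) pair."""
    return Counter((category_of[s], category_of[t]) for s, t in edge_list)


def radar_values(links, focus, categories):
    """One value per radar axis: links between `focus` and each category."""
    values = {}
    for other in categories:
        if other == focus:
            values[other] = links[(focus, focus)]  # internal links, counted once
        else:
            values[other] = links[(focus, other)] + links[(other, focus)]
    return values


links = category_links(edges, category)
categories = sorted(set(category.values()))
biology = radar_values(links, "Biology", categories)
print(biology)
```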

Ageing Advertising

This is a series of maps showing how Google classifies keywords related to ageing for online advertising.

TEAM:

Federica Bardelli, Stefano Agabio

QUESTION:

Which are the most offered services on the web, and which are the most searched?

DATA:

The dataset was scraped from the Google AdWords keyword tool. The tool suggests keywords related to a particular query, helping to get a better placement for your product. For each suggested keyword (or group of keywords) the tool specifies the competition (how many other sites are using the same keyword) and the demand (the search volume). These two values are expressed qualitatively (high, medium, low).

61 queries related to ageing, divided into 10 categories, were tested this way. This produced about 4,300 related keyword groups and a total of 380 single keywords.

VISUALIZATION:

Two different visual models have been used for this topic.

The first one was used to design a series of similar visualizations at different scales, because of the high number of single records in the dataset. We found that the best way to represent this type of data was the bubble chart, because it helps to visualise correlations between dimensions. On the horizontal axis we placed the competition value, on the vertical axis the search volume. The size of each bubble represents the number of suggestions.

In the first visualization of this series (table 2A) data is aggregated by category. Tables 2B to 2M visualize all the single queries in the same way, expanding each category.

The biggest problem was visualizing data at the smallest scale, that of the single suggestion. As each keyword can take only three possible values (high, medium, low) on each axis, plotting them in a bubble chart would make them overlap (table 2N). A possible solution was to visualise, for each query, the number of results in each of the 9 resulting areas (table 2O). This visualization gives a deeper insight, showing how the suggested words are positioned in the market. Table 2Q visualizes all the queries with the same visual model.
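The 9 areas come from combining the three competition levels with the three demand levels. A minimal Python sketch of the counting step, with invented keywords and a hypothetical `grid_counts` helper, could be:

```python
from collections import Counter

LEVELS = ("low", "medium", "high")

# Hypothetical suggestions for one query: (keyword, competition, demand).
suggestions = [
    ("anti ageing cream", "high", "high"),
    ("best wrinkle cream", "high", "medium"),
    ("retirement homes", "medium", "high"),
    ("face cream", "high", "high"),
]


def grid_counts(rows):
    """Count suggestions in each of the 3x3 competition/demand areas.

    Rows of the result go from high to low demand (vertical axis),
    columns from low to high competition (horizontal axis).
    """
    counts = Counter((competition, demand) for _, competition, demand in rows)
    return [[counts[(c, d)] for c in LEVELS] for d in reversed(LEVELS)]


grid = grid_counts(suggestions)
for demand, row in zip(reversed(LEVELS), grid):
    print(f"{demand:>6} demand:", row)
```

Counting per area replaces the overlapping bubbles with at most nine values per query, which is what makes the single-suggestion scale readable.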

The second visualization (table 2R) visualizes the network of keywords for each market area. The aim is to find similarities and differences between them.

Ageing Talking

These visualizations show the most popular words used on websites to talk about ageing.

TEAM:

Carlo De Gaetano, Giulia De Amicis, Francesco Faggiano

QUESTION:

Do official and unofficial sources describe ageing with different words?

DATA:

A list of websites was analysed, automatically extracting the most relevant words. Each site was classified as official or unofficial. Each site was also tagged by its nature (commercial, non-profit, institutions…) and by its technical nature (blog, forum…). For each word, an index (or degree) of “officiality” was created, based on the proportion between the official and unofficial sources linked to it. The number of sources using each word was also computed.
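The exact formula of the index is not given here. As one possible reading, the Python sketch below assumes that the officiality of a word is the share of official sites among all the sites using it, so that 1.0 means only official sources and 0.0 only unofficial ones. The sites and words are invented for the example.

```python
from collections import defaultdict

# Hypothetical input: site -> (is_official, set of extracted words).
sites = {
    "ministry.example": (True, {"pension", "care"}),
    "agency.example": (True, {"care", "health"}),
    "blog.example": (False, {"care", "wrinkles"}),
    "forum.example": (False, {"wrinkles"}),
}


def officiality(site_data):
    """Return word -> (officiality index, number of sources using the word).

    Assumption: index = official sources / all sources using the word.
    """
    official = defaultdict(int)
    total = defaultdict(int)
    for is_official, words in site_data.values():
        for word in words:
            total[word] += 1
            if is_official:
                official[word] += 1
    return {word: (official[word] / total[word], total[word]) for word in total}


index = officiality(sites)
for word in sorted(index):
    share, sources = index[word]
    print(f"{word}: officiality {share:.2f}, used by {sources} source(s)")
```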

VISUALIZATION:

The first group of visualizations (tables 3A, 3B) shows the original graph. In the first one (3A) the names of the websites are highlighted and coloured according to their typology (official or unofficial).

The second one (3B) represents the same network but highlights the keywords, colouring them according to the proportion of official and unofficial sites linking to them.

These two visualizations are a visual explanation of the process: none of the following tables shows websites, focusing instead on keywords.

Chart 3C shows all the keywords polarised by their level of officiality. Keywords are positioned from left to right according to the number of mentions by official sites.

Chart 3D shows all the keywords once more, but polarized by the type of website using them. The outer ring shows the website categories; each arc is proportional to the number of websites in that category. Words are placed near the category that uses them the most, while keywords in the middle are used equally by all categories. Colour represents the officiality index. This table helps us to understand which words are mostly used by a category of websites, and whether a category attracts official or unofficial words.

Table 3E uses the same visual model to visualise the distribution of keywords across a classification based on the websites’ technical nature.

Ageing on date

These visualizations show how elderly people describe themselves on a popular dating website (Meetic) and how they describe their desires.

TEAM:

Valerio Pellegrini, Stefania Guerra

QUESTION:

How do elderly people describe themselves for a possible date? How do they describe their wishes? Is there a match between the characteristics on offer and what people seek?

DATA:

2,000 profiles of elderly people (65+) were taken from the dating website Meetic. These profiles are proportionally divided into four categories: men looking for women, men looking for men, women looking for men, women looking for women.

For each profile, 22 main characteristics (e.g. relationship, nationality, entertainments…) and 16 additional ones (e.g. sports, my pets, political views…) were collected.

For each personal profile, the description of the man/woman the user is looking for was also collected. This description consists of 37 characteristics.

VISUALIZATION:

The first two visualizations show which answers are used most across all profiles. Chart 4A shows how elderly people describe themselves. Each circle represents an answer, while the colour represents the question. By looking at this chart it is possible to understand which are the most used characteristics. Chart 4B uses the same visual model to show how the ideal partner is described.

Charts 4C to 4F show similarities and differences between the four types of profile. Each chart shows the most frequent answer for each question. Charts 4G to 4L show how the ideal profile is described.
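Finding the most frequent answer for each question within each type of profile can be sketched in Python as follows. The profiles, questions and answers below are invented, and `most_frequent_answers` is a hypothetical helper, not the procedure actually used during the workshop.

```python
from collections import Counter, defaultdict

# Hypothetical profiles: (profile type, {question: answer}).
profiles = [
    ("men looking for women", {"relationship": "widowed", "my pets": "dog"}),
    ("men looking for women", {"relationship": "divorced", "my pets": "dog"}),
    ("men looking for women", {"relationship": "widowed", "my pets": "cat"}),
    ("women looking for men", {"relationship": "widowed", "my pets": "cat"}),
]


def most_frequent_answers(profile_list):
    """Return profile type -> {question: most frequent answer}."""
    counts = defaultdict(Counter)
    for profile_type, answers in profile_list:
        for question, answer in answers.items():
            counts[(profile_type, question)][answer] += 1
    top = defaultdict(dict)
    for (profile_type, question), counter in counts.items():
        # most_common(1) returns a list with the single top (answer, count) pair
        top[profile_type][question] = counter.most_common(1)[0][0]
    return dict(top)


top = most_frequent_answers(profiles)
for profile_type, answers in top.items():
    print(profile_type, answers)
```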
