Reusable: (Meta)data meet domain-relevant community standards

The tale

Rebzuss was the last elf returning with a data chest. She was glowing with expectation and pride, because she had found a data chest containing a spell for turning water into gold.

The data wizard Fixeor Datahin looked at the chest label:

– “Well done, Rebzuss. This is exactly what we are looking for.” He opened the chest and started to frown:

– “Hmm … I can see that this is the right spell, but I can’t quite understand it. It looks like the spell was written by a data wizard trained at the data lab at Oxwart University. I recognize the peculiar use of logical symbols and the Oxwartish way of data handling in procedures 1 and 5. It will take us years to translate this spell into Scruby.”

Rebzuss looked sad – that was not the reaction she had hoped for. Suddenly, her face lit up and she said:

– “Overwizard Fixeor, why don’t we call upon the witch Lux Datastorm? Before working at the castle in Datamania, she studied at the data lab at Oxwart. It might be easy for her to translate the spell for us.”

The overwizard immediately summoned Lux Datastorm and showed her the data chest and its contents. She laughed: “I can see why you have difficulty understanding the spell. This is described just the way they do spells at Oxwart. Give me an afternoon and I will translate it into Scruby and our Datamanish procedures.”

Lux took the data chest and disappeared into her chamber.

The truth

Working with data sets from a variety of sources is much simpler if everybody agrees on a standard way of organising and describing the data. That is why many disciplines have created metadata standards for describing data, as well as lists of recommended file formats and similar guidelines. Following these standards makes new data part of an ecosystem of data that is easy and suitable for others to reuse. Therefore, you should always look out for the standards within your community and try to adhere to them. Not everything can be standardized, of course, and many research disciplines are breaking new ground where no standards exist yet. In that case, you can turn to more generic standards or begin developing new ones.

Note that “standard” in this sense is not a measure of the quality of the (meta)data. Quality should always be judged by the people who reuse the data in their specific contexts.