The elf Imka had found a data chest that the data wizard Dorky was about to open. Imka was so curious that he was allowed to stay and watch as Dorky opened the chest.
– “Look”, exclaimed Imka, as they opened the chest. “It holds Carmix bubbles. Isn’t that amazing? These can certainly help in turning water into gold, right?“
– “Yes, absolutely”, said Dorky. “But take a closer look, they have markings on them. These are so-called Polymixic markings. Only very few wizards – if any – know how to interpret these. They indicate how each of the Carmix bubbles are associated to the others, and how to place them in the right order.”
– “Polymixic markings … I have never heard of them”, said Imka scratching his head.
– “I only know them from a tale, and I actually did not believe in their existence,” said Dorky gloomily. “Once, a wizard named Yrky worked at the castle, and he apparently knew a wizard who could sort these Polymixic markings. But he died only 523 years old.”
– “But he must have left something that can help us”, Imka cried out in a passionate voice. But the expression on Dorky’s face made him fall silent.
– “I’m sorry, little fellow”, Dorky muttered apologetically. “Without the right understanding of the logic of Polymixic markings, we can’t really do anything. But they taste deliciously”, he said, as he sank his teeth into one of them, and the bubble emitted a little fizz, as he began to chew.
A vocabulary is only good, if it is accessible and allows for the right interpretation of the data. This principle highlights the importance of using vocabularies that are common to the community and well documented, and can be referred to using persistent identifiers. Usually, you will find this type of vocabulary, taxonomy etc. within your research discipline or maybe in other disciplines, where these are developed. Evaluating a vocabulary often includes looking for its creator and checking whether it is still maintained and updated. These can be complex vocabularies; or simple mark-ups like ISO standard strings for representing countries in a data set. E.g. Denmark is DNK in ISO 3166-1 alpha-3. In terms of interoperability, this is far better than writing ‘Danmark’, ‘Denmark’, ‘Dänemark, ‘Dinamarca’ etc. in your (meta)data.