People who work with data know how much its usefulness depends on metadata. Metadata describe the characteristics of data, such as who created it and when. Still, data are often published without a sufficient description of their characteristics, which limits their use and undermines user trust. It seems that owners and providers of open data are not taking the usability perspective; they focus on publishing data regardless of whether the metadata are complete.
To illustrate the importance of metadata, imagine that you are interested in a car parked outside a car dealership, but it lacks information about price, mileage, year, fuel type (petrol or diesel), and other important facts about the car’s condition. The absence of facts about the car’s features will probably dampen your interest, since the offer does not seem serious.
Open data is a digital resource that can be copied and used by anyone, but for it to be useful, there must be facts about the characteristics of the data, such as who or what created it and which standards and reference systems are being used. This sounds obvious, but a quick examination of a number of open data sources shows clear flaws and shortcomings concerning metadata. In some cases, internal data have been published directly without accompanying metadata. This suggests that not enough attention has been devoted to metadata for existing data sources. Data owners and providers need to be more meticulous about metadata before publication, since missing metadata limits both the use of the data and its machine readability. The choice of data format also has a major impact on how well metadata can be specified. But a data format with good support for metadata is of no use unless the owner or provider makes sure to minimize inaccuracies and flaws before publishing.
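As a rough illustration of what such a pre-publication check could look like, the sketch below lists which descriptive fields are missing from a dataset's metadata record. The field names (`creator`, `created`, `standard`, `reference_system`), the function `missing_metadata`, and the example values are assumptions made for this example; they are not taken from any particular open data standard.

```python
# Hypothetical set of fields a publisher might require; adjust to your own policy.
REQUIRED_FIELDS = ("creator", "created", "standard", "reference_system")


def missing_metadata(metadata: dict) -> list[str]:
    """Return the required fields that are absent or empty in `metadata`."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        # Treat None and blank strings as "not provided".
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


# Example record with made-up values: the reference system was left blank.
record = {
    "creator": "Example Municipality",
    "created": "2020-01-15",
    "standard": "ISO 8601 dates",
    "reference_system": "",
}

gaps = missing_metadata(record)
if gaps:
    print("Do not publish yet, missing metadata:", ", ".join(gaps))
else:
    print("Metadata looks complete.")
```

A check like this only confirms that the fields are filled in. Whether the values are accurate still has to be verified by the data owner.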
The table below shows common data formats and their metadata capabilities.
| Format | Metadata | Description |
|---|---|---|
| ZIP (compressed file) | None | No support for metadata |
| CSV (comma-separated file) | None | No support for metadata, except that the first row can contain the column names |
| | Limited | Metadata about the creator and date |
| Spreadsheet (Excel) | Limited | Metadata about the creator, date, and data types. A special program or module is needed to extract the metadata. Metadata is not a natural part of the format, and the format is proprietary |
| JPG, PNG | Fair | Metadata about the creator, date, license terms, geographic location, camera settings, and more |
| JSON, XML | Excellent | Allows defining metadata structures that describe owner, date, time zones, complex data types, and validation of appropriate values. The formats allow linking to metadata schemas that are part of a hierarchy of objects belonging to a taxonomy |
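To make the contrast between the CSV and JSON rows concrete, here is a minimal sketch using only Python's standard `csv` and `json` modules. The layout of the JSON document (a `metadata` block beside a `data` block, with `min` and `max` per column) and the helper `out_of_range` are invented for this example. It is not JSON Schema or any published metadata standard, only an illustration of how a format that carries metadata lets a program validate values.

```python
import csv
import io
import json

# A CSV file can only hint at meaning through its header row.
csv_text = "station,temp\nA1,4.2\n"
reader = csv.reader(io.StringIO(csv_text))
header = next(reader)  # the column names are the only "metadata" available

# A JSON document can carry a metadata block next to the data itself.
# All names and values below are made up for illustration.
json_text = """
{
  "metadata": {
    "owner": "Example Agency",
    "created": "2020-01-15T08:00:00+01:00",
    "columns": {
      "station": {"type": "string"},
      "temp": {"type": "number", "unit": "celsius", "min": -90, "max": 60}
    }
  },
  "data": [{"station": "A1", "temp": 4.2}]
}
"""
doc = json.loads(json_text)


def out_of_range(doc: dict) -> list[tuple[int, str]]:
    """Return (row index, column) pairs whose values violate the declared min/max."""
    problems = []
    columns = doc["metadata"]["columns"]
    for i, row in enumerate(doc["data"]):
        for name, spec in columns.items():
            value = row.get(name)
            # Only numeric columns with a value present are range-checked.
            if spec.get("type") != "number" or value is None:
                continue
            if "min" in spec and value < spec["min"]:
                problems.append((i, name))
            elif "max" in spec and value > spec["max"]:
                problems.append((i, name))
    return problems


print(header)                    # column names from the CSV header
print(doc["metadata"]["owner"])  # owner recorded inside the JSON document
print(out_of_range(doc))         # rows that break the declared limits
```

With the CSV file, a consumer has to guess the unit and the plausible range of `temp`. With the JSON document, both are stated in the file and can be checked automatically.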
Icon made by Freepik from www.flaticon.com, licensed under CC BY 3.0