MyData Dictionary

The MyData Dictionary by the MyData Operators group is licensed under Creative Commons Attribution-NoDerivatives 4.0 International (CC BY-ND 4.0).

The MyData Dictionary is an articulation of the key data fields an individual would wish or expect to see in a data set or database that is designed to empower them, and which they control. The MyData Dictionary is viewed and built from the perspective of the individual, not from that of organisations.

To bring that to life: there are some 7.8 billion humans on our planet and, subject to some known and very specific anomalies, all have:

  • A name(s)
  • A gender
  • A date, time and place of birth
  • Biological parents (with names, times and places of birth)
  • A home/contact address

A subset of these people will also have some very common additions:

  • Phone number(s)
  • Employer(s)

This list of fields does not claim or wish to be based on any one particular technology or existing standard. A first release is available in JSON-LD format. Clearly there will be anomalies to take into account, but we have built for the simpler majority at this stage.
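As an illustration only, the fields listed above could be assembled into a JSON-LD style record as sketched below. The published release is not reproduced here, so the context URL and every key name (`names`, `birth`, `biologicalParents`, `addressLines`) are hypothetical placeholders rather than the actual MyData Dictionary vocabulary.

```python
import json

# Hypothetical context URL; the real JSON-LD release defines its own.
CONTEXT_URL = "https://example.org/mydata-dictionary/v1.0"


def build_core_record(names, gender, birth, parents, address_lines):
    """Assemble the core fields into a JSON-LD style dictionary.

    All key names are illustrative assumptions, not the published schema.
    """
    return {
        "@context": CONTEXT_URL,
        "@type": "Person",
        "names": list(names),
        "gender": gender,
        "birth": dict(birth),  # date, time and place of birth
        "biologicalParents": [dict(p) for p in parents],
        "addressLines": list(address_lines),  # kept as separate raw lines
    }


record = build_core_record(
    names=["Ada Lovelace"],
    gender="female",
    birth={"date": "1815-12-10", "time": None, "place": "London"},
    parents=[{"names": ["Anne Isabella Milbanke"]}, {"names": ["George Gordon Byron"]}],
    address_lines=["12 Example Street", "London"],
)

serialised = json.dumps(record, indent=2)
print(serialised)
```

The record is an ordinary dictionary, so any service that can read JSON can consume it without JSON-LD specific tooling.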

The core data dictionary will be useful in the following scenarios:

  • As an individual, I expect the many online forms I need to fill in to recognise and work with the core human schema. Therefore, if the services I engage with utilise the core schema, form filling can become more automated and less onerous.
  • As an individual, I want to be able to pick up my data from one personal data service (locker, etc.) and drop it into another with minimal fuss. Therefore, if these services share, or at least understand (can map to and from), a commonly understood reference point, then that will happen more easily.
  • As an individual, I want to be able to have applications run on my personal data, even when it exists in multiple different services. Therefore, if these applications can all utilise a common, shared reference point, or at least map to and from such a thing, then that will happen more easily.
  • As an individual, I want the apps I had running on my personal data in one locker service to work when I move my data to another one. Therefore, if these applications share, or at least understand (can map to and from), a commonly understood reference point, then that will happen more easily.
  • As an application developer, I want the apps I build to run, with minimal overhead, across multiple personal data services. Therefore, if I can build on the assumption that these data services share a common model, or at least a common reference I can map to and from, then that will work more easily.
  • As an organisation willing to receive and respond to MyData-style data feeds, I don’t want to have to set up a different mechanism for each MyData operator. Therefore, with a common model I can confidently build the APIs I will need to help me co-manage data with my customers.
  • As an organisation willing to provide/return data to customers as part of our proposition to them, I want to be able to make this data available in the minimum number of ways that meet the users’ needs, not in a different format for each personal data service provider. Therefore, I can build my data portability mechanisms in the knowledge that they will work for the many different ways in which my customers may wish to port their data.
  • As an organisation willing to provide/return data to customers as part of our proposition to them, I don’t want to have a different button/connection mechanism for each personal data service provider on the ‘customer signed in’ page of my website. Therefore, I can use some standardisable components to support subject access and data portability requests from my customers.
  • As a new MyData Operator, I don’t want to have to reinvent the core database design from scratch; I’d like to build to an agreed model. Therefore, I can get up and running more quickly, and have a degree of interoperability from day one.
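Several of these scenarios rely on services being able to "map to and from" a common reference point. A minimal sketch of that idea follows, assuming two hypothetical locker services with their own field names and a hypothetical set of reference field names; none of these names come from the published dictionary.

```python
# Each service declares how its local field names map to the shared reference.
# Both mappings and all field names are hypothetical.
LOCKER_A_TO_REFERENCE = {
    "first_name": "givenName",
    "surname": "familyName",
    "mobile": "phoneNumber",
}
LOCKER_B_TO_REFERENCE = {
    "forename": "givenName",
    "lastName": "familyName",
    "tel": "phoneNumber",
}


def to_reference(record, mapping):
    """Translate a service-specific record into reference field names."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def from_reference(record, mapping):
    """Translate a reference record back into a service's own field names."""
    inverse = {ref: local for local, ref in mapping.items()}
    return {inverse[key]: value for key, value in record.items() if key in inverse}


def port(record, source_mapping, target_mapping):
    """Move a record between two services via the shared reference."""
    return from_reference(to_reference(record, source_mapping), target_mapping)


in_locker_a = {"first_name": "Ada", "surname": "Lovelace", "mobile": "+44 20 7946 0000"}
in_locker_b = port(in_locker_a, LOCKER_A_TO_REFERENCE, LOCKER_B_TO_REFERENCE)
print(in_locker_b)
```

With a shared reference, each service maintains one mapping instead of one mapping per counterpart service. Fields that a mapping does not cover are silently dropped in this sketch; a real implementation would need to decide how to handle them.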

It is worth noting that the list of standardisable fields is only a starting point; there are many hundreds of additional data attributes for which the individual is undoubtedly the best originator. We will aim to update and extend the MyData Dictionary monthly. We will maintain release numbering; this initial version is 1.0.

Multiple language support will evolve in parallel.

Questions/ Suggestions

Any questions or suggestions for future iterations of the MyData Dictionary should be sent to: operators@mydata.org

Notes

Note 1: As human-centric data is, by definition, life-long, a general principle is that each entry has a validity period as metadata. That is to say, ‘my employer’ has a start date and potentially an end date. A related point is that an individual can have many variants of the same field (e.g. my email address), each with a validity period, and each ideally with tags to help describe or further delineate it.
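One way to picture Note 1 is sketched below, assuming a simple in-memory structure. The `Entry` class, its attribute names and the `values_on` helper are hypothetical and are not part of the dictionary itself.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Entry:
    """One value of one field, with its validity period and descriptive tags."""

    name: str
    value: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the entry is still valid
    tags: List[str] = field(default_factory=list)

    def is_valid_on(self, day: date) -> bool:
        """Return True if this entry was valid on the given day (inclusive)."""
        return self.valid_from <= day and (self.valid_to is None or day <= self.valid_to)


def values_on(entries, name, day):
    """Return every variant of a field that was valid on the given day."""
    return [e.value for e in entries if e.name == name and e.is_valid_on(day)]


entries = [
    Entry("email", "ada@old-employer.example", date(2015, 1, 1), date(2019, 12, 31), ["work"]),
    Entry("email", "ada@new-employer.example", date(2020, 1, 1), None, ["work"]),
    Entry("email", "ada@home.example", date(2010, 6, 1), None, ["personal"]),
]

print(values_on(entries, "email", date(2018, 5, 1)))
print(values_on(entries, "email", date(2024, 5, 1)))
```

Because old entries are closed with an end date rather than overwritten, the individual keeps a life-long history of each field.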

Note 2: Field definitions and descriptions must be precise and carry sufficient detail and pre-work to be unambiguous, i.e. be well defined.

Note 3: At this stage, we are not dealing with derived or concatenated fields; this should be the raw, low-level individual fields (so address lines are held individually rather than combined, and there are no derivations such as ‘age’ from ‘birth date’).
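To make the distinction in Note 3 concrete, the sketch below stores only raw fields and computes the derived values when they are needed. The record keys and the helper functions `age_on` and `format_address` are hypothetical illustrations.

```python
from datetime import date

# Only raw, low-level fields are stored; key names are hypothetical.
raw_record = {
    "birth_date": date(1990, 6, 15),
    "address_line_1": "12 Example Street",
    "address_line_2": "",
    "address_line_3": "London",
}


def age_on(birth_date, today):
    """Derive age in whole years from the stored birth date."""
    years = today.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred this year.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def format_address(record):
    """Combine the individually stored address lines, skipping empty ones."""
    lines = [record[f"address_line_{i}"] for i in (1, 2, 3)]
    return ", ".join(line for line in lines if line)


print(age_on(raw_record["birth_date"], date(2024, 6, 14)))
print(format_address(raw_record))
```

Keeping derivations out of the stored data means a value such as age never goes stale; it is recomputed from the birth date whenever an application asks for it.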