Date:
13 June 2018
Author:
Alfred Deeb

Open data policy

The Australian Department of the Prime Minister and Cabinet (PM&C) directs government entities to “make non-sensitive data open by default”. The states also have their own policies and initiatives around open data and the importance of making relevant government data accessible and reusable.

Putting this policy into practice requires a framework of standards and procedures to see the data through its lifecycle and ensure roles and responsibilities are clearly defined. Below we cover the main roles, data quality and some FAQs.

Open data governance

Data governance sets up the policies and procedures that need to be followed. One example is the DataVic Access Policy and the DataVic Access Policy Guidelines, which cover the process of making data available and some of the key accountabilities.

Overview of the main roles

There are generally three main roles within open data governance. They are:

  1. Data owners
  2. Data custodians
  3. Data stewards

However, different organisations may operate different models. For example, in Victoria, there’s usually an agency owner (with an accountable officer from that agency) and a custodian.

Data owners

The data owner is the person/organisation that ‘owns’ the data. In research fields (including government research) this is usually the agency/organisation that commissioned the research and paid for the data collection.

In more general government settings, the data owner is the department that collected and/or generated the data.

The data owner is the individual who’s ultimately accountable for decisions, but they’re expected to delegate responsibilities where appropriate.

Main responsibilities:

  • Develop policies, protocols and guidelines in relation to the information asset, process and/or system

  • Approve the classification of information assets to ensure integrity

  • Ensure data collections are adequately protected by implementing the relevant security controls

  • Approve significant changes to the data collection, process and/or system

  • Delegate responsibilities for decisions and tasks to custodians.

Looking at data.vic, an example of a dataset might be crash statistics. In this case, the data owner is VicRoads.

Data custodians

The data custodian is generally the person (or agency) who is responsible for the data and managing the data’s lifecycle. The custodian is generally NOT the owner of the data, although in the government context the custodian may be a person within the agency that does own the data.

Simplicable defines a data custodian as the person responsible for the technical elements of the data, such as security, availability, accuracy, backup and restore, and technical standards. Wikipedia also talks about custodians being responsible for the “technical environment”.

The DataVic Access Policy lists specific responsibilities and activities of the custodian:

  • “maintaining a data asset register

  • identifying and coordinating datasets to be made available

  • defining the appropriate data quality statements

  • uploading datasets to the data directory

  • managing dataset suggestions and feedback via the data directory

  • reporting the progress of making datasets available to the IMGC and DataVic Access Policy team.”

Government agencies often also hire third parties to help them with the custodian responsibilities, such as setting up and hosting a data platform. For example, Salsa Digital works with the Victorian Department of Premier and Cabinet on data.vic.gov.au.

In terms of data custodians, it should also be noted that the Productivity Commission's detailed report Data Availability and Use suggests appointing a National Data Custodian (NDC) to oversee all data use in Australia.

Data stewards

Data stewards are responsible for the datasets from a business perspective. According to Simplicable, a data steward ensures the data supports business and regulation requirements.

In reality, there seems to be a lot of crossover within the roles in Australia and often there is no official ‘steward’ in the data process.

Working together

For open data to work well, the custodians and owners (and stewards if applicable) need to work together to guide the process from initial data collection to maintaining the open data on the relevant registries (e.g. state and federal data portals).

At the government level, often the specific department/agency is the owner of the dataset, with one person in the department nominated as the accountable officer. They delegate responsibilities for decisions and tasks to custodians. Then a custodian manages the data (according to delegation from the accountable officer), uploads datasets, monitors and reports on data, implements changes and engages with users to determine needs.

Data quality

One of the key elements of opening data is to open it in a way that makes the data easy to use, i.e. data should be open and usable. For example, while information that’s publicly available via PDFs is officially ‘open’ (because it’s publicly available), it’s not in a very usable form. Ideally, data should be machine-readable and available through Application Programming Interfaces (APIs). In fact, this is one of the recommendations in the Australian Government Public Data Policy Statement.

Data should also be valid, accurate, complete, consistent, current and uniform.
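To make a few of these quality dimensions concrete, here’s a minimal Python sketch of how a custodian might count problem records before publishing. The records, field names and the one-year ‘stale’ threshold are all hypothetical and chosen purely for illustration; real checks would depend on the dataset and its data quality statement.

```python
from datetime import date

# Hypothetical records; field names are illustrative, not from any real dataset.
RECORDS = [
    {"id": 1, "suburb": "Carlton", "crashes": 12, "recorded": date(2018, 5, 1)},
    {"id": 2, "suburb": "", "crashes": 7, "recorded": date(2018, 5, 1)},
    {"id": 3, "suburb": "Fitzroy", "crashes": -4, "recorded": date(2018, 5, 1)},
    {"id": 3, "suburb": "Fitzroy", "crashes": 9, "recorded": date(2016, 1, 1)},
]


def quality_report(records, today, max_age_days=365):
    """Count records that fail a few simple quality checks."""
    seen_ids = set()
    report = {"incomplete": 0, "invalid": 0, "duplicate_id": 0, "stale": 0}
    for rec in records:
        # Complete: every field has a non-empty value.
        if any(value in ("", None) for value in rec.values()):
            report["incomplete"] += 1
        # Valid: a count cannot be negative.
        if rec["crashes"] < 0:
            report["invalid"] += 1
        # Consistent: each id should appear only once.
        if rec["id"] in seen_ids:
            report["duplicate_id"] += 1
        seen_ids.add(rec["id"])
        # Current: the record is no older than max_age_days (assumed threshold).
        if (today - rec["recorded"]).days > max_age_days:
            report["stale"] += 1
    return report


print(quality_report(RECORDS, today=date(2018, 6, 13)))
```

A report like this doesn’t fix anything by itself, but it gives owners and custodians a shared picture of where a dataset falls short.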

FAQs

Who cleanses the data?

Data cleansing refers to the process of finding and removing data errors and ensuring the final dataset is of a high quality. Ensuring ‘clean’ data is generally the responsibility of the data custodian, although this process is often outsourced to an external vendor. We’ve done a lot of data cleansing at Salsa!
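As a rough illustration of what cleansing can involve, the Python sketch below trims stray whitespace, normalises capitalisation, and drops rows that are blank or duplicated. The `cleanse` function and the sample rows are hypothetical; real cleansing rules vary by dataset and are usually agreed with the data owner.

```python
def cleanse(rows):
    """Return a cleaned copy of rows: trimmed text, no blanks, no duplicates."""
    cleaned = []
    seen = set()
    for row in rows:
        # Trim stray whitespace and normalise capitalisation of text values.
        tidy = {
            key: value.strip().title() if isinstance(value, str) else value
            for key, value in row.items()
        }
        # Skip rows with any missing value.
        if any(value in ("", None) for value in tidy.values()):
            continue
        # Skip exact duplicates of a row that has already been kept.
        fingerprint = tuple(sorted(tidy.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(tidy)
    return cleaned


# Hypothetical sample rows, for illustration only.
raw_rows = [
    {"suburb": " carlton ", "rainfall_mm": 3},
    {"suburb": "Carlton", "rainfall_mm": 3},
    {"suburb": "", "rainfall_mm": 1},
]
print(cleanse(raw_rows))
```

Note that the function returns a new list rather than changing the input, so the raw data stays available for auditing.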

Who’s responsible for collecting the data?

The data owner is generally responsible for collecting data. For example, in the case of the Bureau of Meteorology, they would collect and own weather data such as daily temperatures, rainfall, etc.

Who’s responsible for data accuracy?

Data accuracy can be improved throughout the data lifecycle. Better collection processes will ensure the raw data is as accurate as possible. This is the responsibility of the data owner. However, data custodians are also responsible for data accuracy, given they’re the ones who look after the technical side of things.

Who’s responsible for data currency?

Generally the custodian is responsible for seeing the dataset through its lifecycle after it’s created/collected, including ensuring open/public datasets are the most current ones available.

Who’s responsible for data frequency (updates)?

Deciding how often data is updated can fall to the data owner, based on when they’ll collect the data. Alternatively, the data steward may dictate frequency if any legislative or policy requirements apply to the dataset. For example, the data owner may decide to update the data every two years, but if policy or legislation dictates that data is updated yearly, the steward needs to work with the data owner to ensure this is done.

Who’s responsible for data security, hosting, auditability, etc.?

Data security, hosting, auditability and other technical requirements fall under the jurisdiction of the data custodian.
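For teams that want to record these answers somewhere more structured than a document, here’s one possible way to capture them as a simple Python lookup. The task keys and role labels are our own shorthand for the FAQ answers above, not terms from any policy, and different organisations may split the roles differently.

```python
# Summary of the FAQ answers above; keys and role names are illustrative labels.
RESPONSIBILITIES = {
    "cleansing": ["custodian"],
    "collection": ["owner"],
    "accuracy": ["owner", "custodian"],
    "currency": ["custodian"],
    "update_frequency": ["owner", "steward"],
    "security_hosting_auditability": ["custodian"],
}


def tasks_for(role):
    """List the tasks a given role is responsible for, in alphabetical order."""
    return sorted(task for task, roles in RESPONSIBILITIES.items() if role in roles)


print(tasks_for("owner"))
```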
