Getting the open data you need for good Neighbourhood Planning

Neighbourhood plans are a crucial part of the UK’s planning infrastructure, allowing people to have a serious say in the development of their own area. People in Bramcote decided to take advantage of this – the move to do a neighbourhood plan was driven by a desire to preserve the green belt in the area.

They decided to work on Bramcote ward – a political ward – for simplicity’s sake.

Getting open data for neighbourhood plans

Judith’s first step in building the maps needed for the plan was working out what was already there. She sought open data that showed what existed within the ward, from walks to infrastructure to the areas of green belt. Local wildlife sites were easily defined – the shapes were downloaded from data.gov.uk – but some local sites weren’t there. They were found on Nottingham Insight mapping, but weren’t downloadable. A printout isn’t much use for GIS work – and the data wasn’t released for anything but personal use. The data owners wouldn’t grant permission, either.

Green belt boundaries have been published, so they could see how they’ve been changed. But the shapefiles from the planning consultation weren’t available for use.

Why the copyright restrictions on public data?

Where did the constraints come from? Possibly a local supplier added copyright to information extracted from Ordnance Survey data. And OSNI does enforce copyright on anything that uses its data. There’s the option to buy the data, of course, but that’s expensive.

In the end, self-generating maps is one solution. The other is that you can talk programmatically to the services underlying the OS data. Of course, just because the data is available that way doesn’t mean it’s necessarily open data.
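To make “talking programmatically to the services” concrete, here’s a minimal sketch of building a request against a Web Feature Service (WFS) – the OGC standard many mapping services sit on. The endpoint and layer name below are entirely hypothetical; check your supplier’s documentation for the real ones.

```python
# A hedged sketch of querying a Web Feature Service (WFS), the OGC standard
# many mapping services sit on. The endpoint and layer name are hypothetical.
from urllib.parse import urlencode

def wfs_getfeature_url(endpoint: str, layer: str, bbox: str) -> str:
    """Build a WFS 2.0 GetFeature URL for one layer within a bounding box."""
    params = urlencode({
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": layer,                  # the layer (e.g. green belt polygons)
        "bbox": bbox,                        # minx,miny,maxx,maxy in the layer's CRS
        "outputFormat": "application/json",  # GeoJSON, if the server offers it
    })
    return f"{endpoint}?{params}"

url = wfs_getfeature_url(
    "https://example.gov.uk/geoserver/wfs",   # hypothetical endpoint
    "planning:green_belt",                    # hypothetical layer name
    "440000,330000,450000,340000",
)
print(url)
```

The response (if the server allows it) would be GeoJSON you can drop straight into QGIS – but, as noted above, being able to fetch it doesn’t make it open data.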

It’s the copy & paste paradox. Just because you can copy it doesn’t mean you can paste it.

The general conclusion was that the restrictions were almost certainly more cultural than anything else. But how do you deal with that? Campaigning – people need to be aware that these things should be available. And, many times, it means realising that releasing the data is nowhere near central to the officer’s job or priorities.

Solving the problem in a neighbourly way

How about co-operation with other neighbourhoods creating their plans? That brings a weight of political pressure to bear. Another way to exert pressure is to put a request in, mentioning that you’ll use a Freedom of Information request to get it instead. That tends to shake it loose.

Useful tip: there’s an active QGIS user group working in neighbourhood planning. That’s a useful network of expertise and experience.

Turning the question on its head: are they releasing their neighbourhood plan polygons? The local planning officer would probably leap at that, and you’re setting things up for an exchange.

Open Street Map? It was tried, but there were problems registering it.
Ordnance Survey’s GridInQuest software could be useful for dealing with that.

Session Notes

Belfast’s Low Power Wide Area Network: how to use it?

Led by: Mark McCann: smart technology team, Belfast council.

Background: There was a competition for a low power wide area network (LPWAN) outside London, which already has one.

A consortium led by Ulster University won the competition and will pay for an LPWAN that can be used by universities and companies for research. Councils have provided pots of money to address challenges in the city that an LPWAN might address.

NB: A low power network can be used for small amounts of data, intermittently. 4G connects all the time, but that uses up power very quickly. Low power networks allow sensors, for example, to transmit small amounts of data at set times, so they retain their power for much longer.
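A back-of-envelope sketch shows why this matters. All the numbers below are illustrative assumptions, not measurements of any real device:

```python
# A back-of-envelope sketch of why intermittent, low-power transmission
# matters. All numbers are illustrative assumptions, not measurements.
def battery_life_days(capacity_mah: float, sleep_ma: float,
                      tx_ma: float, tx_seconds_per_day: float) -> float:
    """Estimate battery life in days from average current draw."""
    tx_hours = tx_seconds_per_day / 3600.0
    sleep_hours = 24.0 - tx_hours
    mah_per_day = tx_ma * tx_hours + sleep_ma * sleep_hours
    return capacity_mah / mah_per_day

# A 2400 mAh cell, sleeping at 0.01 mA, transmitting at 40 mA:
always_on = battery_life_days(2400, sleep_ma=0.01, tx_ma=40,
                              tx_seconds_per_day=86400)   # radio on all day
bursty = battery_life_days(2400, sleep_ma=0.01, tx_ma=40,
                           tx_seconds_per_day=10)         # 10 s of bursts a day

print(f"always on: {always_on:.1f} days; bursty: {bursty:.0f} days")
```

On these (made-up) figures, a sensor that’s always transmitting dies in a couple of days; one that sends a few short bursts a day lasts for years.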


Care and feeding of the LP-WAN

Questions: What could the network be used for in the city? And what could the data generated from such projects be used for, taking account of privacy issues and the other datasets that might be put alongside it?

An example (from a session participant): Dublin wanted to know about pollution at a very granular level. They decided to deploy sensors to pick up on that, so they could make changes, to the bus lanes or whatever.

Ideas (Mark McCann): There are three use cases we have in mind. One is transport: if we could deploy sensors on bicycles, or pedestrians, then we could find out where people go. Another is tourism: we know people come to Belfast and they leave, but we don’t know much about where they go.

And the other idea is logistics: lorries, and the supply chain for retail. Belfast has a sea-port that is a hub for the rest of the country, so we could deploy sensors to find out where things are going. But we would like to run a citizen challenge, to open up the data to people to run projects.

Getting practical

Practical issues (participants): In one area that tried this, companies refused to host the sensors. Planning permission can also be a problem: even though the council is behind this, it may need planning permission.
Another city wanted to put sensors on street lights, but then discovered it no longer owned the lights, so it had to put up poles. Also, you have to be able to maintain and calibrate sensors. What do you do with low quality data?

Privacy: With data that involves people, there is also a real issue of privacy: with the General Data Protection Regulation coming in, you have to be aware of the basis for collecting information, and to think about how you are going to be able to release it. There is guidance on IoT on the ICO website.

Dealing with results (participants). Cities like Cambridge that have run bike projects have found that tensions can arise when the data reveals how bikes are used. Are they used by visitors or residents? That can change the contracting basis on which facilities are provided, so you need to be ready for challenge on that.

Similarly, traffic and pollution sensors often reveal problems around areas like schools, as people drop children off. How are you going to handle that?

Getting the right projects (participants): You need to get companies, but also arts groups and populations involved. Sometimes, cities do data and IoT projects, and it seems great, but nobody actually uses it. Sensors are very, very cheap. You can go to communities to find out what they want.

Changing policy: Cities that have put sensors on libraries have found that people go in for the day, and that’s because they are looking for community – they don’t want to borrow a book. So you have to be ready for projects like this to change things.
 

Open Data Horror Stories: 2017 Edition

There’s a tendency to focus on personal data as the major risk of open data. But there’s more to it than that.

Open Data: The Horror – by Drawnalism's Matthew Buck

ODI Devon has made a policy of holding its meetings around the county. This avoids everything becoming Exeter-centric, but there is a cost to hiring the meeting rooms, and as they publish their spending as open data, it’s led to some criticism.

There’s lots of work going on around databases of GPs. That could be used for ranking GPs on a simple scale. That could be too simplistic. And there’s not really consumer choice in GPs – so how useful would that be? Could you end up with property price issues, as you do with schools?

Fun fact: there’s no such thing as school catchments – there are only derived areas when a school is over-subscribed…

Trafford has a selective education system, with an exam splitting pupils between grammar and high schools. The net result? The grammars are full of children whose parents can afford tutors. So, people started looking at the ward by ward data, to move the discussion beyond anecdote, through use of a visualisation people could explore. The Labour councillors could see that their wards were being discriminated against in favour of people from outside Trafford – but then nothing really happened.

Data does not come with intent. But it can then enable dynamics which lead to inequality or gaming the system. Is it right, ethically, to withhold the data because of that? The instinct seems to be “no” – but the system needs to be looked at.

Personal data problems

If we cock up and release personal data – that’s on us. It’s not the fault of the open data system. It’s good that people examine how we spend money – because it’s their money! But it should be a dialogue, not a broadcast – let people come back and discuss what they find in the data.

Does open data make accidental personal data releases more likely?

Well, possibly, if you put deadline and statutory pressure on people, without the resources and expertise to do it well.

Matching data sets is one concern: you can de-anonymise data by matching sets together. It’s very complex to deal with. You don’t only have to think about your own data, but also be aware of what else is out there. That’s challenging. Pollution data is very easily connected with location and individual farms, for example. The converse risk is aggregating it upwards until it becomes meaningless.
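The farm example can be made concrete with a toy sketch: neither dataset below names the polluter on its own, but a join on location does. All the data here is invented:

```python
# A toy sketch of the matching risk: neither dataset alone identifies a
# farm's owner, but joining on location does. All data invented.
pollution = [  # "anonymised" pollution readings
    {"grid_ref": "SK5437", "nitrate_mg_l": 62},
    {"grid_ref": "SK6012", "nitrate_mg_l": 8},
]
land_registry = [  # separately published ownership data
    {"grid_ref": "SK5437", "owner": "A. Farmer"},
]

# Index ownership by location, then join the two sets together.
by_location = {row["grid_ref"]: row["owner"] for row in land_registry}
reidentified = [
    {**reading, "owner": by_location[reading["grid_ref"]]}
    for reading in pollution
    if reading["grid_ref"] in by_location
]
print(reidentified)
```

The point is that the publisher of either dataset, looking only at their own release, would see nothing personal in it.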

There’s also the risk of releasing data that harms people economically.

Analysing the extent of risk

Astrophysics is rarely front-page news. Medical research is. Medical researchers can’t self-publish; in physics, you can. Open data needs something similar – a sense of the potential damage a dataset can do. For some it will be negligible; for some it will be serious.

There are two dimensions worth considering:

  • Likelihood of risk, from unlikely to almost certain
  • Severity of risk, from minor boo boo to full-scale zombie outbreak
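Those two dimensions suggest a simple scoring matrix. Here’s a hedged sketch of what such a triage step might look like – the scales, scores and release threshold are all illustrative assumptions, not anyone’s actual process:

```python
# A hedged sketch of scoring a dataset on likelihood x severity.
# Scales, scores and the release threshold are illustrative assumptions.
LIKELIHOOD = {"unlikely": 1, "possible": 2, "likely": 3, "almost certain": 4}
SEVERITY = {"minor boo boo": 1, "embarrassing": 2, "harmful": 3,
            "full-scale zombie outbreak": 4}

def risk_score(likelihood: str, severity: str) -> int:
    """Combine the two dimensions into a single score."""
    return LIKELIHOOD[likelihood] * SEVERITY[severity]

def release_decision(likelihood: str, severity: str, threshold: int = 6) -> str:
    """Release low-scoring datasets; send the rest for expert review."""
    score = risk_score(likelihood, severity)
    return "release" if score < threshold else "review before release"

print(release_decision("unlikely", "minor boo boo"))  # low-risk dataset
print(release_decision("likely", "harmful"))          # needs expert review
```

As the discussion noted, any such process still depends on having impartial experts doing the scoring.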

In some organisations, no data is released until it’s been analysed through that process. However, that assumes you have experts with the knowledge to do it well. You also have issues of impartiality – reputational risk shouldn’t be a factor, but it will be for some organisations. Innate bias, whether political, racial or sexual, could influence the person making the decisions or scoring.

How do you balance this against the opportunity cost of NOT releasing the data?

There are a small number of high-walled reservoirs that are at high risk of catastrophic damage if they fail. The government won’t reveal which they are, because they could become terrorist targets – but equally, the people who live in the areas at risk have no idea and can’t prepare.

Session Notes

Can Open Data help Northern Ireland bring down its interfaces?

The interface team in Northern Ireland is tasked with dealing with the peace walls – interfaces – which separate Protestant and Catholic areas of Belfast and elsewhere, and which are due to come down by 2023. The programme has Twitter and Facebook accounts to increase engagement with the individuals and communities concerned.

Cupar Way is the largest of the interface structures.

In order to get them down, the government has committed to removing them only with the consent of the communities involved – but actually reaching this point presents significant challenges. And some of these areas are the most deprived in Northern Ireland.

The data accuracy problem

They have some data, but it’s not open yet. They’re developing mapping data, and have existing data on crime, health, bonfires and so on. Could there be an open data platform to bring this all together? There are some data sharing agreements with the various sources of data – and there are some problems surfacing because in some places the data sources aren’t accurate (or detailed) enough. That needs to be solved before it’s opened, because of the sensitivities involved.

It’s clearly very important to get this right. They need the best possible information before they can decide whether the walls are safe to come down.

Academia must have useful data for this process. Is there some? How would they get hold of it?

How can they ensure that the general public engage with the data? A portal would be ideal – but they’re a long way from that. There’s a lack of technical expertise in the team, but there’s a lot of interest that needs transforming into resources and actual help. They’re more than keen to add new people to the “dream team” behind it.

The definition problem

There’s some contention about the number and length of interfaces out there. How do you define communities for consultation purposes? Residents? Businesses? Churches?

Once you’ve done that, how do you consult?

Academics have been doing some interesting work mapping religion and communities around the walls – some communities live right up to them; some don’t, as the nearby houses are now gone.

There’s some debate about what is an interface or not. The DoJ is responsible for 59 structures, and they have been reduced to 49 to date. There is a physical map of interfaces – but it’s not owned by them. They have their own data – which they would like to publish. Something as simple as GPS co-ordinates linked to walls could serve. Postcodes are not open in Northern Ireland, which doesn’t help.

A need for more informed consultation

The Interface team engages with people day to day in a grassroots manner. But they’d like more data on service duplication, travel time increases and so on, that could help persuade communities that they’d be better off without the interfaces. It will help them understand the benefits and impact.

Current responses from these communities are genuinely mixed. They’ve been building up their engagement over the last year, but they’re still not reaching the local residents enough. There are issues of power and control over the communities to deal with.

One attendee pointed out that the data shows many communities on either side of an interface are identical in terms of economic, health and crime data. The only difference is religion. Can that data be used to help reconcile people?

There is trans-generational trauma at work in some communities, which makes even testing the opening of doors in an interface problematic. They can’t just go in with sledgehammers – you need to bring the communities along with the idea. Tech’s A/B testing doesn’t normally lead to petrol bombs…

In summary

They need assistance. Anyone who can get the data portal idea to move forwards, or who has ideas should get in contact.

The EveryPolitician project

Open Data Camp 5 is taking place in the Computer Science Department of Queen’s University. It’s a modern institution that wants to make sure its students are ready for work.

So there are rooms that are carpeted with artificial turf, filled with trees, and furnished with garden benches. Of course there are.

The second session of the morning gathered in the garden room to discuss the EveryPolitician project [everypolitician.org], a bid to collect information about every politician in the world, anywhere.

Not an easy job…

The EveryPolitician site says that it has information on 74,939 politicians from 233 countries – and counting. However, Tony Bowden from mySociety, who started the project with Lucy Chambers, explained that this has been very difficult to collect.

The project started by “running a lot of scrapers on a lot of sites” but, because of licensing issues, “we couldn’t quite tell anybody they could use it” and “we weren’t sure it was sustainable.”

So, the project is moving to Wikidata, in the hope that this will become “the canonical source of information about politicians”.

Why Wikidata?

Bowden explained that Wikidata is connected to Wikipedia. There is no single Wikipedia; there is one for every language. So, Wikidata collects information from all the different pages, in a structured or semi-structured form, because otherwise they get out of sync with each other.

On the political front, for instance, Bowden said that if there is an election in an African country, local people will update that quite quickly, but the page in Welsh might not be updated for some time. The idea of Wikidata is to keep them aligned.
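For the curious, Wikidata exposes all of this through a public SPARQL endpoint. Here’s a hedged sketch – not EveryPolitician’s actual code – of what asking it for sitting members of a legislature can look like. The item ID for the position is an assumption; check Wikidata before relying on it:

```python
# A hedged sketch (not EveryPolitician's actual code) of querying Wikidata's
# public SPARQL endpoint for sitting members of a legislature.
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# Q16707842 is assumed here to be "Member of the Parliament of the United
# Kingdom" - verify the item ID on Wikidata before relying on it.
QUERY = """
SELECT ?person ?personLabel WHERE {
  ?person p:P39 ?position .                      # P39: position held
  ?position ps:P39 wd:Q16707842 .                # the position is UK MP
  FILTER NOT EXISTS { ?position pq:P582 ?end }   # no end date = still sitting
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
"""

def build_request(query: str) -> urllib.request.Request:
    """Build (but do not send) the HTTP request for a SPARQL query."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return urllib.request.Request(
        f"{ENDPOINT}?{params}",
        headers={"User-Agent": "odcamp-liveblog-example/0.1"},
    )

def fetch_members() -> list[str]:
    """Send the query and return the politicians' labels."""
    with urllib.request.urlopen(build_request(QUERY)) as resp:
        data = json.load(resp)
    return [row["personLabel"]["value"] for row in data["results"]["bindings"]]
```

The same pattern – qualifiers like start and end dates hanging off a “position held” statement – is what lets Wikidata capture the changes official sites often miss.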

Still, collecting information “in a structured or semi-structured form” is not as easy as it sounds. For example, a session participant asked how EveryPolitician defined a political party, given that the idea will be fluid across different political systems.

Bowden acknowledged that, for EveryPolitician: “We came up with a simplified view of the world. We felt it was more important to have something good enough for every country to use, than to capture all the nuances.

“If you want to do comparative analysis across the world, you can’t start with a two-year analysis of systems. It’s OK to say there will be some kind of political grouping.”

An evolving project

Bowden added that he thought these kinds of issues would be resolved, as people started to use the data. “We think it will be something like OpenStreetMap,” he said. “So, initially, there will be some broad concepts, but as it goes along, there will be people who go along and do every tree in their area, and the nuance will start to come through.”

Another issue is that there may not be a ‘single source of truth’ about politicians in some countries. For example, the Electoral Office of Northern Ireland knows who has been elected to a seat, but may not log changes – for example, if someone stands down and somebody else is co-opted.

Or there might be official sites, but they might not be very good. Kenya only lists politicians with names starting with the initials A–M (and one person starting with P). Nigeria’s politician pages are up to date, but its index is three months behind.

Bowden said EveryPolitician is building tools so that individuals can scrape official sites, and then upload the information to Wikidata, and fill in gaps or correct errors.

What is interesting, he said, is that once a country gets a good list, with people committed to maintaining it, that information tends to be much better – and better used – than official sites.

“If there is an election, the Parliament site might not update until Parliament sits, but the Wikidata pages will update overnight,” Bowden explained. “Then journalists can mine that for stories. So they can instantly tell people things like: ‘the youngest ever MP for Wales has just been elected’.”

Find out more:

The EveryPolitician website has collected 2.8 million pieces of information so far. Tony Bowden has explained the move to Wikidata and its benefits in a blog post on the mySociety website. He also tweets about the EveryPolitician project.

Open Data for Newbies (2017 edition)

It’s OK to accept that bright, engaged people might not know what Open Data is. So, here’s a beginner’s guide for them, liveblogged at Open Data Camp 5 in Belfast.

What is open data?

It’s a data set that anyone can access, and which has a licence, and which has been published. Primarily, it’s in a machine-readable format.

Data, in this context, is anything! A photo can be open data. Generally, though, we’re talking about data that can be presented in rows and columns in a CSV (comma-separated values) file. It’s an open format akin to (and compatible with) Excel, but which isn’t dependent on owning Microsoft Office.

Data is the first stage – it’s just data, not yet information.
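Because CSV is an open format, you don’t need any particular product to work with it. A minimal sketch, using only the Python standard library (the file content here is invented for illustration):

```python
# A minimal sketch of reading open data published as CSV, using only the
# Python standard library. The file content is invented for illustration.
import csv
import io

# Stand-in for a downloaded file; csv.DictReader works the same on open().
published = io.StringIO(
    "site,designation,area_hectares\n"
    "Riverside Meadow,local wildlife site,45.2\n"
    "Mill Lane Woods,local wildlife site,12.7\n"
)

# Each row becomes a dict keyed by the header line.
rows = list(csv.DictReader(published))
print(rows[0]["site"], rows[0]["area_hectares"])
```

Note that everything comes back as text (“45.2” is a string) – turning data into information still takes work.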

Where is it published?

On a website – in a way that you can easily access, ideally without limitation or need to register. You shouldn’t have to pay.

There are various platforms and portals that make open data available.

What about the licences?

It should be published with an open data licence attached. The licence tells you what you can do with the data, and under what conditions (attribution, for example). The Open Government Licence (OGL) is one example. Creative Commons is another.

Data without a licence isn’t really open, because you don’t know how you can use it.

Technically, if you abuse the licence, you can be cut off from using the data – but that’s hard to enforce.

Do you need an ethics licence?

Open data should never be personal.

There’s a data spectrum: from closed, private data, through shared data (available to a subset of people), to public data (Twitter’s feed, for example), and finally open data.

Open Data is data that is free for anyone to access or share. Even if it is derived from personally-identifiable data, that data should be anonymised.

What is metadata?

Metadata is data about data. It’s information like the source of the data, or how it was collected. Sometimes the metadata is great, but sometimes it doesn’t exist. Metadata is where you can give your data context.

What is the ODI?

The Open Data Institute was founded five years ago as a charity to connect and inspire people around the world to use open data.

ODI Nodes are local groups of open data enthusiasts and advocates. They’re a bit like a franchise. There’s no trickle down funding, so the nodes have to raise their own funds and use volunteers.

What is an API?

An API is an application programming interface. It allows you to automate the extraction of data from a data source, via code. Basically, someone who owns data on a server has written some code that allows you to access that data. APIs are most interesting for realtime data, which is constantly changing. TfL publishes loads of realtime transport data about London via APIs. The CityMapper app uses that API.

In Bristol there are air quality monitors that report every 24 hours via an API.

It’s a way of automating updated data access.
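What an API typically hands back is structured JSON you can process in code. A sketch, with an invented payload loosely modelled on realtime arrivals feeds (not TfL’s actual schema):

```python
# A sketch of consuming an API response. The payload is invented, loosely
# modelled on realtime arrivals feeds - not TfL's actual schema.
import json

payload = json.loads("""
[
  {"line": "Victoria", "destination": "Brixton", "seconds_to_arrival": 90},
  {"line": "Victoria", "destination": "Walthamstow Central", "seconds_to_arrival": 240}
]
""")

# Pick the soonest arrival out of the feed.
next_train = min(payload, key=lambda a: a["seconds_to_arrival"])
print(f"{next_train['line']} to {next_train['destination']} in "
      f"{next_train['seconds_to_arrival'] // 60} min")
```

In a real app, the `json.loads` step would be fed by an HTTP request to the API instead of a literal string – the processing is the same.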

What is Linked Data?

Linked data is data that a computer can read, and that allows referencing of data. So, if you’re citing data in a paper, you can provide a hyperlink to the original data so people can check its provenance.

What are the five stars?

These were determined by Sir Tim Berners-Lee:

  1. Make it open
  2. Make it machine readable (tabular data in a spreadsheet, for example)
  3. Same as above, but in a non-proprietary format.
  4. Using a URI – a uniform resource identifier
  5. Linking your data to other data.

More info on 5 star open data. Generally speaking, three star is good enough.

What are registers?

These are definitive data sets. The Government Digital Service are building these for some key pieces of information such as the definitive list of countries in the world.

Do we have data standards?

A standard is everyone agreeing to do something in the same way. We don’t have a definitive list of standards for open data. They make things much easier – but are hard to agree and enforce. Standards make it much easier for machines to read data and connect different data sets. Humans make this harder by having preferences. There is a growing body of code snippets that allow data to be transformed into a preferred format, if it wasn’t supplied that way to start with.
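A typical example of such a snippet: normalising dates that humans have recorded in assorted formats into the ISO 8601 standard. The input formats here are illustrative:

```python
# A sketch of a transformation snippet: normalising dates recorded in
# assorted human-preferred formats into ISO 8601. Input formats illustrative.
from datetime import datetime

KNOWN_FORMATS = ["%d/%m/%Y", "%d %B %Y", "%Y-%m-%d"]

def to_iso(date_text: str) -> str:
    """Return the date as ISO 8601 (YYYY-MM-DD), or raise ValueError."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(date_text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next known format
    raise ValueError(f"unrecognised date: {date_text!r}")

print(to_iso("21/10/2017"), to_iso("21 October 2017"))
```

Once everything is in one agreed format, joining datasets from different publishers stops being a chore.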

There are standards bodies which think very hard about standards, agree them and publicise them. W3C, ISO and so on. It’s hard to enforce, but you can persuade.

Open Data GP Registers

Northern Ireland has always needed to keep registers of GPs and other health providers. Now, at least some people in its government and health and social care service are looking to release the GP register as open data.

A single list of GPs in Northern Ireland that is available in machine readable format.

Why? Session leader Steven Barry explained: “Lots of government departments have lots of service information, but it is often collected manually, so when somebody leaves it stops, or people do it differently.

“Working in the health service, our statistics guys were spending a lot of time presenting data instead of putting it out in a standard profile and letting the community do things with it.

“Simon Hamilton, our minister of finance, is a very young guy, and he was very interested in what is happening in Estonia. So he has been right behind this.”

Technical challenges

There are technical challenges with creating registers. Most obviously, how are they going to be populated? If there is a register of practices and a register of GPs, how are they going to be aligned?

And if there is more information, such as how big a population the practice serves, how is this going to be kept up to date?
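The alignment question can be made concrete with a toy sketch: a GP register and a practice register linked by a shared practice identifier, where a mismatch surfaces immediately. All data here is invented:

```python
# A toy sketch of the register-alignment problem: GPs and practices linked
# by a shared practice identifier. All data invented.
gps = [
    {"gp_id": "G001", "name": "Dr Adams", "practice_id": "P10"},
    {"gp_id": "G002", "name": "Dr Byrne", "practice_id": "P99"},  # unknown practice
]
practices = [
    {"practice_id": "P10", "practice": "Example Road Surgery", "list_size": 5400},
]

# Index the practice register, then flag GPs the other register can't place.
practice_index = {p["practice_id"]: p for p in practices}
orphans = [gp["gp_id"] for gp in gps if gp["practice_id"] not in practice_index]
print(orphans)   # GPs whose practice the other register doesn't know about
```

Checks like this are trivial once identifiers are shared and documented – which is exactly what participants found missing.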

One session participant said that in England and Wales, at least, there is no agreement on what a GP is. But at least six organisations hold lists of GPs, which have different categories of information within them.

Barry’s register tackles some of these issues. But having seen it, some session participants pointed out that it still has limitations. There are identifiers, for example, but no information about what they are or who assigns them or how they can be used.

And the register is only available in a limited number of formats. One participant pointed out, though, that the technical problems can be solved. What is needed is people who want to solve them. Projects need stakeholders and users.

Finding uses and users

So, with a register in place, what might people be able to do with it? “In England, NHS Digital has been allowing people to rate their GPs,” Barry said.

“So if you use Bing, you can search for a GP, and see practices by rating. I looked at the GPs in Paddington and they were all two stars. I thought that must be a bit demoralising for the GP.

“And what I wanted to know was things like how many people does the GP look after, are they male or female, when and where were they trained, and what services do they offer?”

UK governments all want to give people more choice of GP, so other people are going to want the same information. Even though finding a GP with an open list can be a whole new challenge…

And health and care services have been investing in chief clinical information officers; doctors and other health professionals with an interest in IT and data. So they might be advocates for this kind of development. There are champions for change out there, the session concluded. The trick will be finding and using them.

Open Data Camp 5: the pitches

A very warm (and windy) welcome from Belfast, where we’re just getting underway with Open Data Camp 5. The pitches are about to begin…

Pauline Roche is talking us through the process. There are a lot of unconference first-timers here. But, now they’re briefed, let the pitching begin.

  • Open Data Horror stories – to allow policy development – and risk assessment.

  • Are you the missing (open data) link? What’s your role – who depends on you, and whom do you depend on?
  • Volunteering as a data scientist for charity – what’s the supply chain of data?
  • Open registers – and connecting them to GPs
  • Every Politician – why WikiData is the future of it.
  • Open data impact: measurement, demand drivers, supporting bodies and standardisation.
  • What is the least effort to make our open data services better?
  • Dogfooding for revenue. It’s not just something we stick out, but something we use.
  • Open data for newbies
  • What does the Government Digital Service need to do to engage better?

  • Using open data for creating neighbourhood plans
  • The Government should be testing the value of open data by releasing some sets – would it work?
  • What does the very best open statistic look like?

  • Identifiers for organisations and people.
  • Public wifi data – what could people do with it?
  • What data set are you working on that isn’t getting enough coverage in the press?
  • Open data for fun and education

  • How can a data geek help you?
  • Different approaches to validation.
  • People who collect open data – crowdsourcing. (Open Streetmap, Open Plaques, Open Benches)
  • Open Data Maturity Framework – learn more or contribute
  • Efficiency gains in Government through open data.
  • What could you use LPWAN for? (Low power, wide area network – connecting Internet of Things devices to the cloud)
  • How do people want their local authority data delivered – and what data do you want?
  • Using open data to bring divided communities together.
  • Open data platform fight camp!

  • How could charities make more use of open data?
  • Bring open data into game/VR environments

And… we’re done.

What makes for a good API?

One of the first questions to come up on day two of Open Data Camp was “what is an API?” One of the last issues to be discussed was “what makes a good API?”

 

Participants were asked for examples of application programming interfaces that they actually liked. The official postcode release site got a thumbs up: “It was really clear how to use it and what I’d get, and I can trust that the data will come back in the same way each time.”
