Tag Archives: Open Data Camp

Making Open Data Camp matter – to local economies and more

What is the value to the local economy of open data – and of open data unconferences? The wider benefit of open data to local economies is harder to quantify. There’s no E=mc² equation for open data benefit yet.

So let’s talk about unconferences, and Open Data Camp in particular.

Local value of Open Data Camp

Some organisers have a sense that it stimulates the economy, but no sense of how to measure that. There’s local sponsorship, so sponsors are expecting some return on that investment. It might be an opportunity to meet potential customers, or to improve their operational intelligence.

Corporate social responsibility is one reason people sponsor: it’s a community benefit, but it also benefits companies to have a thriving open data ecosystem.

Escaping the gravity of the capital

Just NOT having it in London is a good thing. Holding events away from London can be an incentive.

There’s a distinction between the benefit of open data to the city and the value of an open data conference in the city. There’s a clear basic financial benefit to the city in terms of hotel rooms, food and entertainment from events, as long as people are prepared to travel to attend the event. One event succeeding in an area gives others the confidence to go ahead.

Travel is not just about money, but also about time to travel there.

Buy-in from the host city makes a big difference. The city saying “no” to some investment in an event can kill it. There needs to be some vision of the benefit for the city, so you can sell it to them.

Unconferences can be woolly as to what the benefit is. Open Data Camp is deliberately avoiding London, both because too many events happen in London, and because people can be resistant to it. You get local character and flavour, and you get people who might never come to an event if it wasn’t near to them. And they still get national organisations coming – because they don’t get out of London much, and get to make connections with local projects.

Connections make benefits

Those connections can turn into valuable projects. You’re not just connecting geographies, but different forms of organisation. People who don’t do data can start to see it in a physical way – to understand the data that describes the city they can see around them.

Can we improve the outcomes by theming the event? Or would that corrupt unconferences? People tend to take advantage of the location to discuss local issues – like the interface/divided communities session at this event. And that can be very valuable – giving people insights into unexpected uses of data.

The Queen’s University Belfast computing department is often empty at the weekend. Why not use the space for events like this? Let people come in and find out new things. Being physically in different places gives opportunities to explore new technologies, like iBeacons or VR tech.

Look at the data you have, the data you can get, and the technologies that are coming along – and then you need the space to think about how to combine them. Ideas start at those sorts of meetings – and we need those case studies.

Catalysing other fields

Bring in other kinds of people – English Lit students could find open data techniques useful in extracting what they need from books. You can avoid massive wastes of time and effort by bringing people together in a way that allows them to realise what they can offer to each other.

At the moment, Open Data Camps are open data people talking to open data people. Could we have a Friday where we open up our experts to other people? That means we could say we advised start-ups and students, and contributed to the economy.

Pre-activation of people – letting them plan for hacker spaces, or offering open data surgeries would be possible ideas.

We’re trying to capture the sessions via Drawnalism, and we’re putting that on the blog. But should we be pushing onwards with it, telling case studies and stories around events or projects that spin out of Open Data Camp sessions and meetings?

But what about the wider benefit of open data to local economies? There’s no E=mc² equation for open data benefit yet.

Session Notes

Whose data is it anyway?

The question of who data belongs to, and whether individuals can have a say in what happens to their data, tends to come up very quickly in some areas. Health, for example.

But there is a concern that the whole issue of data collection and use could become much more fraught with the arrival of the General Data Protection Regulation. This is an EU regulation that is being incorporated into UK law at the moment, via the Data Protection Bill.

The GDPR will require organisations to think about the impact of projects on data privacy at an early stage and to appoint a data protection officer. It will introduce large fines for data breaches, tighten up rules on consent, and introduce some new rights, including a right to be forgotten.

The session heard that this last right, introduced following a court case involving Google, could have a big impact on open data sets, because if people remove themselves from datasets, those datasets become less complete.

As the session leader, Kim Moylan, said: “What happens if people pluck themselves out of data? Do we leave a blank line, or just take what is there?”

“You will have to remove information that is no longer relevant. But what is no longer relevant in a medical record? What about the census? That is a big open data set, if people remove themselves from the census, then what do we do then?”

Aggregate data

One participant felt the public debate needed to be recast. Instead of talking about ownership, he said, the discussion should be about legal rights and restrictions. “If I walk through Belfast, I know I will be filmed, but there are legal safeguards on that. Talking about ownership just confuses the issue.”

However, the GDPR is coming in. And the general feeling in the session was that if it is going to cause problems for the open data movement, they will arise at the aggregation stage.

As various participants pointed out, when data is released as open data it is anonymised; the open data movement doesn’t deal in identifiable patient data, so it doesn’t need explicit consent to use the data it uses.

However, if people decide to opt-out of a particular data collection, or ask to be ‘forgotten’ and removed from a collection that has taken place, then that will affect the size, and potentially quality, of the dataset being released.

In which case, the big question is how many people will opt out or exercise their right to be forgotten. On this, opinions were divided. One participant pointed out that people already have rights to opt out of their medical data being used in shared care records and some data collections, and hardly anyone uses them.

The Caldicott Review of information governance and security in the NHS will give people new opt-out rights; but there is no reason to think a lot of people will use them.

Practical problems

Still, the practical implications could be hard to deal with. One participant, who uses surveys to collect information, asked whether someone who came back and said they no longer wanted their data to be included could ask for it to be removed at every level: the original survey, the aggregation, and the anonymised release.

The answer seems to be yes. “But philosophically, I have a real problem with this, because a policy decision might have been made on that data, and now it has changed.”

Also, surveys might need to be larger in the future, to make sure they would still be statistically valid if a predictable number of people removed their data later on.
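As a rough illustration of that last point, here is a minimal Python sketch of the arithmetic. It assumes a flat, predictable removal rate, which the session did not quantify; the function name and the figures are hypothetical.

```python
import math


def inflated_sample_size(required_responses: int, expected_removal_rate: float) -> int:
    """Return how many responses to collect so that, after an assumed
    share of respondents later ask to be removed, at least
    `required_responses` are still left.
    """
    if not 0 <= expected_removal_rate < 1:
        raise ValueError("expected_removal_rate must be in the range [0, 1)")
    # If a fraction r is removed, n collected responses leave n * (1 - r).
    # Solve n * (1 - r) >= required for n, rounding up to a whole person.
    return math.ceil(required_responses / (1 - expected_removal_rate))


# Hypothetical figures: 1,000 responses needed, 5% assumed to withdraw later.
print(inflated_sample_size(1000, 0.05))
```

The real difficulty is the one raised in the session: nobody yet knows what removal rate to plug in.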

Overall, though, the session was positive. It remains the case that most instances in which data is generated and used are covered by well-established legislation that will not be affected by the GDPR.

The Data Protection Bill builds on existing data protection legislation, which is reasonably well understood. Even the right to be forgotten is already two years old.

GDPR – good news?

Indeed, there is an argument that more debate about data protection, and more awareness of the new rules, can only be a good thing, because it will build public trust.

One participant said: “These conversations are happening more and more. We have privacy groups bringing cases. Privacy notices will have to be much more transparent. But I’m quite optimistic. I think once people understand their rights they will actually be more comfortable with uses of their data. The impact will be on companies that are not doing very well at the moment.”

 

The EveryPolitician project

Open Data Camp 5 is taking place in the Computer Science Department of Queen’s University. It’s a modern institution that wants to make sure its students are ready for work.

So there are rooms that are carpeted with artificial turf, filled with trees, and furnished with garden benches. Of course there are.

The second session of the morning gathered in the garden room to discuss the EveryPolitician project [everypolitician.org], a bid to collect information about every politician in the world, anywhere.

Not an easy job…

The EveryPolitician site says that it has information on 74,939 politicians from 233 countries – and counting. However, Tony Bowden from mySociety, who started the project with Lucy Chambers, explained that this has been very difficult to collect.

The project started by “running a lot of scrapers on a lot of sites” but, because of licensing issues, “we couldn’t quite tell anybody they could use it” and “we weren’t sure it was sustainable.”

So, the project is moving to Wikidata, in the hope that this will become “the canonical source of information about politicians”.

Why Wikidata?

Bowden explained that Wikidata is connected to Wikipedia. There is no single Wikipedia; there is one for every language. So, Wikidata collects information from all the different pages, in a structured or semi-structured form, because otherwise they get out of sync with each other.

On the political front, for instance, Bowden said that if there is an election in an African country, local people will update that quite quickly, but the page in Welsh might not be updated for some time. The idea of Wikidata is to keep them aligned.

Still, collecting information “in a structured or semi-structured form” is not as easy as it sounds. For example, a session participant asked how EveryPolitician defined a political party, given that the idea will be fluid across different political systems.

Bowden acknowledged that, for EveryPolitician: “We came up with a simplified view of the world. We felt it was more important to have something good enough for every country to use, than to capture all the nuances.

“If you want to do comparative analysis across the world, you can’t start with a two-year analysis of systems. It’s ok to say there will be some kind of political grouping.”
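To make the trade-off concrete, here is a minimal Python sketch of what a deliberately simplified record could look like. It is not EveryPolitician’s or Wikidata’s actual schema: the `Politician` class, its fields, and the sample names are all hypothetical, chosen only to show how a single generic “grouping” field lets you compare countries without modelling each political system.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Politician:
    """Hypothetical simplified record, not a real schema."""
    name: str
    country: str
    grouping: str  # party, bloc, faction or "Independent", whatever fits locally


def members_by_grouping(people: list[Politician]) -> dict[str, Counter]:
    """Count members per grouping within each country."""
    totals: dict[str, Counter] = {}
    for person in people:
        totals.setdefault(person.country, Counter())[person.grouping] += 1
    return totals


# Made-up sample data for illustration only.
sample = [
    Politician("Person One", "Country A", "Group X"),
    Politician("Person Two", "Country A", "Group X"),
    Politician("Person Three", "Country A", "Independent"),
    Politician("Person Four", "Country B", "Group Y"),
]
totals = members_by_grouping(sample)
```

The nuance Bowden mentions, such as coalitions or changes of allegiance, is exactly what a flat field like this leaves out.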

An evolving project

Bowden added that he thought these kinds of issues would be resolved, as people started to use the data. “We think it will be something like OpenStreetMap,” he said. “So, initially, there will be some broad concepts, but as it goes along, there will be people who go along and do every tree in their area, and the nuance will start to come through.”

Another issue is that there may not be a ‘single source of truth’ about politicians in some countries. For example, the Electoral Office of Northern Ireland knows who has been elected to a seat, but may not log changes – for example, if someone stands down and somebody else is co-opted.

Or there might be official sites, but they might not be very good. Kenya only lists politicians with names starting with the letters A to M (and one person starting with P). Nigeria’s politician pages are up to date, but its index is three months behind.

Bowden said EveryPolitician is building tools so that individuals can scrape official sites, and then upload the information to Wikidata, and fill in gaps or correct errors.

What is interesting, he said, is that once a country gets a good list, with people committed to maintaining it, that information tends to be much better – and better used – than official sites.

“If there is an election, the Parliament site might not update until Parliament sits, but the Wikidata pages will update overnight,” Bowden explained. “Then journalists can mine that for stories. So they can instantly tell people things like: ‘the youngest ever MP for Wales has just been elected’.”

Find out more:

The EveryPolitician website has collected 2.8 million pieces of information so far. Tony Bowden has explained the move to Wikidata and its benefits in a blog post on the mySociety website. He also tweets about the EveryPolitician project.

Announcing Open Data Camp 5

We are delighted to announce that Open Data Camp is returning once again. Open Data Camp 5 will be the weekend of 21/22 October at Queen’s University Belfast, in the Computer Science building.

The Computer Science building at Queen’s University

We are really grateful to Queen’s University, and the School of Electronics, Electrical Engineering and Computer Science in particular, for letting us use their magnificent Computer Science building, and to Suzanne and Cormac from OpenDataNI for making such a convincing case for Belfast to host our next event.

In case you’ve no idea what Open Data Camp is, here’s a quick recap:

Open

‘Open’ means that data has been made available with little or no restriction on its use, as set out in a licence.

Data

‘Data’ refers to text, words, numbers, images, sound, video and so on. (Hang on, what’s the difference between data and information? See this useful explanation.)

Camp

‘Camp’ is a term commonly used to refer to an ‘unconference’, which basically means it’s an event with no predefined agenda – instead, attendees ‘pitch’ session ideas to each other.

“Open data is data that anyone can access, use and share.”

More info to follow

We will let you have lots more information in the coming weeks, which will of course include details of ticketing, travel and accommodation.

Photo Credit

Cormac McConaghy

What makes for a good API?

One of the first questions to come up on day two of Open Data Camp was “what is an API?” One of the last issues to be discussed was “what makes a good API?”

 

Participants were asked for examples of application programming interfaces that they actually liked. The official postcode release site got a thumbs up: “It was really clear how to use it and what I’d get, and I can trust that the data will come back in the same way each time.”

A tale of two datasets

Controversially, Gavin Freeguard, head of data and transparency at the Institute for Government, was allowed a PowerPoint presentation at Open Data Camp 4. However, it was in a good cause.

 

His slides enabled him to give some concrete examples of the data in the Whitehall Monitoring Project, which he runs. The project monitors the shape and size of government, the morale of civil servants, and other factors.

What is data, open data… and what on earth is an API?

Day two of Open Data Camp in Cardiff opened with another session on the basics. What is open data, who can use it and what is it useful for?

More open data for newbies

Also, going back a step: “What is data?” Session participants suggested that while the public or ‘newbies’ might equate data with statistics, ‘data’ was much broader than that. It might be the raw data – or numbers – on which the stats were based. But it might also be text, or photographs.

Learning to love Linked Data

Linked data has been a topic of discussion at successive Open Data Camps. So at Open Data Camp 4 in Cardiff, Jen Williams of Networked Planet whipped through the basics.

Linked Data at Open Data Camp

“When people talk about linked data they are talking about putting it into a statement,” she said. “So in a normal spreadsheet, you have a lot of columns… with linked data you start with an identifier and then go to the column header, the ‘known as’, and then you go to the value.
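The shape Williams describes (identifier, then column header, then value) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Statement` type, the `row_to_statements` helper, and the sample row are hypothetical, and real linked data would normally use URIs rather than plain strings for the identifier and the header.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One linked-data style statement: identifier, header, value."""
    identifier: str
    known_as: str
    value: str


def row_to_statements(row: dict[str, str], id_column: str) -> list[Statement]:
    """Turn one spreadsheet row into one statement per non-identifier column."""
    identifier = row[id_column]
    return [
        Statement(identifier, header, value)
        for header, value in row.items()
        if header != id_column
    ]


# Made-up spreadsheet row: column header -> cell value.
row = {"id": "school-1", "name": "Example Primary", "town": "Cardiff"}
statements = row_to_statements(row, id_column="id")
for statement in statements:
    print(statement)
```

Each row with several columns becomes several small statements that all share the same identifier, which is what lets them be linked to statements from other datasets.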