Modern uses of old data can be peculiar. None more so than the fact that builders across Europe routinely access a Scottish-held archive of aerial photographs to check whether their construction projects might run into Second World War bombs.

Holding 30 million images, dating back almost a century, the National Collection of Aerial Photography is a world-renowned resource that is currently being digitised and is hosted by the Edinburgh International Data Facility (EIDF).

“It is a unique repository of data that is actually used commercially around Europe,” said Professor Mark Parsons, director of the Edinburgh Parallel Computing Centre (EPCC) at Edinburgh University, which developed the data facility.

“If you’re going to build a building in Germany, or any of the areas of central Europe that were heavily bombed in the Second World War, you use these datasets to understand what was dropped in the vicinity of the building you’re about to build,” he said.

The historic dataset, which is being fully digitised in a process that could take another 20 years, is also being “tidied up” using modern algorithms, said Parsons, highlighting how new technologies can be applied to improve archival material.

Parsons talked about this work as part of a presentation at Futurescot’s second annual Cloud, Data & AI: Transforming Public Services conference in March.

He told delegates how the university’s vast data hosting platforms are being made available to healthcare researchers through the DataLoch project, via four regional “safe havens” in Edinburgh, Glasgow, Aberdeen and Dundee, which use live NHS operational data to help clinicians discover new insights into patient health.

As the data is “unconsented”, it never leaves the platform, but it is fully anonymised and protected to high cybersecurity standards. 

Parsons revealed that the supercomputing facilities available via EPCC face about 165,000 failed login attempts or attacks every day. “We’re just living in this very, very aggressive world that we’re having to deal with now,” he said.

He said that data projects exist either for people who want to analyse datasets held in certain environments, or for those who want to “expose” data that they’ve created. The former was the easy part.

But he said: “As soon as you start exposing data and making it easier for people to access, you’re suddenly dealing with a whole load of cybersecurity issues that you need to care deeply about in the current world we live in.”

Notwithstanding the challenges, the centre has a long track record of working with industry in data science and supercomputing. 

In recent years, with government city region deal funding, the university has set up the £80 million EIDF, a private cloud environment which allows companies and the public sector to host various datasets, with software that allows them to “experiment” with technologies including AI and large language model (LLM) training.

He said: “Over time, what we’re doing has changed, but the fundamental idea has not; one of the key projects we’re supporting is a regional internet of things project run by a different group in the university. 

“That’s been very successful at taking the IoT out to schools, to get school kids interested in that, and working with local councils to add IoT devices around all the services that they provide.”

Another project, with Forestry and Land Scotland, uses intelligent algorithms to help the organisation discover new insights into storm damage following Storm Arwen in November 2021.

Parsons said that data management issues with large projects are often complex, and that anything needing more than 100 terabytes of storage requires “thought” before commencing.

He also warned against the tendency in some public sector organisations to begin projects which have good intentions but little lasting impact.

“We need to deliver solutions, particularly with the public sector, that can then be used by the public sector,” he said. 

“That might mean sometimes that a company is providing that solution into the government or hospitals, or whatever, but we need to develop what we’re doing in such a way that it’s not just a one-off. I’ve done many, many medical projects over the years which were exactly like that.”

He added: “They were one-off, proof of concepts – which were great fun and I enjoyed doing them, but they’re not actually terribly useful, as they don’t transform our public services. 

“All of these things, trust, time, defining the problem, having a common purpose and context are really important.”

As well as the public sector, the EIDF is keen to work with more small companies. Parsons said: “We’re here to do good for the Scottish economy.”

Professor Ana Basiri, director of the Centre for Data Science and AI at Glasgow University, was also among the speakers at the conference. She said the Glasgow centre was set up to start working on some of the big societal challenges, describing how in the city life expectancy can drop by 15 years from the affluent west to the poorer eastern suburbs. 

She said that data can help policymakers start to focus more deeply on problems such as health inequalities, which are leaving some communities behind.

Basiri said: “The reason that I like this is because grand challenges require collaborative effort. If they didn’t, and one discipline could solve it, or one sector could solve it, it would have been solved by now. It requires collective work.

“So, we need to talk about data because I believe that any decision that is made needs to be backed by data, otherwise it is just an opinion,” she added. 

“And that’s great, you can have your own opinion, but you can’t act and measure the quality and impact of that [decision] if it is not backed by a quantifiable measure of data.”

However, she warned that before the power of AI is unleashed, governments need to start actively thinking about regulation.

“Some people have got very skewed views about how AI is going to solve all my problems, or it is going to kill me and take my job,” she said. “Well, it’s probably something in between.”

Basiri added: “Government needs to do some sort of regulation which doesn’t stop the industry and the public sector from doing fast-changing AI, but make it more organised and standardised.

“I think this is really missing from ensuring the public receive the best services that they deserve. At the end of the day, it is the government that needs to ensure that the benefits outweigh the risks.”