SDSW23: The importance of a "data-first" approach
The CEO of 1Spatial, Claire Milverton, emphasises the importance of a "data first" approach in order to fully leverage the potential of AI and digital twins. She highlights three critical issues: buy-in from executives, data quality, and data maintenance. Milverton suggests solutions such as demonstrating the return on investment to gain executive support, automating data quality processes, and implementing data gateways to ensure only good quality data is used. She also introduces the 1Spatial platform, which helps organisations manage and improve their data.
Welcome to Smarter Data, Smarter World 2023 and it is lovely to see so many of you here today.
I was slightly concerned this morning when I turned on the breakfast telly and all the reports were saying “Everybody just stay at home”. I was thinking - No! Everybody get on the train to Smarter Data, Smarter World.
So I'm going to start today with something quite interesting. Back in 2019, the Spanish government decided to invest in some shiny new trains for the northern part of Spain. Four years later, earlier this year, this hit the national headlines: "Spain officials quit over trains that were too wide for tunnels". This cost the Spanish government over €260 million. Quite an eye-watering amount. Because of this, these lovely trains have now been delayed to 2026. And guess what the issue was, why they didn't fit through the tunnels? It was because of the wrong track data. So all the focus had been put on these shiny new trains. They hadn't really thought about the basics. The data. And we're all here today to go through the topic of the future of data, and I know for a lot of us, when we think of the future of data, the thing that first pops into our mind is artificial intelligence and digital twins. And yes, absolutely they are the future and they are important. But we often forget that their success is completely underpinned by good quality data. And that's why at 1Spatial, we take a data first approach and that's what we will be talking about this morning.
So what is a data first approach? Well, the golden pyramid summarises this very nicely. At the top you have all the advanced practices such as AI, but underneath it you can see it's underpinned by many, many components. And we will be having a look at a number of them during the course of today. But I'm going to cover what I think, and what 1Spatial thinks, are three critical issues in terms of a data first approach. I'm then going to have a look at some solutions to those issues and then take you through where we’re working with our customers on some of these issues.
So first of all, what are these issues? Well, the first one is Buy-in, and I'm actually going to go with Buy-in from the bosses. So the bosses actually do really like a lot of the shiny stuff, the digital twins, the artificial intelligence, and often we go in to CTOs, but they're always worried about the systems first approach. So there is this thing about thinking about the systems, and how they are going to do these things, rather than the data first. So really they're considering data second rather than data first. Yet when you actually look at the stats, 75% of executives do not have a high level of trust in their data. So they're looking at this issue from the wrong end of the telescope. There's a real conflict here. They want the shiny stuff, but they don't want to invest in the data.
Vikram Chatterjee from Forbes really summarises this nicely in the article that he wrote in February this year. He says that data quality is the real bottleneck in AI adoption. Getting highly accurate AI relies on one thing - good quality data - and a 90% accurate model might be alright for show and tell. But is it really good enough to have a one in ten issue when it's life threatening? I know a number of us in the room today deal with emergency services data or vulnerable people data or critical assets. Is it good enough that one in ten times it's going to be wrong? And he also makes the point that, despite the justified fanfare and excitement, AI is still actually really hard to get right for real world applications. And we will look at that during the course of the day.
At 1Spatial, we really focus on a rules based approach to get data right. The data is either right or wrong, depending on the rule. Then looking at data quality, the second key issue. Now this, we all know, is the most important one, but unfortunately it is probably not the most interesting one. And often we find that there are people who work in silos using their own tools and methodologies to check the data and wrangle the data so they can get it into systems. But there's a big risk here that you're relying on one person to do this. When they leave, all that information is in their head. That's not a very good way to have good data governance across your organisation.
And then the next issue is around data maintenance. Companies spend a lot of money on getting their data clean, but if you don't maintain that data quality over time, all that investment is eroded. You really need to work out a sustainable approach to that data quality.
Okay, so let's have a look at some solutions to some of these issues. So Buy-in from the bosses. Well, you just have to sell them the data first vision and the only way you're going to do that is by return on investment and money - show them the benefits, get a data project going, a proof of concept or a trial. You can do that internally or get some external help. Prove it with one department or one dataset, really empower that data owner, and then demonstrate the return on investment from that project. Take a cut of the data at day one and then show it at day 30 to show the improvements, monetise the metrics around that improvement and really quantify the benefits of high quality data. I can assure you, if you have enterprise data quality across your entire organisation that you can then use in multiple systems, the economies of scale are vast.
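As a rough illustration of the day one versus day 30 comparison described above, the sketch below computes a pass rate for two snapshots of the same dataset and puts a monetary figure on the difference. Everything in it is an assumption made for illustration: the function names, the record counts and the cost per defective record are hypothetical and are not figures from the talk.

```python
def pass_rate(passed: int, total: int) -> float:
    """Share of records that satisfy every rule, from 0.0 to 1.0."""
    if total <= 0:
        raise ValueError("total must be positive")
    return passed / total


def improvement_report(day1_passed: int, day30_passed: int,
                       total: int, cost_per_defect: float) -> dict:
    """Compare two snapshots of the same dataset and monetise the change."""
    fixed = day30_passed - day1_passed
    return {
        "day1_rate": pass_rate(day1_passed, total),
        "day30_rate": pass_rate(day30_passed, total),
        "records_fixed": fixed,
        # Assumes every defective record costs the same amount to live with.
        "estimated_saving": fixed * cost_per_defect,
    }


# Hypothetical numbers: 10,000 records, 7,200 clean on day one, 9,600 clean
# on day 30, and an assumed cost of 15 (any currency) per defective record.
report = improvement_report(7200, 9600, 10000, 15)
```

A flat cost per defect is the simplest possible model; in practice the cost of a bad record would probably vary by asset type and by the system that consumes it.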
Okay, so let's look at the data quality then. So I've got three words around data quality. It's automate, automate, and you've guessed it, automate. Automate absolutely everything you can, and then leave the bits that you can't to the people. I mentioned a minute ago, we use a rules based approach at 1Spatial to ensure data quality.
And I’ll give you a quick example of that now. So here we have two land parcels that are overlapping: a boundary overlaps. So how would I create a rule to automate the checking of that? We would say a parcel of land should not overlap with another parcel of land. So if it does, go and investigate it. But we might want to build some parameters into that, because maybe we don't really care about all the overlaps, but maybe we care about one that's greater than 20 centimetres. So a parcel of land should not overlap with another parcel of land where the overlap is greater than 20 centimetres. And then you just investigate where the data doesn't comply with the rules. So you're not sending loads of people out, checking loads of data. You only go and check where the data doesn't comply. And then the next thing is a rules catalog. I can't emphasise enough how important we think this is. If you have a rules catalog to manage the rules over time, then if the person leaves, it doesn't matter, because you've got a list of the rules nicely, succinctly laid out. You can touch on the specific rule in the catalog, and then you can see what the detail is. This really stops this issue of silo working. You have visibility across your enterprise.
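The parcel rule and the rules catalog described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the 1Spatial rules engine or its API: parcels are simplified to axis-aligned rectangles measured in metres, "overlap greater than 20 centimetres" is interpreted as the narrowest extent of the shared area, and the names `Parcel`, `Rule`, `CATALOG` and the rule identifier `PARCEL-001` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Parcel:
    """A land parcel simplified to an axis-aligned rectangle, in metres."""
    parcel_id: str
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def overlap_depth(a: Parcel, b: Parcel) -> float:
    """Narrowest extent of the shared area; 0.0 if the parcels do not overlap."""
    dx = min(a.max_x, b.max_x) - max(a.min_x, b.min_x)
    dy = min(a.max_y, b.max_y) - max(a.min_y, b.min_y)
    if dx <= 0 or dy <= 0:
        return 0.0
    return min(dx, dy)


@dataclass(frozen=True)
class Rule:
    rule_id: str
    description: str
    check: Callable[[Parcel, Parcel], bool]  # True means the pair complies


TOLERANCE_M = 0.20  # the 20 centimetre parameter from the example

# The catalog keeps every rule in one visible place, keyed by identifier,
# so the knowledge does not live in one person's head.
CATALOG: Dict[str, Rule] = {
    "PARCEL-001": Rule(
        "PARCEL-001",
        "A parcel of land should not overlap another parcel by more than 20 cm.",
        lambda a, b: overlap_depth(a, b) <= TOLERANCE_M,
    ),
}


def find_violations(parcels: List[Parcel]) -> List[Tuple[str, str, str]]:
    """Return (rule_id, parcel_id, parcel_id) for each pair that breaks a rule."""
    violations = []
    for i, a in enumerate(parcels):
        for b in parcels[i + 1:]:
            for rule in CATALOG.values():
                if not rule.check(a, b):
                    violations.append((rule.rule_id, a.parcel_id, b.parcel_id))
    return violations


parcels = [
    Parcel("A", 0.0, 0.0, 10.0, 10.0),
    Parcel("B", 9.9, 0.0, 20.0, 10.0),   # overlaps A by about 0.1 m: tolerated
    Parcel("C", 19.5, 0.0, 30.0, 10.0),  # overlaps B by 0.5 m: investigate
]
violations = find_violations(parcels)
```

Only the pair B and C is reported, which is the point of the approach: people investigate the exceptions rather than checking every parcel. Real parcel boundaries are arbitrary polygons, so a production check would need a proper geometry library rather than rectangles.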
And then what about maintenance of data? Well, it's all about data gateways. Only let in the good data and information and keep out the bad stuff. And this is particularly important when you're bringing in data from a supply chain, whether that's internally or, probably more importantly, externally. Okay, so now let's have a look at some of the examples of where we're working with our customers on some of these issues.
But before I do that, I'm going to do a quick run through of the 1Spatial platform. Now, a number of you will be familiar with this, but there's a number of new people in the audience, so I'll just spend a few minutes on this. So at the lowest levels, we've got all the data that we can take in from different sources, spatial data and non-spatial data, and we don't care what system it’s in or what file format, we’re data and system agnostic. And then we bring that in through our 1Data Gateway and then we can get going on the data with our patented rules engine 1Integrate. And it's at this point we bring in the specific rules that the data must comply with. We can get going on the data, then we validate the data. We can then clean your data when it's incorrect. We can synchronise the data across your entire data estate. We can update your data for the changes as it's ongoing. And then we can do some observations on the data, which I'll cover a bit later. And then you can either put your data back into your sources and systems or send it wherever you want, because the data is now clean. Into whatever business application you want to put it into.
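To make the gateway idea concrete, here is a minimal sketch of a function that admits only records passing every rule and returns the rest with the names of the rules they failed. It is not the 1Data Gateway or 1Integrate API; the record fields (`asset_id`, `depth_m`), the rule names and the `gateway` function are hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Rule = Callable[[Record], bool]  # True means the record complies

# Hypothetical rules; a real deployment would load these from a rules catalog.
RULES: Dict[str, Rule] = {
    "has_asset_id": lambda r: bool(r.get("asset_id")),
    "depth_is_non_negative": lambda r: (
        isinstance(r.get("depth_m"), (int, float)) and r["depth_m"] >= 0
    ),
}


def gateway(
    submission: List[Record],
) -> Tuple[List[Record], List[Tuple[Record, List[str]]]]:
    """Split a submission into accepted records and rejected ones with reasons."""
    accepted, rejected = [], []
    for record in submission:
        failed = [name for name, rule in RULES.items() if not rule(record)]
        if failed:
            rejected.append((record, failed))  # sent back to the supplier
        else:
            accepted.append(record)            # allowed into the data estate
    return accepted, rejected


submission = [
    {"asset_id": "P-1", "depth_m": 1.2},
    {"asset_id": "", "depth_m": 0.8},      # missing identifier
    {"asset_id": "P-3", "depth_m": -2.0},  # impossible depth
]
accepted, rejected = gateway(submission)
```

Returning the failed rule names alongside each rejected record lets the supplier fix and resubmit their own data, rather than leaving the clean-up to the organisation receiving it.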
So the first one I'm going to cover, to demonstrate Buy-in, is actually with our own product, 1Streetworks. So for those of you who don't know about 1Streetworks, it's a way to automate traffic management plans. What a traffic management plan is: if you want to dig up the road, you have to send to a local authority a little plan which says - This is where I'm going to dig the hole, these are the traffic lights, these are the cones, these are the slowing down signs. And currently that's done very manually or it might be drawn in a CAD tool. But what we've done is automate the whole process and we've done it using our platform. So we bring in data from a number of different sources, including Ordnance Survey. We're actually taking in much more data now and it's a really rich platform, but I'm going to leave that for Andy Fennell to talk about later on. But we bring it into our platform and there is a rule book, so I talked about rules - so important - there's a red rule book that says this is how the traffic management plans need to work. Once it's gone through that system, in a few minutes you will have an automated traffic management plan, which in the past might have taken hours, days, weeks, months. Quite transformational.
You might think “Why am I talking about this for the Buy-in example?”. So we've been working with UK Power Networks over the last three months doing a trial with them and Surrey County Council to get that Buy-in. I'm really pleased to have Paul Dooley here today from UK Power Networks, who's going to be talking about that trial in more detail. And you know the whole purpose of that trial is to get the Buy-in, get the Buy-in from the people that are using it, and then get the Buy-in from the bosses as well about the metrics and the benefits to the organisation. So really looking forward to that presentation later on.
So that's one example of getting the Buy-in. Another example is with the National Underground Asset Register, and unfortunately, Chris Chambers can't be here with us today, but Amy is going to be taking his slot, so thank you very much, Amy, for doing that. But the reason why I brought this example to the fore for Buy-in is because this project didn't just go ahead because people decided it would be a good thing to do. There was a proof of concept for this in 2019, where the return on the investment had to be established before it would go forward, and the Buy-in from the bosses, here, was from the government. And this is looking to deliver at least 350 million of economic growth annually to the UK. And just really to complete this, we are also involved in this massive project from a data perspective. You know, we're bringing in the data here from all the utilities, local authorities, different asset providers. So there are over 650 asset providers, and we're using our platform to bring their data in through the 1Data Gateway, and the rule book in this case is the specific NUAR rules. And then that goes into the NUAR system. And as I said, Amy will be talking about this in a bit more detail later on.
So there's two examples of Buy-in. Okay, so let's have a look at automate now. Now, this one I think is a fantastic example. The Environment Agency I think are really at the forefront of data governance and data management. So they've published their asset categories and their data requirements on their website because they bring in a lot of data from the supply chain. So if you're a supplier, you know exactly what you need to do. And just to expand on one of the categories, defence, there you can see they give all the details about the floodgates and then all the attributes. And that's perfect for us at 1Spatial, because what we then do is we take that and we put it into the data catalog. And the whizzy thing that we've really done with the Environment Agency is that we've done an automation. So every time they update their data in their system, it automatically updates our catalog. So we're completely in sync, we’re never letting the bad data in. Then you can see the rules down the side of the catalog. You can change them, you can update them, it's all synchronised and you've got the detail of each rule when you touch onto the individual rule itself.
So a fantastic example there of a data catalog. Now I'm going to talk about some work that we've done with Hunter Water, which is an Australian utility. Now they've been undergoing a project, getting from their old ESRI system to the new ESRI system, the Utility Network Model, and they spent over a year and a lot of money trying to migrate their data to the system and they basically just couldn't do it. These expert systems expect a very high level of data quality. If you don't have it, they don't work. And what we did with Hunter Water is we took their data and we ran it through some rules. Day one, we showed them - there are the issues with your data, off they went, fixed it. 30 days later - all green results and you can see the before and after. They just think our technology is fantastic and we're going to be working with them going forward. So it's actually showing the Buy-in here as well, but it's a really good example of automate. And also what we can do with our platform is do observations on the data. This is also useful for management and bosses: you can see what rules have failed the most, you can see how many times people have been in and out of the systems, you can see how many submissions they've done. So there's some really good KPI data you can get as well.
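The "which rules have failed the most" observation can be derived from a simple log of validation failures. The sketch below assumes a hypothetical log with one entry per failure; the supplier names and rule identifiers are invented for illustration and do not come from any real dataset or from the 1Spatial platform.

```python
from collections import Counter

# Hypothetical validation log: one (supplier, failed_rule) entry per failure.
failures = [
    ("supplier_a", "PARCEL-001"),
    ("supplier_a", "PIPE-007"),
    ("supplier_b", "PARCEL-001"),
    ("supplier_c", "PARCEL-001"),
    ("supplier_b", "PIPE-007"),
    ("supplier_a", "VALVE-003"),
]

# Count failures per rule and per supplier.
failures_by_rule = Counter(rule for _, rule in failures)
failures_by_supplier = Counter(supplier for supplier, _ in failures)

# The two rules that fail most often, as (rule_id, count) pairs.
worst_rules = failures_by_rule.most_common(2)
```

Counts like these show where training or a clearer rule description would have the most effect, and they give management a simple KPI to track over time.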
Okay, then we're going to go on to maintenance, and this really is all around data gateways and rules, and we're going to go over to the other side of the pond now. We're going to go over to the US, to the State of Michigan. For the State of Michigan, we're helping them create a spatial infrastructure for the state's decision making needs. And how we're doing that is we're helping them bring in data from over a hundred suppliers and for 40 data layers - it's a really rich dataset. You know, all the different departments: Department of Transport, land parcel data. Again, this is coming in through the data gateway and we're using the rules that they have for the specific data sets and their own Michigan rules. And you know, the thing here that's so important for Michigan is that they're not sitting at the centre doing all the data validation themselves. They're pushing it out to the supply chain so they get real economies of scale here, real cash benefits to them. So it goes through our platform, goes through our rules engine, and then they publish this. This is published now, and it's on an ongoing basis, so they must keep this data up to date. So I mean, it's so important to have that data gateway in place. And the final example on the data gateway is for NG9-1-1 emergency services, and we're now doing this for eight states across the US. It's the same format really in terms of how we do the configuration of our platform, bringing in data from all the cities and counties, and there are specific rules here for emergency services data in the US called the NENA NG9-1-1 rules. And the thing that's pretty exciting that we're doing here, and we've just launched, is a SaaS platform. So we've put the vanilla NG9-1-1 rules into the cloud and what that enables the cities and counties to do is to validate their data as many times as they want and to have control over their own data.
So when they submit it up to the state, you know, it's all shiny and it's all nice and it's all cleansed. So this is, I think, a really good way of checking the data at every juncture. And, you know, it's a really good configuration in terms of an architecture that I think could be set up in many places, because as you're always moving data around, there's always a chance of getting errors.
Okay, so what are the key aspects of a data first approach? So just to summarise: Number one, Buy-in, get the Buy-in from the bosses. Number two, data quality: automation, rules and data catalogs. And number three, maintenance. Guard your data with a data gateway every step of the way. So I just want to take a few minutes now to think about the future, and I'm going to actually go back to the street works example. So currently we have around two and a half million street works in the UK every year, but we know that that's going to increase with electrification, fibre, more roads to maintain. There are estimates it is going to go up to 4 million street works a year. So how are we going to actually manage that? We're already, you know, really at breaking point in some areas.
Well, let's think about a vision for the future. A vision where street works is fully automated, with something like 1Streetworks. Where you can create automated plans in minutes that you know are compliant, that you can use for planning and delivery of the road works. And imagine that comes together with the National Underground Asset Register. So the operative that's digging the hole in the road also knows about all the pipes that are underground. So the risk of accidental strikes is less. And imagine you put all of this into the cloud, so the whole of the supply chain can see all of this. Wouldn't that be phenomenal? And wouldn't that bring significant economic value to the UK? And that really shows the benefit of a data first approach. And at 1Spatial, we devour difficult data. We love cleansing it, we love helping our customers. We want to help you and your bosses trust in the data and help you get a data first approach. So come and talk to us, you can come and talk to us any time. But we're here today until 7 o’clock this evening and we would love to speak to you. So that's it from me.
Thank you very much. And I hope you have an absolutely wonderful day.