SDSW23: Practical Implementation of Spatial Data Management
tl;dr
The speaker, Simon Ashby, discusses the practical implementation of spatial data management, emphasising the need to break down the complex problem into manageable chunks using models. He highlights the importance of data governance and outlines a model with 11 components for spatial data management.
Ashby then introduces a data lifecycle model, stressing the significance of archiving and destroying data. He connects these models to create a comprehensive framework for spatial data management implementation, addressing functions, people, processes, and tools for each component.
He also stresses that spatial data management must be tied to business requirements and that data quality has to be maintained throughout the lifecycle. The presentation calls for a holistic approach to implementation that combines top-down and bottom-up perspectives.
Transcript
Morning, everybody. I'm Simon Ashby, Enterprise Solution Architect at 1Spatial, and I'm going to talk about the practical implementation of spatial data management. Really it's moving on from what Mark and Claire have been talking about. It's moving into how we do the job: how do we address the challenge we've been given by the business to get value out of our data, to actually do this data management work? Data management is a big and complex problem. I'm sure you don't wake up in the morning and think, "Data management. Oh dear. I'll go back to bed." I sometimes do that, but the dog won't let me. But I get up in the morning and I think: data management, big and complex problem. How do I deal with it? I've got to find a way of breaking this problem down into manageable chunks, chunks which mean that when I get up in the morning and go to work, I'll go, okay, I know what I'm going to do. I'm not going to solve this problem in one go. It's going to take time. It might take decades, might take years, might take months, but it's going to take time. So how do I do this? This is where models come in.
I'll use a model-based approach. But models come with health warnings, and that's what this slide is about. Those of a certain age will remember the 15th of October 1987, and Michael Fish during the lunchtime weather forecast at 13:00 hours (as I say, and Mark would know, that's just after lunchtime), when he assured the nation that everything was fine. Don't worry. You might have heard about a hurricane. Really not a problem. We woke up the next morning and 15 million trees had been blown over. There was chaos in the country. Michael Fish based his forecast on a weather prediction model, and that weather prediction model didn't work in that instance. So what we have to realise with models is that they're not 100 percent accurate. A person called George Box recognised this, and he came up with the saying "All models are wrong, but some are useful." So when you use a model, look at the utility of the model. You are never going to get it 100 percent accurate. If you are, you're so much into that analysis that you're not actually doing the other job, which is data management.
So find a model which works for you, which you can utilise. Michael Fish's model clearly didn't work for him. The Met Office went away and worked out why it didn't work for them. They bought a really big computer, which is great for us, but they also went out and captured a lot more input data, and they updated their models, so that really big computer had something to do. They solved the problem and turned that model into something very useful. And now Mark is a beneficiary of that model, when he was sailing down from the UK to South America. And no, I'm not jealous at all, Mark. Okay. So, models. Models are where I go when I get up in the morning and ask, how do I solve this problem? I come to a model. The first model I think of is subdividing the big problem into more manageable chunks. This is the architect in me. So what are the main system building blocks in a spatial information system? I tend to see three of them. There's the spatial data management block, there's the GIS block, and then there's the enterprise ETL and automation block.
Now this model is not accurate, but I can use it, I can utilise it, and it works for me. Yes, we have data management in GIS. Yes, we have data management in spatial data ETL and automation. But I'm putting those to one side. I'm focusing on the spatial data management bit, because that to me is the big challenge I've got to deal with. So I have this high-level model; I've subdivided the problem into three more manageable chunks. I now move down one layer further, and this is where DAMA comes in. The DAMA model now helps me divide that spatial data management chunk into another 11 chunks or components. These components allow me to start getting into the detail and understanding how I can practically implement the spatial data management solution. So I have this problem, and now I've got 11 areas I can focus on. But if we look at this model, we see data governance is at the centre. Data governance is the heart of what we do. If you haven't got data governance, the other ten bits are just going to be running around like headless chickens, a bit like my dog.
Data governance: how do we get data governance going? We've got to get in there with people, process and tools. We've got to think about what people, process and tools we need for data governance. Once we start thinking about that, we can look at the other ones around the edge, the other ten. We've got our planning and design components: data architecture, data modelling and data security. I've got to look at those three chunks when we're starting a project out, to say what the scope of the work is and what design I'm coming up with. Then I've got the foundational functions of data quality and metadata. What am I actually dealing with? If I haven't got metadata, I don't know what data I've got. The number of times you go in and say, "What data do you have?" and someone says, "Well, ask Fred over there, or ask Joe. They know, because they manage that database." That's not how we should be working. It should be: go and look at the data catalogue, do a discovery and work out who actually owns the data. That's what metadata's about. And data quality: once I've found that data, how good is it? Can I actually use it? We've all seen examples of data where we go, phoar, I wonder who captured that. Data quality is critical. You've got to understand where you are starting from with your data quality. So I've got my eleven chunks. I've got my model. How do I actually use this model to go forward?
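Editorial sketch: the catalogue lookup described above (discover a dataset, then find out who owns it) could look something like the following. This is a minimal illustration only, assuming a hypothetical in-memory catalogue. The `CatalogueEntry` fields, the dataset names, the team names and the `discover` and `find_owner` helpers are all invented for the example and do not come from the talk or from any particular product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CatalogueEntry:
    """Hypothetical metadata record: what the dataset is and who owns it."""
    name: str
    owner: str
    crs: str
    description: str


# Illustrative entries only; a real catalogue would live in a metadata store.
CATALOGUE = [
    CatalogueEntry("road_centrelines", "Highways team", "EPSG:27700",
                   "Road network centrelines"),
    CatalogueEntry("asset_survey_points", "Asset management team", "EPSG:4326",
                   "GPS survey points"),
]


def discover(keyword: str) -> list[CatalogueEntry]:
    """Return entries whose name or description mentions the keyword."""
    needle = keyword.lower()
    return [
        entry for entry in CATALOGUE
        if needle in entry.name.lower() or needle in entry.description.lower()
    ]


def find_owner(dataset_name: str) -> Optional[str]:
    """Look up the recorded owner instead of asking around the office."""
    for entry in CATALOGUE:
        if entry.name == dataset_name:
            return entry.owner
    return None
```

The point of the sketch is that ownership is a recorded attribute of the dataset, so the answer to "what data do you have?" does not depend on who happens to be in the room.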
Well, I need to know how I can implement this over time. The structural model I've just shown now needs to be implemented over time. So we have another model, the data lifecycle model, which runs from the planning and design stage all the way to archive and destroy. Now if there's one area people tend to forget, it's archive and destroy. Amazon loves it. Microsoft loves it, because you pay a lot of money for data being held on a server which you never use. That data is just sitting there. It hasn't been archived, it hasn't been destroyed. We must remember to archive and destroy. But starting out at the start: planning and design. If I'm looking at data quality, for example, what do I need to do? What standards am I working to? What metadata standards am I capturing to, to make sure that I'm only capturing the data once, that I capture the data in an efficient way, and that I capture it in the right coordinate reference system? I'm in OSGB36, not in WGS84, for example. Mark, that's a mapping thing. Hope you noticed. Standards.
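Editorial sketch: one cheap way to catch the OSGB36 versus WGS84 mix-up mentioned above is a plausibility check of coordinates against the coordinate reference system declared in the metadata. The code below is a rough heuristic under stated assumptions, not a substitute for a proper reprojection library. It assumes `x` is easting or longitude and `y` is northing or latitude, it uses the nominal British National Grid extent of 0 to 700 km east and 0 to 1,300 km north, and the function names are hypothetical.

```python
# EPSG:27700 is OSGB36 / British National Grid (metres).
# EPSG:4326 is WGS 84 (degrees); x = longitude, y = latitude is assumed here.
BNG_MAX_EASTING = 700_000.0
BNG_MAX_NORTHING = 1_300_000.0


def looks_like_degrees(x: float, y: float) -> bool:
    """True when the pair fits inside the valid longitude/latitude range."""
    return abs(x) <= 180.0 and abs(y) <= 90.0


def crs_mismatch(x: float, y: float, declared_crs: str) -> bool:
    """Return True when the coordinates look inconsistent with the declared CRS."""
    if declared_crs == "EPSG:27700":
        in_grid = 0.0 <= x <= BNG_MAX_EASTING and 0.0 <= y <= BNG_MAX_NORTHING
        # Values small enough to be degrees are almost certainly not grid metres.
        return looks_like_degrees(x, y) or not in_grid
    if declared_crs == "EPSG:4326":
        return not looks_like_degrees(x, y)
    raise ValueError(f"No plausibility rule for {declared_crs}")
```

For example, `crs_mismatch(-1.55, 53.8, "EPSG:27700")` flags a longitude/latitude pair that has been labelled as British National Grid. The check is deliberately coarse: it catches gross mislabelling, not small datum shifts.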
So am I using the right standard? Am I using the right quality? Again, we've all got examples where the wrong geospatial coordinate reference system was used and you end up with the wrong result. Then ingest and store: I bring the data into the system. Is that data actually going through that gateway that Claire mentioned? Have I got my data quality right? Is it coming into the enterprise geospatial database, or my file geodatabase, or whatever, at the correct quality? I can then use that data and publish the results with confidence, i.e. my analysis is correct. My database has got the right data in it. I can go and do clever stuff with it. And that's what we all want to do.
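Editorial sketch: the gateway idea above, where data is only ingested once it passes quality checks, can be illustrated with a small rules-based filter. This is a minimal sketch under assumptions: the record layout (a dictionary with `id`, `crs` and `geometry` keys), the three rules and the `gateway` function are hypothetical and are not the speaker's or any vendor's implementation.

```python
from typing import Callable

Record = dict[str, object]
Rule = tuple[str, Callable[[Record], bool]]

# Each rule is a readable name plus a check that returns True when the record passes.
RULES: list[Rule] = [
    ("has an identifier", lambda r: bool(r.get("id"))),
    ("declares a known CRS", lambda r: r.get("crs") in {"EPSG:27700", "EPSG:4326"}),
    ("has a non-empty geometry", lambda r: bool(r.get("geometry"))),
]


def gateway(records: list[Record]) -> tuple[list[Record], list[tuple[Record, list[str]]]]:
    """Split records into accepted ones and rejected ones with their failed rules."""
    accepted: list[Record] = []
    rejected: list[tuple[Record, list[str]]] = []
    for record in records:
        failures = [name for name, check in RULES if not check(record)]
        if failures:
            rejected.append((record, failures))
        else:
            accepted.append(record)
    return accepted, rejected
```

Keeping the failed rule names alongside each rejected record means the rejection can be reported back to whoever captured the data, instead of the record silently disappearing.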
We want to do this clever stuff. We're geospatial professionals at the end of it. We love maps. We love knowing whether my boat is going to be ahead of the next boat, for example. But I've also got to maintain that data. This gets back to making sure that I don't have time decay on that data: that my data retains its quality, or it should improve in quality. And at the end, archive and destroy. But this is all underpinned by data governance. So if I tie these two models together, my timeline or life cycle and my structural model, I end up with something like this, which is quite complex. Well, it's very complex. This is now getting down to the level of the spatial data management implementation. I'm getting down into being able to align the functions or components together, to work out what I should be doing first and where that will then take me.
So at the outset, data governance. If I haven't got a governance team, how am I going to actually get this process going? Data architecture, data modelling and design, data security: what have I got to do with those? What are the requirements coming from the data governance team that I've got to implement in them? What is the data strategy given by the business that I should be following? We have to realise, and this is what enterprise architecture is about, that we are linking business to technology. We are not working in isolation. We're not just managing spatial data in isolation. That spatial data has got to meet the business requirement, or else we won't be getting the value from the data. We won't be getting the ROI that Claire wishes us to get.
Again, data quality and metadata, the foundational functions, run throughout. If I don't know what I'm starting with, how do I know where I'm going? Once I've got all this right, I then get into the life cycle areas. I get my data integration, interoperability, my warehousing. This is where it moves over into the GIS area, the GIS model. You know, the interfaces. Do I understand those interfaces correctly? How do I work with the GIS team to say this database needs to be set up in this way, so I can actually put my high-quality data into it once it's gone through my spatial data management? Now, in 15 minutes, I can't go through those 600 pages of the DAMA Body of Knowledge.
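Editorial sketch: tying the lifecycle model to the structural model, as described above, amounts to asking which components are active at each lifecycle stage. The mapping below is an illustration only. The stage and component names are taken from the talk, and the "plan and design" row and the run-throughout list follow what the speaker says, but the placement of integration and warehousing against specific stages is an assumption made for the example, as is the `components_for` helper.

```python
STAGES = [
    "plan and design",
    "ingest and store",
    "use and publish",
    "maintain",
    "archive and destroy",
]

# Governance underpins everything; quality and metadata are described as running throughout.
THROUGHOUT = ["data governance", "data quality", "metadata"]

# Stage-specific components. Only the first row is stated in the talk;
# the other two placements are assumptions for illustration.
STAGE_SPECIFIC = {
    "plan and design": ["data architecture", "data modelling and design", "data security"],
    "ingest and store": ["data integration and interoperability"],
    "use and publish": ["data warehousing"],
}


def components_for(stage: str) -> list[str]:
    """List the components to consider at a given lifecycle stage."""
    if stage not in STAGES:
        raise ValueError(f"Unknown lifecycle stage: {stage}")
    return THROUGHOUT + STAGE_SPECIFIC.get(stage, [])
```

Even a table this small makes the ordering explicit: the governance, quality and metadata work is present at every stage, while the design-time components are front-loaded.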
If anybody has read that cover to cover, put your hand up. Oh, sorry. I shouldn't have asked that question, should I? There's always one. But well done. We'll be asking you questions later. Okay, what's on page 31? So, an excellent book; it's an excellent reference book. So I get my functions, my people, my process and my tools for each one of those 11 components within the DAMA wheel. I have to work out what my people should be doing. For example, in governance, how do I set out my governance team? What processes should my governance team be following? How do they report back to the business, and what tools are they using in the governance team? It might be web pages or reporting. If I look at data quality, my team might be more technical. My processes will be looking at using automation tools, and my tools will be trying to automate the processes that I'm meant to be using, which are rules based.
So I'm taking you on a journey from the top, from an analytic view, down to the bottom. That is not the only direction. Nobody ever starts from a blank sheet. You start from the bottom up as well. There's probably someone in there doing it already, and that person's probably been, a bit like Mark was saying, knocking on the door for years saying, look, this is really good stuff. I'd like this to be done. Now's the time for us to be listening to those people and helping raise their views up, while at the same time, from the top down, helping set up the governance to help that data quality go through. Great. I will finish there and, Mark, I'll hand over to you. Thank you very much.