SDSW23: Panel Session
tl;dr
Professionals from various organisations discuss their roles and perspectives on data management. The participants include individuals from Ordnance Survey, the Central Digital and Data Office, DAMA, and the Office for National Statistics.
They discuss the challenges and approaches to managing, sharing, and utilizing data, as well as the value and impact of data in both public and private sectors. The conversation also touches on the importance of data literacy, education, and the broader societal implications of data management.
The professionals highlight the need to focus on the real-world impact and application of data, beyond just the technical aspects.
Transcript
Seb Lessware (SL): The first thing I was going to do was get everyone to position who they work for, their organisation and their position on data. Do you consider you work for a data organisation? What is their position on data? I'll start with David.
David Henderson (DH): Thank you and good morning, everyone. David Henderson, the Chief Geospatial Officer, Ordnance Survey. And I guess as an organisation, we do a bit of data. In my role within the organisation, one of the things I have responsibility for these days is the development of our core geospatial data assets. So that's the kind of technology estate and the data assets that form the national geospatial database. And I'll expand a little bit on that perhaps after coffee, but during this as well.
SL: Okay. And same question to Jenny.
Jenny Brooker (JB): Hi, this is Jenny Brooker, Chief Data Architect at the Central Digital and Data Office. So we're part of the Cabinet Office. We sit right at central government, with the aim to drive better use of digital and data across all of government, and that is in the widest kind of sense of the word, so all the way through to the authorities and administrations and that whole gamut. My role is very much around that. We have our digital data roadmap, which sets out the different missions, and the one that is really driven by our side of the organisation is around better data to power decision making. And so there's a real sense from that of the importance of data, and the ambition that I guess government has to do that. And particularly around data sharing, and also the use of data in the adoption of AI in the public sector.
SL: Thank you Jenny. How long has CDDO been around?
JB: 2021. Actually I know this, because we had a party for, I think it was, our second or third anniversary. So not very long. We used to be part of GDS, the Government Digital Service, and it was a recognition of the focus needed kind of internally. So if GDS were looking at those citizen-facing kind of services, it's that internal look at the actual change of government itself.
Mark Humphries (MH): Technology is continually improving. The volume of data is continually increasing, but there are certain fundamentals which don't change over time. So from the DAMA point of view, yeah, we're very much trying to capture the essence of that. The DMBoK in particular is not a recipe. Again, Simon talked about it, you know, it's a great reference book. Very few people read it from cover to cover. I've not read it from cover to cover, I have to admit. But there's a lot of wisdom in there, and understanding that wisdom and interpreting it for your organisation is really important. And that kind of leads me seamlessly into my day job, which is, you know, what I and my consultants do: we take our understanding of those fundamentals of data management, we add specific skills on data architecture, data quality, master data management, metadata management, those kind of things, and we turn that into practical solutions for our customers. So yeah, both of my roles are very, very much data orientated. When I introduce myself, and I do embarrass my sales team, I introduce myself as a professional data geek. I'm very proud of that. And seeing all sorts of data from different industries, different sectors, the good, the bad and the ugly. So, yeah, I think the day I decided to specialise in data is probably the best career decision I ever made. I would say I was in data before it was sexy. You know, it is very much the career of choice these days.
SL: And from your role in DAMA, do you have a feeling for the spread of types of organisations that are members? Are they sort of public or private sector? Are they insurance?
MH: We've seen a lot of growth in the public sector. I'm very grateful to all civil servants for taking up the DAMA mantra and promoting it. I attribute that partly to the fact that there's an awful lot of very professional people in the UK civil service who understand the importance of data and how it impacts their ability to serve citizens, to provide services and to run the whole business of government. But I think also within the civil service, there's a lot of rotation. So people will move from one department to the other and they'll take best practice with them. And I think that's what we've seen over the last 3 or 4 years from the DAMA point of view, in terms of the growth in membership. We've been able to see how some of our key members have moved from one department to the other and then, oh, suddenly that department signs up for corporate membership. So I think we have seen a lot of growth in the public sector. It is also, I think, probably 50/50 public and private in terms of membership. In terms of the private sector, we see insurance companies, we see a little bit of finance in there. It then starts to get a little bit mixed after that, it gets quite diverse. And, you know, we've got construction, we've seen more and more people coming in from those geospatial areas, facility management and that kind of area. A really good mix.
SL: Okay, okay. Thank you. And Olive, to you as well: your position on data.
Olive Powell (OP): Well, for people who don't know ONS, the Office for National Statistics, we produce all the UK statistics. So effectively ONS is just data, which is great, but it's a challenge sometimes. We are interested in producing statistics for anything that is societal, so that's people, economies, running of businesses and farming. So effectively you look at the whole spectrum, which means it's a lot of data, it's a lot of expertise. It's a lot of overlaps between teams as well. So that's quite challenging in itself. My job as the Head of Geography and Geospatial is to look after how geography supports the production of statistics. And actually geography does a lot of that. You can't have statistics without geography. It also helps with analysing data and looking at different types of statistics, and kind of sharing that. So basically geospatial is key to statistics, and we do data and geospatial data all across the UK.
SL: Thank you, so we've got a broad range, from the start of the chain, data providers, data suppliers, like with David, right through to organisations who are advising on data or building software to help with data. And I guess Olive is almost like one of the end users, not the end user, but an end user of the data who's refining it and generating more information. One of the issues that we see with data is it's hard to predict how it's going to be used. You have the sort of obvious people asking for things and the obvious customer or next intended user, but often you're trying to predict, pre-empt things. What are the approaches you can take to do that? And I'll ask that to Jenny first.
JB: I think that's a really tricky one, because I think there's so many different ways in which data fits into those kind of end uses. You know, traditionally you just look at: there's an analyst and there's a data scientist. But actually when we've started to look at where the value in data is, it's often in, you know, something completely tangential to an analyst and those kind of things. And I think that's the kind of data democracy that people are really driving in their organisations, that we've seen both in the public sector and also within industry and the private sector as well. That's a really big trend. And I think that has an effect on us understanding how that data is going to be used. And I think what we see, particularly from the data architecture perspective, is that all the traditional ways you do data modelling, how you set up your architecture, all of those things are changing to support some of those practices that we've already seen in product management: you know, you go to your user, you understand them first and you work backwards from that. And I think that's being supported now by the kind of architectural patterns that are coming up and the ways in which we do all of those kind of things.
SL: Thank you, and David, from an Ordnance Survey point of view, how do you do it?
DH: It's actually a really timely question. And I suspect there's a few of our users in the room today, and I'd love to have conversations about exactly this over lunch. In many respects, it used to be easier. And that's almost a heretical statement to make. But when the data you collected was to create a map, and you were thinking about the cartographic output of that, and about the relationship between aspects of data to produce a cartographic output, I'd maybe argue, for the purposes of this panel at least, that life was easier. And I think today we're now in a paradigm where our data is being used quite independently from the map. So the roads are being used for something, the woodland areas are being used for something, the buildings are being used for something quite different. And that's brought around a very different approach to how we govern our data, how we think about our data, but especially how we think about the end use of that data. And it's something which we've massively had to dial up, if you like, as an organisation in the last ten years: that level of customer engagement and actually user engagement, as to how exactly our data is being used in practice. And in a way, I was going to cover some of this after the tea break, as part of the presentation. But I think there's something quite important for us, in that we don't tend to use our own data, and it makes us quite unusual, I suspect. In many organisations, people tend to provision data for a process that they're ultimately undertaking themselves. We're trying to provision the right foundational data at a national level to underpin a broad range of use cases, for a very, very broad range of customers, whether they're commercial, whether they're for government or in wider society.
And that actually just forces the conversation on to user engagement and user needs, and to work out how best to provision data in a way that's useful for its intended application. The paradigm we're living through today is that in many respects our data wasn't collected, wasn't maintained, wasn't developed with some of those end uses in mind. And so we're having to rapidly re-engineer some aspects of our data, transform certain aspects of our data, in order that it can do that.
I think it might have been Mark earlier on who talked about, and certainly Simon did as well, this paradigm of data sharing. And I think, you know, one of the things that we're particularly cognisant of is being the glue, if you like, between multiple systems of customer use. So what we hear our customers say is that, above everything else, they want to share data more effectively between one another, and they want a common reference by which to do that. And, you know, that in itself is a major consideration for us: how do we describe a geography in such a way that allows people to have that common view? But that's a conversation to continue over lunch. I suspect we could probably all talk about it a lot longer, but yeah, it's a big consideration for us for sure.
SL: Yeah. And we've seen that exactly in some of the projects we do, where we might use OS MasterMap as the source for whether there is a building, and some of the information we might need for a decision is: how old is this house? And that's not something you necessarily expect the survey to capture. It might come from the census or Land Registry. So you can't answer all the questions with the data, but if you provide it in a form where you can link the information from somewhere else, then you can combine it. It's that way of managing the features and the objects and their IDs that opens up those doors, for sure. Mark, a question to you, I guess: does the DAMA book talk about that sort of approach? I mean, how do you capture requirements?
MH: Yeah, I think that's a really good example of how the DMBoK should be interpreted. When you talk about data sharing, and especially when you start talking about using data for purposes for which it wasn't originally intended, I think there's a number of things in there, and governance is one of them. But then there are things like architectural metadata: you know, you can use architecture to describe what this data means, so of course its definition. That's why it's really important to understand the definition. It's really important to understand the provenance: where does this data come from? And I think it's really important, more and more as data is being shared, to understand what the original primary purpose for this data was. And it all sort of leads into the disciplines of data architecture and data governance, but also data quality. What does it mean? What was it originally collected for? What was its original purpose? What constraints were placed around it, and how can it all come to be used? And I think one of the interesting things, and, you know, straying away from geospatial data into the realms of personal data, this becomes really, really important, is: under what circumstances was data collected in the first place? For the individual involved, what was their understanding of what you were going to do with their data? And do you actually have permission? There are various legal bases, which are very well defined. I think there's a lot of the same when it comes to personal data, and this is something which affects a lot of government data: you know, what can you actually do with that data? I think this is one of the thornier problems when you talk about sharing data, and I put it into three categories: What data can you share? What data can't you share? And also, very important and often overlooked:
What data do you actually have an obligation to share? Because if you think of things like healthcare provision or emergency services, sometimes there's also legislation which talks about a duty of care, and that duty of care can lead into, for example, the need to share data. So it's can, can't and must: I think those are three really important categories when you're looking at that. So again, back to the DMBoK, there's different elements in there to pick out. And I think that's part of the skill of an experienced data management consultant, data management professional rather. Yeah. And it's that data integration brings whole new power and problems. Like with the NUAR project, one of the issues or one of the fears is: knowing where things are buried is great, so that when you dig up the road you don't hit it. It's also a really powerful piece of information if you're a terrorist: you know where the gas main and electricity cable cross each other. So individually, lots of people have access to individual data sets. When you bring them together, they're now more powerful, so you have to be more careful about how they're used.
SL: So Olive, turning to you. Would you consider yourself to be a producer of data or simply a consumer? Or is it both?
OP: So yeah, interestingly, I was thinking all of this, what David and Mark mentioned, actually resonates quite well with ONS, and I think you summed up the question there. Yeah, we've got almost two roles. There's the production of statistics, I guess, where we provide and we answer questions that are raised by, you know, government over the next few weeks. And sometimes you have to guess what that question is going to be. But to enable the statistics to be produced, we need to be the consumer. So obviously collecting data, from what we hold already at ONS, census data for example, but also from other partners, to be able to make sure that through that data integration we get to the, you know, I'm going to quote David here, I think, and I'm really bad at quoting so I'm probably going to say it wrong, but something like: we're better than the sum of all our parts. I'm sure David hears that all the time, but that's true. You're bringing data from different sources, you kind of mash it together, you integrate it, you link it and you get to that place where you've got more powerful data. And you can answer those questions, those complex questions we couldn't answer before because there were different sets of data. And so this is where I sit, in data architecture, this is what my directorate does, and it's about acquiring that information, making sure that it is fit for purpose: is it okay to use, and to be used in a different way than we originally planned? And then how do you kind of link it together? And how does that answer those questions? And in a way, by doing that data integration piece and being a consumer, for other customers, because effectively we're a middle-man, we have to think about how do we link data, how do we enhance that integration. Obviously geospatial has a big part of it. But even then it's kind of, you know.
The other day we were talking about how to answer the question about health, looking at air particles and air quality, that kind of stuff, thermal surface level, data involved as a diversion. So, you know, on a geographic information system it's very easy, we've done it for years, but actually when you're trying to do that and scale it and use modern ways of doing things, it becomes a bit more complicated. So you need to think about mechanisms, about how you do that kind of stuff. And this is where it gets interesting. But this is where it gets stranded, so going back to your question: we're both, and in a way that's quite easy, because we can see the whole, you know, lifecycle of the data pipeline, where it starts from and where it ends, and we see the results. And I think that's how we're very focused on the value of data. This is where you can actually realise the value, because you can see what question was answered that you couldn't have answered before through that piece of work.
SL: Okay. So yeah, that value statement links back to what Claire was talking about, and it's often quite hard to do that value measurement. I mean, sometimes you see an actual outcome - you're reducing a risk, you're shortening times, you're stopping the bad things from happening. But often you need to take more of a long-term view to justify that investment. And there you're sort of trying to predict how the data might be used. David, I know we've seen some statistics come out of Ordnance Survey on, you know, the value of this data, but is that something that you're asked to think about, how to measure the value?
DH: Yes. And I guess there's two for us. I mean, one is that kind of, reports are available to read. Some people in the room may have written them at times, but you know, the economic value of national geospatial data, and you can choose some big numbers within that, and some of them might be data. I mean, Olive would be faced with the same sort of footprint from statistical data: you know, what is the value of national statistical data to society at large? And, you know, there are methods for quantifying that, and certainly qualifying it, which are large numbers. I think what is always present for us as an organisation internally is just how we continue to quantify and qualify the value of sustained investment in geospatial data infrastructure. It's not cheap to maintain data, to continue to improve it, to make a commitment to maintain it. And, you know, that business case for us is probably no different from any other organisation in the room who's maintaining a data asset for their own organisation. You know, our value comes from an expression of customer value. Are we doing the right things? Are we able to provision the right products? But at the same time, we're also driven by exactly the same metrics as others in terms of, you know, does it lead to a more efficient process? Does that allow us to collect data in the field or from remote sources? Are we able to ingest data efficiently? Do our data governance processes enable that? Are we able to create the right products and right services effectively and efficiently off the back of that? So those kind of value statements of how we do the data bit, as well as what the data is in itself, are absolutely ongoing conversations. And I think the how bit is probably a very similar assessment to anyone else in the room.
SL: Yeah, absolutely. And like you said, you mentioned there the two aspects. One is what it costs us to recapture it all, that's the cost of it. I guess, as a national mapping agency, you have this sort of national obligation, like, say, Royal Mail, as opposed to, say, a private company that could just pick major cities and forget the rest.
DH: Yeah. Although I often sort of say, actually just doing some capture for the first time in a completely new environment, where no pre-existing data exists, is actually relatively simple, it's relatively cheap. You have a specification you work with, you go at it and it's done, put it away, and you never have to come back to it. It's relatively simple. The complexity is when you have to make that interoperable with everything else and then maintain it. Yeah. And start to engage with users who then rely on it. And that's where the complexity and the cost come from, and the architecture and everything that sits around that to provision that is where the cost really sits. Yeah. So I think, you know, actually we do very little data capture today. I always think of it as being a data maintenance and a change organisation as much as anything else. That's the value we are really creating. And that's where the cost is hidden.
SL: Yeah. And then the other side of that, which is the value to the users. So I guess, Jenny, you maybe have oversight of where the data is being used. Do you see people asking for help on how to measure the value of that data?
JB: I think there's enormous interest, and not necessarily always from the people who are generating the data, particularly if they're not data organisations. But there's enormous interest sometimes from others, in terms of: that would be really valuable to us, that data that HMRC or DWP happens to create. One of the big projects that we are working on is a service called the Government Data Marketplace, which brings together what we call the shared data assets. So, going to departments and saying, here's a model for how you think ownership should be done. And part of that is you identifying the kind of assets that you create that are really valuable to others, that we think should be shared in the public sector. And that will then create the metrics as well, and we'll start seeing, because it's, you know, a complete black box at the moment, who is sharing data with whom and who is getting the best value out of that data. And if there are other things that, I guess, are being unlocked through just being able to discover that data in the first place. And back to the point earlier around, you know, having a purpose and enabling some of that sharing to happen, because I think, back to what everyone talked about earlier, when you start to bring data together, that's when you start to see the real value. And that increases now when you look at things like AI and large language models, and the ability for us, particularly as an organisation for the whole of government, but you know, wider within UK plc and those kind of things. What is the value of data that we are currently, you know, all sat on because it's just a secondary thing to what we're doing? You know, it's like the Coast Guard data on how they're saving money, but there's a whole host of data that could help with understanding England's operation of the coast.
And so those kind of bits of value, and those little nuggets, that's what we really are trying to surface. And I think we've seen really good examples of when that happens, when people are likely to really share data and it leads to better outcomes for people who are claiming their benefits or people who are avoiding benefits and things like fraud and risk and those kind of things. So we're seeing loads of great tiny use cases, but we're really trying to look at how we bring that together, and then how we show that value. And the last part of it then is, the current focus is, you know, public sector organisations sharing. But actually is there an argument to say better discovery for industry and better methods for us to share? Back to that kind of, do we know how to value our data? I don't think we do, and I think there's an argument for us to look at our limited data set that we drive industry or others to achieve them and kind of realise the value that we can.
SL: That’s dangerously like data governance and sort of talks about the, I guess, the soft skills needed because it's so much about the people in the organisation rather than the technology. Mark, do you have a take on making people data literate or helping to encourage things from soft skills point of view?
MH: Yeah, I think the soft skills are very important. I think, you know, to come back to this three-layer model: there's the data professionals, and that's very well covered by DEDA, and that continues to evolve, and should evolve, and it will continue to evolve, which is great. But there's also the two other communities, which are leaders and the whole workforce. So I think one of the really important things is ensuring that leaders across industry, across government, have a good understanding of what can and can't be done with data, a good understanding of what some of the obstacles, what some of the challenges and blockers are. But also, I think, you know, often we come across senior leaders who are a little bit wary. There's a couple of fundamental things which you can help them to understand, things like the inevitability of false positives and false negatives, and why they, as leaders, need to get their head around that and understand that, however good the data is, however clever the technology is, things like AI will always make mistakes, and there's a role for it. So one of the key things for a senior leader is to understand their role in terms of what is acceptable, what levels of errors they are prepared to accept, for example, and clearly communicate that, and also what they can expect in terms of investment. If I can just link back to that investment case: one of the things that I find works quite well in terms of understanding the business value of data is to say, okay, what is your fundamental mission as an organisation? What is it you are trying to deliver? And based on that, what are the questions which really keep you awake at night?
What are the questions that you, if you had the answers to those questions, they would enable you to deliver whatever it is you do better, cheaper, faster, whatever. And then from that, so if those questions that are really important to you in your organisation to deliver your mission, where might the answers to those questions be hidden in the data? And that gets you very close. And then you get that very strategic link. And it's not necessarily quantified, but you can draw a direct link back to this is what we are doing as an organisation, whether that's saving lives, or you know, supporting people on benefits or you know, collecting taxes or, you know, defending the country or whatever, you know, linking it back to that. And the more you find then it's, the phrase I use is - not all data is equal. It allows you to focus on specific data sets, which enable you to do whatever it is you do.
SL: Right. And so, Olive going back to you on some of those aspects, do you find that your organisation has to educate people on how to use the information that you produce?
OP: Yes. So the long answer. Yeah. I mean, ONS obviously is a big driven organisation and I think it's obvious that, when you look at the news, you know, it's so easy to kind of misinterpret all kind of societies and statistics, so it's a big function within ONS, and you know, we've got an echo, I think. We have a team that just kind of checks for this. And also we always try to respond when it's been misquoted. But similarly I think it's about that education about how to use data, and how to make sure that you've got the robust methodology to explain or understand the caveats of working with data quality when it's not totally clear cut. ONS not long ago has been doing that One Big Thing cross-government data literacy programme, and ONS were partners, so we contributed quite a few modules for that, about raising awareness of data across government. Most civil servants have to. I say most because, I think, I already know about data. So I was about to go back and do my seven hours of One Big Thing training, but it's the, you know, the wider government and people that might not be aware so much about data ethnicity is still seven. But yeah, I think it's about providing, educating. And that's not just data in general, you know, I think it's geospatial data too, as more of the data sources we have have a geospatial element. And, you know, we've always split the two, data and geospatial data, but actually it's one big thing. We have to make sure that geospatial is represented in the data literacy programme as much as possible.
SL: I'm trying to think of a name for it. And I'm saying, like, don't call, don't say map, don't use the word map. Don't use the word map? Just use the word map, everyone knows what it is, even though it's a new thing, it's a data set rather than a bunch of lines. What's your view or your take on talking to people about data and trying to shift their mind away from what they're used to, seeing a map when they go hiking?
DH: Gosh. I wasn't around when we named OS MasterMap. There's a moment where I think the kind of liberation or liberalisation of data in society generally is really important at the moment. And I think, you know, maybe as almost a closing comment related to the last conversation: I mean, here we are as a community, a combination probably of data professionals and geospatial professionals in the room. And, I mean, now is our moment, right? You know, you look around the world at the moment, there's some pretty big challenges out there. And they're fundamentally geographical in nature. And I think we can probably get quite obsessed with talking about value in a very sort of academic sense, about quantifying value in monetary terms or efficiency terms. But, you know, there is a real why behind the work that we do, that I sometimes feel we don't tell the story very well around, which gets to, you know, what our data actually contributes in real practice. And that's probably less about the data and more about the application, more about the use case, more about the impact that it has. And I think, you know, we can perhaps lose ourselves a little bit by talking about the data itself rather than about the end game, and that in a way plays a little bit to some of the things Mark was talking about earlier on, and the always the carrot and never really the stick: you know, find where it can demonstrate a real impact. And it is about the data for us now, you know, the applications that are using it. For sure, at times, you know, the map is really important. I don't think you'll ever hear me say that a good visualisation isn't really important. And in the same way that a picture conveys a hundred words, a map will tell a story. But I think it's really important that we understand the distinction between the two. And, you know, data delivers impact.
And, you know, let's try and tell a story about the impact as much as we can.
SL: Right. Well, I'd like to thank everyone on the panel for joining, however you may have joined. A special mention also to the AV guys who have been jumping on that echo really well. So it worked. We didn't know if it would work for our teams, but it worked. So thank you to the panel.