Why a Rules Based plus a Machine Learning hybrid approach provides powerful Artificial Intelligence for Location Master Data Management
Before we look at the use of artificial intelligence (AI) to manage data, let's first touch on what we mean by AI and look at some of the techniques in use today.
In its simplest definition, AI is the computerised ability to perform tasks commonly associated with human intelligence, for example reasoning, discovering patterns, generalising knowledge and learning from experience. AI can be a misleading phrase, though, and ‘simulated intelligence’ or even ‘simulated decisions’ might be a better description.
Traditionally, rules-based or expert systems have always been considered a part of AI, although these days when people think of AI they are more likely to be referring to Machine Learning (ML). The difference between them is that in a rules-based system the rules are explicitly defined by experts, whereas in ML the rules are inferred automatically from possibly subtle patterns in data, using approaches such as neural networks or deep learning.
There are a vast number of emerging applications for AI. Examples include interpreting video feeds from drones carrying out visual inspections of infrastructure such as oil pipelines, organising personal and business calendars, responding to simple customer-service queries, coordinating with other intelligent systems to carry out tasks like booking a hotel at a suitable time and location, and generating a model of the world from satellite imagery.
With ML approaches, the outcome is highly dependent on the quality and consistency of the data: if a system is learning from examples, then those examples had better be clean, correct and unbiased. This is the first aspect where rules-based approaches complement ML: rules can be used to validate and clean the example data to ensure that it is consistent, complete and correct enough to learn from.
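As a rough illustration of this first aspect, the sketch below applies a handful of explicit rules to candidate training records and sets aside any that fail. The `SiteRecord` type, the three rules and the sample values are all hypothetical and chosen only for the example; they are not taken from any particular rules engine.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SiteRecord:
    """Hypothetical training example: a named asset with a coordinate."""
    name: Optional[str]
    lat: Optional[float]
    lon: Optional[float]


# Each rule pairs a human-readable description with a check on one record.
# These three rules are illustrative assumptions, not a standard rule set.
RULES = [
    ("name is present", lambda r: bool(r.name and r.name.strip())),
    ("latitude is within -90..90",
     lambda r: r.lat is not None and -90 <= r.lat <= 90),
    ("longitude is within -180..180",
     lambda r: r.lon is not None and -180 <= r.lon <= 180),
]


def failed_rules(record: SiteRecord) -> List[str]:
    """Return the descriptions of every rule the record breaks."""
    return [description for description, check in RULES if not check(record)]


def split_for_training(records):
    """Separate records that pass all rules from those that need attention."""
    clean, rejected = [], []
    for record in records:
        failures = failed_rules(record)
        if failures:
            rejected.append((record, failures))
        else:
            clean.append(record)
    return clean, rejected


examples = [
    SiteRecord("Pump station A", 51.5, -0.12),
    SiteRecord("", 51.5, -0.12),        # breaks the name rule
    SiteRecord("Valve 7", 123.0, 0.0),  # breaks the latitude rule
]
clean, rejected = split_for_training(examples)
```

Only the records in `clean` would be passed on for learning, while each entry in `rejected` carries the list of rules it failed, so the reason for exclusion stays traceable.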
ML processes such as neural nets are ultimately based on numbers; all the inputs and outputs of the system are sets of numbers representing words in a language, pixel colours in an image or whatever other data is being processed. With Location Master Data Management, geospatial data is a key ingredient which brings additional possibilities for correlating and matching different data, for example to identify that a record in one system refers to the same real-world object as a record in another system. Rules-based processes can be used to encode geospatial data into numbers to be used by the ML process. These numbers represent metrics such as size, shape, proximity or other geospatial interactions with the surrounding data, enabling the rules-based approach to add a second aspect to ML processes: encoding the geospatial context as an input to machine learning.
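To make the idea of encoding geospatial context as numbers more concrete, here is a minimal sketch that turns a polygon and its surroundings into a three-value feature vector: size (area), shape (a compactness score) and proximity (distance to the nearest neighbouring point). The function names and the choice of metrics are assumptions made for illustration. The sketch also assumes planar coordinates in a projected coordinate system, a simple non-degenerate polygon, and at least one neighbouring point.

```python
import math


def polygon_area(vertices):
    """Area of a simple polygon using the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(vertices):
    """Sum of the edge lengths, closing the ring back to the first vertex."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += math.hypot(x2 - x1, y2 - y1)
    return total


def encode_feature(vertices, neighbour_points):
    """Encode a polygon as [size, shape, proximity] numbers (illustrative)."""
    area = polygon_area(vertices)
    perimeter = polygon_perimeter(vertices)
    # Compactness is 1.0 for a circle and smaller for elongated shapes.
    compactness = 4.0 * math.pi * area / perimeter ** 2
    # Mean of the vertices: a simple stand-in for the true centroid.
    cx = sum(x for x, _ in vertices) / len(vertices)
    cy = sum(y for _, y in vertices) / len(vertices)
    nearest = min(math.hypot(px - cx, py - cy) for px, py in neighbour_points)
    return [area, compactness, nearest]


unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
features = encode_feature(unit_square, [(3.5, 4.5), (10.0, 10.0)])
```

A vector like `features` could then sit alongside non-spatial attributes as input to whichever ML model is being trained.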
The third aspect of integrating rules-based and machine learning techniques is high-level decision-making. Also called neuro-symbolic AI, this approach acknowledges that a combination of rules and machine learning is more powerful because it pairs the guiding logic of a person explicitly defining rules with the fuzzy inference benefits of machine learning from lots of data. For example, a self-driving car would use machine learning to interpret from its sensors that an object ahead is a pedestrian. But the decision that ‘pedestrians must be avoided’ would not be inferred from analysing footage of other drivers; that should be an explicit rule provided by a human designer.
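The pedestrian example can be sketched as a thin layer of explicit rules sitting on top of a learned classifier. In the sketch below the `Detection` objects stand in for the output of an ML model, which is not implemented here, and the labels, thresholds and action names are hypothetical values picked for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Stand-in for the output of an ML perception model (hypothetical)."""
    label: str
    confidence: float
    distance_m: float


# Explicit rule written by a human designer, not learned from data.
MUST_AVOID = {"pedestrian", "cyclist"}


def decide(detection: Detection, min_confidence: float = 0.3) -> str:
    """Combine the learned label with explicit rules to choose an action."""
    # Rule 1: anything that may be a person is always avoided, even when
    # the classifier is only moderately confident.
    if detection.label in MUST_AVOID and detection.confidence >= min_confidence:
        return "brake"
    # Rule 2: slow down for any other object that is close.
    if detection.distance_m < 5.0:
        return "slow"
    return "continue"


action_for_person = decide(Detection("pedestrian", 0.9, 30.0))
action_for_near_object = decide(Detection("plastic bag", 0.8, 3.0))
action_for_far_object = decide(Detection("plastic bag", 0.8, 30.0))
```

The split of responsibilities is the point: the model supplies the fuzzy judgement about what an object is, while the rules that govern what to do about it remain readable and auditable.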
Master Data Management is about ensuring that data within an organisation is either centralised or is at least consistent and synchronised between different systems. This is especially important when industry data standards need to be met in order to achieve external data interoperability. To achieve this, the data needs to be cleaned and matched before being merged or synchronised. These tasks are more successful if AI techniques (both rules-based and machine-learning-based) can be used.
There are many reasons why organisations choose to invest in a ‘Rules Based’ solution as part of their AI and Location Master Data Management approach.
We will look at a few of the benefits and touch on some of the design aspects of using our 1Spatial platform rules engine in this way.
Transparent and traceable
Although ‘Rules Based’ AI is a powerful method of automating data management processes, it is also one of the simplest artificial intelligence techniques for a business to adopt.
When validating data, a rule needs to answer questions in the form “Given an object, what does this rule require of the object in order for it to be valid, possibly by checking against other objects?”. When changing the data, e.g. cleaning, matching, merging or inferring, a rule needs to specify “Given an object, what needs to change with the object or other related objects?”.
This means rules can be simple and, unlike ML processes, transparent, because they tell us what constitutes a valid object or what processing was applied to an object, making it easy to trace what the rule did from its definition. 1Spatial’s platform enables rules to be created using a no-code approach, meaning they are easy to create, manage, interpret and collaborate on across teams. Almost anyone in the business can understand a rule, creating greater transparency. The no-code interface means no programming is required and there is no wait time for developers to make the changes required by the data team. You can find out more about how a ‘Rules Based’ approach is used in data management to validate and improve data in our recent blog.
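The two rule forms described above, one that validates and one that changes data, can be sketched in a few lines of code. This is purely illustrative and is not how rules are expressed in the 1Spatial platform, which uses a no-code interface. The valve and pipe scenario, the function names and the tolerance values are all hypothetical, and planar coordinates in metres are assumed.

```python
import math


def nearest(point, candidates):
    """Return the candidate closest to the given point."""
    return min(
        candidates,
        key=lambda c: math.hypot(c[0] - point[0], c[1] - point[1]),
    )


def is_valid(valve, pipe_vertices, tolerance=0.01):
    """Validation rule: a valve must lie within `tolerance` of a pipe vertex.

    Answers "what does this rule require of the object, checking against
    other objects?" without modifying anything.
    """
    target = nearest(valve, pipe_vertices)
    return math.hypot(target[0] - valve[0], target[1] - valve[1]) <= tolerance


def snap(valve, pipe_vertices, max_move=1.0):
    """Action rule: move the valve onto the nearest pipe vertex.

    Answers "what needs to change with the object?". The valve is left
    untouched if the move would exceed `max_move`, so large discrepancies
    can be reviewed by a person instead.
    """
    target = nearest(valve, pipe_vertices)
    distance = math.hypot(target[0] - valve[0], target[1] - valve[1])
    return target if distance <= max_move else valve


pipe_vertices = [(0.0, 0.0), (10.0, 0.0)]
misplaced_valve = (0.4, 0.3)  # 0.5 m from the first pipe vertex
corrected_valve = snap(misplaced_valve, pipe_vertices)
```

Because each function states exactly what it checks or changes, the outcome for any given object can be traced back to the rule definition.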
Enable geospatial machine learning
The neuro-symbolic approach of using rules alongside machine learning brings the benefits of both: explicit, traceable, transparent logic alongside fuzzy inference for aspects that are subtle and hard to encode as rules. The benefits that rules bring to machine learning, described above, make it possible to harness machine learning for geospatial data with the transparent oversight of explicit rules.
Cost effective and successful AI in MDM projects
Attempts to unlock the power of ML techniques may fail because of poor, biased or incomplete data. The use of automated rules helps to ensure a successful application of ML without requiring expensive manual data cleaning. This can greatly reduce the cost and risk of starting these machine learning projects and also ensure that the process is easily repeatable when the data evolves. It allows organisations to apply powerful matching techniques to master data management without being locked into expensive, time-consuming and non-repeatable manual data cleaning tasks.
Rules as Knowledge Management
Repeatable and traceable rules defined in a central, collaborative rules engine environment bring other benefits, because the rules become a centralised knowledge management hub where the knowledge is not hidden away in people’s heads or locked up in software code. These rules can be bespoke to an organisation but can also reflect industry standards and be shared between organisations.
In conclusion, rules are a good thing for Location Master Data Management, both to enable and support a neuro-symbolic AI approach that introduces ML for better outcomes, and as a traceable, repeatable expert-system approach to data synchronisation and data quality management.