The combined team then configured the data processing functionality within the Data Platform to deliver the best possible outcome across the legacy datasets.
The key features and benefits of the Data Platform that were important in this project were:
- Data can be integrated from any source system, so that data from core systems could be matched together, validated against the embedded PAF file, and then enriched from more peripheral systems as needed
- Data Validation rules can be configured to apply new Data Standards and highlight those records with poor data quality
- Data Cleansing rules can be configured to automatically correct, enhance or rationalise source data records as they are processed into the Data Platform
- Any number of data sources can be merged together, each based on a different set of matching criteria, to produce the best possible fit
- Where data volumes are exceptionally high, we can utilize Machine Learning models to perform Big Data processing (see note below)
- Data is matched to produce a complete picture of each property
- Our in-built data profiling functionality provides a flexible, automated method of measuring compliance in both single-record and multi-record scenarios
- The Data Platform builds up a longitudinal view of data quality, so that the immediate outcomes of data cleansing can be managed and the longer-term resilience of data quality levels monitored
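To make the rule-based validation and cleansing described above concrete, here is a minimal sketch of how configurable rules might be applied to incoming records. This is purely illustrative: the function and rule names are hypothetical, records are modelled as plain dictionaries, and the loose postcode pattern stands in for the real PAF validation the platform performs.

```python
import re

# --- Validation rules: each takes a record and returns an error string or None ---

def postcode_present(record):
    """Flag records with no postcode at all."""
    if not record.get("postcode"):
        return "missing postcode"
    return None

def postcode_format(record):
    """Flag postcodes that do not look like a UK postcode (loose, illustrative pattern)."""
    pc = record.get("postcode", "")
    if pc and not re.fullmatch(r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}", pc):
        return "malformed postcode"
    return None

# --- Cleansing rules: each takes a record and returns a corrected copy ---

def normalise_postcode(record):
    """Trim whitespace and upper-case the postcode before validation."""
    record = dict(record)
    record["postcode"] = record.get("postcode", "").strip().upper()
    return record

VALIDATION_RULES = [postcode_present, postcode_format]
CLEANSING_RULES = [normalise_postcode]

def process(records):
    """Cleanse each record, then collect any remaining validation failures."""
    cleansed, issues = [], []
    for rec in records:
        for rule in CLEANSING_RULES:
            rec = rule(rec)
        for rule in VALIDATION_RULES:
            error = rule(rec)
            if error:
                issues.append((rec["id"], error))
        cleansed.append(rec)
    return cleansed, issues
```

Run against two sample records, the sketch automatically corrects the first and highlights the second as poor quality, mirroring the cleanse-then-validate flow described above:

```python
records = [
    {"id": 1, "postcode": " sw1a 1aa "},
    {"id": 2, "postcode": ""},
]
cleansed, issues = process(records)
# record 1 is normalised; record 2 is flagged for data quality follow-up
```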