Merging Cloud-Native and Legacy Data Sources Using Spark by Rama Karmokar, Direct Line Group

Part of the cloud transition at Direct Line Group (DLG) was to build a centralised data analytics platform to cater for analytical and data science use cases. This platform needed to bring together, in a single location, more than 35 years’ worth of data from legacy systems and data originating from DLG’s new cloud-native trading platform. Doing so required ingesting and merging many kinds of data and source types, ranging from flat files, databases and multiple legacy SAS platforms to cloud-native data types and API sources. All of this data then needed to be transformed, shaped and modelled into a format that our analysts and data scientists could understand and query.