A world-class, US-based transportation solutions provider specializes in national and regional trucking, offering custom LTL (less-than-truckload), long-haul trucking, intermodal, dedicated, and bulk shipping services. Sparity helped them build a secure, resilient, and flexible data lake solution to store, analyze, and report on their large and growing volume of data.

Client: Logistics | Services: Data Lake Solution | Year: 2022

Key Challenges

  • As data volumes grew, it became increasingly difficult for the logistics client to work with data held in multiple silos
  • They wanted to unify the data coming in from core administrative applications such as cloud CRM systems, master data systems, on-prem warehouse management systems, and customer engagement portals
  • Because most of the data from business applications and other internal and external sources, such as websites, IoT devices, social media, and mobile apps, is semi-structured or unstructured, they were looking for a modern approach to loading it into a data lake that met their current requirements
  • They wanted to generate reports and view dashboards on real-time data to ensure last-mile delivery commitments are met, for a better customer experience

Technologies

Azure Data Lake
Azure Synapse Analytics
Azure Databricks
Microsoft Power BI

Solution

  • Helped them implement an enterprise-wide data management strategy by deploying a cloud data lake that pulls and loads as-is data (CSV and Excel files) from different systems into a single layer
  • Used ELT jobs to convert the data extracts from multiple systems and load them into the data lake
  • Built data pipelines and scheduled data transfers using Azure Data Factory to move on-prem data from MySQL and Oracle to Azure Data Lake
  • Leveraged Azure Databricks for data transformations and loaded data from the data lake into Azure Synapse Analytics using Azure Data Factory
  • Leveraged the Microsoft Power BI service for dashboarding and reporting
  • Scheduled daily data refreshes and automated notifications whenever an error occurred
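
As a rough illustration of the "load as-is, refresh daily, notify on error" pattern described above, here is a minimal Python sketch. It is not the client's implementation: the actual solution used Azure Data Factory pipelines, whereas this sketch uses a local folder as a stand-in for the data lake. The function names (land_raw_file, run_daily_refresh), the raw/<system>/<yyyy>/<mm>/<dd> folder layout, and the notify callback are all hypothetical and chosen only for the example.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Callable, Dict, List


def land_raw_file(src: Path, lake_root: Path, system: str, run_date: date) -> Path:
    """Copy one extract unchanged into raw/<system>/<yyyy>/<mm>/<dd>/ (assumed layout)."""
    target_dir = (
        lake_root / "raw" / system
        / f"{run_date:%Y}" / f"{run_date:%m}" / f"{run_date:%d}"
    )
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / src.name
    shutil.copy2(src, target)  # as-is copy: no parsing or conversion at load time
    return target


def run_daily_refresh(
    sources: Dict[str, Path],
    lake_root: Path,
    run_date: date,
    notify: Callable[[str], None],
) -> List[Path]:
    """Land every source extract; report failures via notify and keep going."""
    landed: List[Path] = []
    for system, extract in sources.items():
        try:
            landed.append(land_raw_file(extract, lake_root, system, run_date))
        except OSError as exc:
            # Hypothetical hook: in practice this could send an email or chat alert.
            notify(f"{system}: failed to land {extract.name}: {exc}")
    return landed
```

Keeping the load step a plain copy reflects the ELT ordering: transformations happen later, inside the lake, so a failed or changed transformation never blocks ingestion of the raw extracts.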

Benefits

  • Data scientists and engineers can now quickly construct data models for analytics applications
  • Seamless data flow across components in the data pipeline reduced latency across the entire cycle
  • End-to-end supply chain visibility through a centralized data repository, enabling better decision making