-
Engineering · Data Analytics
Improving Hugo stability and addressing oncall challenges through automation
Managing 4,000+ data pipelines demanded a smarter approach to stability. We built a comprehensive automation solution that enhances Hugo's monitoring capabilities, streamlines issue diagnosis, and significantly reduces on-call workload. Explore our architecture, implementation, and the impact of automated healing features.
-
Engineering · Data Analytics
Building a Spark observability product with StarRocks: Real-time and historical performance analysis
Discover how Grab revolutionised its Spark observability with StarRocks! We transformed our monitoring capabilities by moving from a fragmented system to a unified, high-performance platform. Learn about our journey from the initial Iris tool to a robust solution that tackles limitations with real-time and historical data analysis, all powered by StarRocks. Explore the architecture, data model, and advanced analytics that enable us to provide deeper insights and recommendations for optimising Spark jobs at Grab.