-
Engineering · Design
Unveiling the process: The creation of our powerful campaign builder
Dive into Trident, our real-time event-driven marketing tool at Grab. Explore the build of the core units powering our If This, Then That (IFTTT) logic. Learn how we deal with complex campaigns and discover the secret behind how we support various processing mechanisms. -
Engineering · Design
Chimera Sandbox: A scalable experimentation and development platform for Notebook services
Unleashing the potential of machine learning (ML) with Grab's Chimera Sandbox. This scalable platform facilitates rapid development and experimentation of ML solutions, offering deep integration with Large Language Models and a variety of compute instances. Discover how it's driving AI innovation at Grab. -
Engineering · Design
How we improved translation experience with cost efficiency
Dive into our journey of improving in-app translation experience amidst a post-COVID tourism boom. Discover how we overcame language detection hurdles, crafted an in-house translation model, and implemented stringent quality checks, all while maintaining cost efficiency. -
Engineering · Data Science
LLM-powered data classification for data entities at scale
With the advent of the Large Language Model (LLM), new possibilities dawned for metadata generation and sensitive data identification at Grab. This prompted the inception of our project, which aimed to integrate LLM classification into our existing data management service. Read to find out how we transformed what used to be a tedious and painstaking process into a highly efficient system and how it has empowered teams across the organisation. -
Engineering
Profile-guided optimisation (PGO) on Grab services
Profile-guided optimisation (PGO) is a method that tracks CPU profile data and uses that data to optimise your application builds. The AI platform team enabled this on several Grab services to discover the full benefits and caveats of using PGO. Read this article to find out more. -
Engineering
How we evaluated the business impact of marketing campaigns
Discover how Grab assesses marketing effectiveness using advanced attribution models and strategic testing to improve campaign precision and impact. -
Engineering
No version left behind: Our epic journey of GitLab upgrades
Join us as we share our experience in developing and implementing a consistent upgrade routine. This process underscored the significance of adaptability, comprehensive preparation, efficient communication, and ongoing learning. -
Data Science · Engineering · Security
Ensuring data reliability and observability in risk systems
As the amount of data Grab handles grows, there is an increased need for quick detection of data anomalies (incompleteness or inaccuracy), while keeping the data secure. Read this to learn how the Risk Data team utilised Flink and Datadog to enhance data observability within Grab’s services. -
Engineering · Data Science
Grab Experiment Decision Engine - a Unified Toolkit for Experimentation
Explore how the GrabX Decision Engine, an integral part of Grab's Experimentation platform, streamlines the testing of thousands of experimental variants weekly. This blog delves into how this internally open-sourced package institutionalises best practices in experimental efficiency and analytics, thereby ensuring accurate and reliable conclusions from each experiment. -
Engineering · Data Science
Iris - Turning observations into actionable insights for enhanced decision making
With cross-platform monitoring, a common problem is the difficulty in getting comprehensive and in-depth views on metrics, making it tough to see the big picture. Read to find out how the Data Tech team ideated Iris to turn observations into actionable insights for enhanced decision-making. -
Engineering
Android App Size at Scale with Project Bonsai
With the size of our app growing to include more features, Grab recognised it as a potential hurdle for new users with small storage capacities or restricted Internet bandwidth. Read to find out more about Project Bonsai and how it reduced app download size and app disk size. -
Engineering · Data Science
Enabling near real-time data analytics on the data lake
As the data lake landscape matures over the years, it presents opportunities to unlock more business value from the data. This correlates with the increased demand for flexible ad-hoc usage of fresh data. This article explores how we implemented data ingestion in Hudi table formats using Flink to meet this business demand. -
Engineering · Product · Design
The journey of building a comprehensive attribution platform
The Grab superapp offers a comprehensive array of services from ride-hailing and food delivery to financial services. This creates multifaceted user journeys, covering homepages, product pages, checkouts, and interactions with diverse content, including advertisements and promo codes. Read this to find out more. -
Engineering
Rethinking Stream Processing: Data Exploration
As Grab matures along the digitalisation journey, it is collecting and streaming event data generated from the end users of its superapp on a larger magnitude than before. Coban, Grab’s data-streaming platform team, is looking to help unlock the value of streaming data at an earlier stage of the data journey before this data is typically stored in a central location (“Data Lake”). This allows Grab to serve its superapp users more efficiently. -
Engineering · Data Science
Kafka on Kubernetes: Reloaded for fault tolerance
Dive into this insightful post to explore how Coban, Grab's real-time data streaming platform, has drastically enhanced the fault tolerance on its Kafka on Kubernetes design, to ensure seamless operation even amid unexpected disruptions. -
Engineering · Security
Championing CyberSecurity: Grab's bug bounty programme in 2023
Since its launch in 2015, Grab’s Bug Bounty programme has made strides in giving back to the global security community and aiding research. Read this article to find out more about our quarterly campaigns in collaboration with HackerOne and other achievements we’ve had in 2023. -
Engineering
Sliding window rate limits in distributed systems
In the field of distributed systems, there are several common challenges, such as rate limiters and fast queries in big data. In this blog post, we delve into how we address these challenges with sliding window rate limits to optimise marketing communications for our users. -
Engineering · Data Science · Product
An elegant platform
Supporting real-time data streaming enables our internal users to build intelligent applications and services, a crucial aspect of continuously out-serving our community. Read this article to understand our journey of building a real-time data streaming platform from pure Infrastructure-as-Code towards a more sophisticated control plane, and the benefits of this solution. -
Engineering · Data Science · Product
Road localisation in GrabMaps
With GrabMaps powering the Grab superapp we have the opportunity to improve our services and enhance our map with hyperlocal data. No matter the use case, road localisation plays an important role in Grab’s map-making process. However, road localisation entails handling a substantial volume of data, making it a costly and time-consuming endeavour. In this article, we explore the strategies we have implemented to drive down costs and reduce processing times associated with road localisation. -
Engineering · Security
Graph modelling guidelines
Graphs are powerful data representations that detect relationships and data linkages between devices. This is very helpful in revealing fraudulent or malicious users. Graph modelling is the key to leveraging graph capabilities. Read to find out how the GrabDefence team performs graph modelling to create graphs that can help discover potentially malicious data linkages. -
Engineering · Data Science
Scaling marketing for merchants with targeted and intelligent promos
Apart from ensuring advertisements reach the right audience, it is also important to make promos by merchants more targeted and intelligent to help scale their marketing. With Grab’s innovative AI tool, merchants can boost sales while cutting costs. Dive into this game-changing tool that’s reshaping the future of marketing and find out how the Data Science team at Grab used automation and made promo assignments a more seamless and intelligent process. -
Engineering · Data Science
Stepping up marketing for advertisers: Scalable lookalike audience
A key challenge in advertising is reaching the right audience who are most likely to use your product. Read this article to find out how the Data Science team improved advertising effectiveness by using lookalike audiences to identify individuals who share similar characteristics with an existing consumer base. -
Engineering · Data Science · Product
Building hyperlocal GrabMaps
Being hyperlocal is a key advantage for GrabMaps. In this article we will explain what being hyperlocal means and how it helps GrabMaps bring value to our driver-partners and passengers through the Grab platform. -
Engineering
Streamlining Grab's Segmentation Platform with faster creation and lower latency
Since 2019, Grab's Segmentation Platform has served as a comprehensive solution for user segmentation and audience creation across all business verticals. This article offers an insider look at the platform's design and the team's efforts to optimise segment storage, ultimately reducing read latency and unlocking new segmentation possibilities. -
Engineering · Security
Zero traffic cost for Kafka consumers
Grab's data streaming infrastructure runs in the cloud across multiple Availability Zones for high availability and resilience, but this also incurs staggering network traffic cost. In this article, we describe how enabling our Kafka consumers to fetch from the closest replica helped significantly improve the cost efficiency of our design. -
Engineering
Go module proxy at Grab
While consolidating code into a single monorepo has its benefits, there are also several challenges that come with managing a large monorepo, like slow performance and low developer productivity. Find out how Grab’s FLIP team contributes to and leverages the open-sourced Athens Go module proxy to improve developer productivity at Grab. -
Engineering · Security
PII masking for privacy-grade machine learning
Data engineers at Grab work with large sets of data to build and train advanced machine learning models to continuously improve our user experience. However, as with any data-handling company, dealing with users' data may present a potential privacy risk as it contains Personally Identifiable Information (PII). Read this article to find out more about Grab’s mature privacy protective measures and how our data streaming team uses PII tagging and masking on data streaming pipelines to protect our users. -
Engineering
Performance bottlenecks of Go application on Kubernetes with non-integer (floating) CPU allocation
At Grab, we have been running our Go-based stream processing framework (SPF) on Kubernetes for several years. But as the number of SPF pipelines increased, we noticed some performance bottlenecks and other issues. Read to find out how this issue came about and how the Coban team resolved it with non-integer CPU allocation. -
Engineering
How we improved our iOS CI infrastructure with observability tools
After upgrading to Xcode 13.1, we noticed a few issues such as instability of the CI tests and high CPU utilisation. Read to find out how the Test Automation - Mobile team investigated these issues and resolved them by integrating observability tools into our iOS CI development process. -
Engineering
2.3x faster using the Go plugin to replace Lua virtual machine
The Talaria open-source project has made significant improvements by replacing Lua VM with the Go plugin resulting in 2.3x faster performance and memory usage reduction. Talaria is a time-series database designed for Big Data systems used to process millions of transactions and connections daily at Grab, requiring scalable data-driven decision-making. -
Engineering
Safer deployment of streaming applications
As Flink becomes more popular with real-time stream applications, we realise that Flink deployments are sometimes stressful and prone to errors. The Coban team deep dives into the issues with our existing Flink deployment process, possible mitigations, and the eventual solution to ensure safer deployments of Flink streaming applications. -
Engineering · Design
Message Center - Redesigning the messaging experience on the Grab superapp
Grab’s messaging feature was designed for two-party communications, but as our superapp grew to include more features, we became more aware of the limitations in our app. Read to find out how we redesigned the messaging experience to make it more extensible and future-proof. -
Engineering · Design
Evolution of quality at Grab
Testing is typically done after development is complete, which often results in bugs being discovered late in the process. Read to find out how Grab has improved its quality to scale and support the superapp experience. This evolution also brings a cultural shift for quality mindset in teams, enabling us to deliver faster with a better experience for our users. -
Engineering · Design
How OVO determined the right technology stack for their web-based projects
As companies grow in today's technology landscape, it often leads to a diverse set of technology stacks being used in different teams, which can lead to bigger problems in the future. Find out how the OVO team compared and analysed different technologies to find the one that best met their needs. -
Engineering · Security
Migrating from Role to Attribute-based Access Control
To ensure our consumers continue to be well-protected, we need to ensure our data access measures are compliant with evolving security standards. With more services and resources to manage, it becomes increasingly difficult to maintain a frictionless process. Read to find out how we solve this by migrating from role to attribute-based access control. -
Engineering
Securing GitOps pipelines
This article illustrates how Grab’s real-time data platform team secured GitOps pipelines at scale with our in-house GitOps implementation. -
Engineering · Product
New zoom freezing feature for Geohash plugin
Built by Grab, the Geohash Java OpenStreetMap Editor (JOSM) plugin is widely used in map-making, but a common pain point is the inability to zoom in to a specific region without displaying new geohashes. Read to find out more about the issue and how the latest update addresses it. -
Engineering · Security · Data Science
Graph service platform
Graphs are powerful data representations that detect relationships and data linkages between devices and help reveal fraudulent or malicious users. Learn how GrabDefence built the graph service platform to help discover potentially malicious data linkages. -
Engineering · Security
Zero trust with Kafka
In addition to ensuring the high performance and availability of our services, security continues to be one of our highest priorities. Read this article to find out how the Coban team enhances security by moving from network-based access control to zero trust with Kafka. -
Engineering · Product · Design
How KartaCam powers GrabMaps
The foundation for making maps lies in imagery and ensuring that it is fresh, high quality, and collected in an efficient yet low-cost manner. Read this to find out how the Geo team created KartaCam, how it addresses those concerns, and its future enhancements. -
Engineering · Data Science · Security
Graph for fraud detection
Fraud detection has become increasingly important in a fast growing business as new fraud patterns arise when a business product is introduced. We need a sustainable framework to combat different types of fraud and prevent fraud from happening. Read and find out how we use graph-based models to protect our business from various known and unknown fraud risks. -
Engineering · Data Science
Query expansion based on user behaviour
User behaviour data is a gold mine to gain insights about users and help us improve user experience. In this blog, we explore a query expansion framework based on user rewrite behaviour and how it improves user search experience and conversion. -
Engineering · Data Science · Security
Using mobile sensor data to encourage safer driving
Telematics is most commonly used to monitor vehicle movements and track driving safety, profiling, fleet optimisation and possible productivity improvements. Read this to find out more about how Grab uses telematics to encourage safer driving across our driver and delivery partner fleet. -
Engineering · Data Science
Automatic rule backtesting with large quantities of data
At Grab, real-time fraud detection is built on a rule engine. As data scientists and analysts, we need to analyse and simulate a rule on historical data to check the performance and accuracy of the rule. Backtesting, also known as Replay, enables analysts to either run simulations of newly invented rules or evaluate the performance of existing rules using past events ranging from days to months, significantly improving rule creation efficiency. -
Engineering · Data Science
How we store and process millions of orders daily
The Grab Order Platform is a distributed system that processes millions of GrabFood or GrabMart orders every day. Learn about how the Grab order platform stores food order data to serve transactional (OLTP) and analytical (OLAP) queries. -
Engineering
How we automated FAQ responses at Grab
Most frequently asked questions (FAQ) are repetitive, which hinders on-call engineers' productivity. Read to find out how we automated FAQ responses at Grab, allowing engineers to focus on operational tasks. -
Engineering · Security · Data Science
Graph Networks - 10X investigation with Graph Visualisations
As fraud schemes get more complex, we need to stay one step ahead by improving fraud investigation methods. Read to find out more about graph visualisation, why we need it and how it helps with uncovering patterns and relationships. -
Engineering · Security · Data Science
How facial recognition technology keeps you safe
Facial recognition technology has grown tremendously in recent years due to the rise of deep learning techniques and accelerated digital transformation. Read to find out more about facial recognition technology in Grab and the components that help keep you safe. -
Engineering · Security
Graph concepts and applications
Graph theory-based approaches show the concepts underlying the behaviour of massively complex systems and networks. Read to find out how graphs came about, where they can be used and the part they play in graph technology. -
Engineering · Data Science
Automated Experiment Analysis - Making experimental analysis scalable
Analysts and data scientists invest lots of time into creating trustworthy experiments, which are key to making sound decisions. Read to find out how Automated Experiment Analysis helps make experimental analysis more scalable. -
Engineering
Search architecture revamp
Grab’s search architecture was initially designed to only support exact text matching based on user queries. Find out what problems the Deliveries search team faced and how they improved the search architecture to address these issues. -
Engineering
Embracing a Docs-as-Code approach
Read to find out how Grab is using the Docs-as-Code approach to improve technical documentation. -
Engineering · Security
Graph Networks - Striking fraud syndicates in the dark
As fraudulent entities evolve and get smarter, Grab needs to continuously enhance our defences to protect our consumers. Read to find out how Graph Networks help the Integrity team advance fraud detection at Grab. -
Engineering
How we reduced our CI YAML files from 1800 lines to 50 lines
GitLab and its tooling are an integral part of the machine learning platform team's stack for continuous delivery of machine learning. One of our core products is MerLin Pipelines. We were reaching certain limitations of GitLab for large repositories by way of includes and nested gitlab-ci YAML files. -
Engineering
How Kafka Connect helps move data seamlessly
Grab’s real-time data platform team (Coban) covers the importance of moving data in and out of Kafka easily and how Kafka Connect helps with that. -
Engineering
Supporting large campaigns at scale
Running batch jobs targeting a large user base is a challenge. Find out how we designed our system to tackle the challenge at scale. -
Engineering · Data Science
How telematics helps Grab to improve safety
Coupled with data science, telematics can help to detect traffic events such as harsh braking and unsafe lane changes so we can provide a safer experience for our users. Read on to find out more about the challenges faced and how we addressed them with telematics. -
Engineering · Data Science
Real-time data ingestion in Grab
When it comes to data ingestion, there are several prevailing issues that come to mind: data inconsistency, integrity and maintenance. Find out how the Caspian team leveraged real-time data ingestion to help address these pain points. -
Engineering
Abacus - Issuing points for multiple sources
Learn about the challenges of points rewarding and how GrabRewards Points are rewarded for different Grab offerings. -
Engineering
Exposing a Kafka Cluster via a VPC Endpoint Service
Establishing communications between cloud resources that are hosted on different Virtual Private Clouds (VPC) can be complex and costly. Find out how the Coban team used a VPC Endpoint Service to expose an Apache Kafka cluster across multiple Availability Zones to a different VPC. -
Engineering
Search indexing optimisation
Learn about the different optimisation techniques when building a search index. -
Engineering
Automating Multi-Armed Bandit testing during feature rollout
Find out how you can run an automated test and simultaneously roll out a new feature. -
Engineering
Protecting Personal Data in Grab's Imagery
Learn how Grab improves privacy protection to cater to various geographical locations. -
Engineering
Processing ETL tasks with Ratchet
Read about what Data and ETL pipelines are and how they are used for processing multiple tasks in the Lending Team at Grab. -
Engineering
App Modularisation at Scale
Read on to find out how we improved our app’s build time performance and developer experience at Grab. -
Engineering
Debugging High Latency Due to Context Leaks
Learn how the Marketplace Tech Family debugged and resolved Market-Store's high latency issues. -
Engineering
Building a Hyper Self-Service, Distributed Tracing and Feedback System for Rule & Machine Learning (ML) Predictions
Find out how the Trust, Identity, Safety, and Security (TISS) team improved machine learning predictions with Archivist, an in-house built solution. -
Engineering
Our Journey to Continuous Delivery at Grab (Part 2)
Read our long-awaited piece on the automation work we have done through integration and hermeticity. -
Engineering
How We Improved Agent Chat Efficiency with Machine Learning
Read to find out how Customer Support Experience's Phoenix live chat team improved agent chat efficiency with machine learning. -
Engineering
How Grab Leveraged Performance Marketing Automation to Improve Conversion Rates by 30%
Read to find out how Grab's Performance Marketing team leveraged automation to improve conversion rates. -
Engineering
One Small Step Closer to Containerising Service Binaries
Learn how Grab is investigating and reducing service binary size for Golang projects. -
Engineering
Customer Support Workforce Routing
Read how we built our in-house workforce routing system at Grab. -
Engineering
Serving Driver-partners Data at Scale Using Mirror Cache
Find out how a team at Grab used Mirror Cache, an in-memory local caching solution, to serve driver-partners data efficiently. -
Engineering
Trident - Real-time Event Processing at Scale
Find out where the messages and rewards that arrive on your Grab app come from. Walk through scaling and processing optimisations that achieve tremendous throughput. -
Engineering
Pharos - Searching Nearby Drivers on Road Network at Scale
Learn how Grab stores driver locations and how these locations are used to find nearby drivers around you. -
Engineering
How Grab is Blazing Through the Superapp Bazel Migration
Learn how we planned and started migrating our superapp to Bazel at Grab. -
Engineering
Democratising Fare Storage at Scale Using Event Sourcing
Read how we built Grab's single source of truth for fare storage and management. In this post, we explain how we used the Event Sourcing pattern to build our fare data store. -
Engineering
Keeping 170 Libraries Up to Date on a Large Scale Android App
Learn how we maintain our libraries and prevent defect leaks in our Grab Passenger app. -
Engineering
Optimally Scaling Kafka Consumer Applications
Read this deep dive on our Kubernetes infrastructure setup for Grab's stream processing framework. -
Engineering
Our Journey to Continuous Delivery at Grab (Part 1)
Continuous Delivery is the principle of delivering software often, every day. Read more to find out how we implemented continuous delivery at Grab. -
Engineering
Uncovering the Truth Behind Lua and Redis Data Consistency
Redis does not guarantee the consistency between master and its replica nodes when Lua scripts are used. Read more to find out why and how to guarantee data consistency. -
Engineering · Data Science
Securing and Managing Multi-cloud Presto Clusters with Grab’s DataGateway
This blog post discusses how Grab's DataGateway plays a key role in supporting hundreds of users in our entire Presto ecosystem, from managing user access to cluster selection, workload distribution, and more. -
Engineering
Go Modules - A Guide for monorepos (Part 2)
This is the second post in the Go module series, which highlights Grab’s experience working with Go modules in a multi-module monorepo. Here, we discuss additional solutions for addressing dependency issues and cover automatic upgrades. -
Engineering · Data Science
The Journey of Deploying Apache Airflow at Grab
This blog post shares how we designed and implemented an Apache Airflow-based scheduling and orchestration platform for teams across Grab. -
Engineering
How We Built Our In-house Chat Platform for the Web
This blog post shares our learnings from building our very own chat platform for the web. -
Engineering
Go Modules - A Guide for monorepos (Part 1)
This post is the first in a series of blogs about Grab’s experience with Go modules in a multi-module monorepo. Here, we discuss the challenges we faced along the way and the solutions we came up with. -
Engineering
Tackling UI Test Execution Time Imbalance for Xcode Parallel Testing
This blog post introduces how we use Xcode parallel testing to balance test execution time and improve the parallelism of our systems. We also share how we overcame a challenge that prevented us from running the tests efficiently. -
Engineering
Returning 575 Terabytes of Storage Space to Our Users
This blog explains how we measured and reduced our app's storage footprint on user devices. -
Engineering
Grab-Posisi - Southeast Asia’s First Comprehensive GPS Trajectory Dataset
This blog highlights Grab's latest GPS trajectory dataset: its content, format, applications, and how you can access the dataset for your research purposes. -
Engineering
How We Prevented App Performance Degradation from Sudden Ride Demand Spikes
This blog addresses how engineers overcame the challenges Grab faced during its initial days due to sudden spikes in ride demand. -
Engineering
Plumbing At Scale
This article details our journey building and deploying an event sourcing platform in Go, building a stream processing framework over it, and then scaling it (reliably and efficiently) to service over 300 billion events a week. -
Engineering
Journey to a Faster Everyday Superapp Where Every Millisecond Counts
This post narrates the journey of our performance improvement efforts on the Grab passenger app. It highlights how we were able to reduce the time spent starting the app by more than 60%, while preventing regressions introduced by new features. -
Engineering
Marionette - Enabling E2E User-scenario Simulation
Do you know how we get early feedback on any breaking changes? Read through our blog to find out how Marionette, an in-house simulation platform, detects breaking changes in booking workflows. It even generates resources for running simulations and facilitates the testing of microservices powering our driver and passenger apps. -
Engineering
How We Implemented Domain-Driven Development in Golang
Are you curious how we quickly enabled our partners to self-service using our platform? Have you wondered how some teams at Grab implemented domain-driven development while using Golang? Read this blog post to know more. -
Engineering
Griffin, an Anti-fraud Risk Rule Engine Making Billions of Predictions Daily
This blog highlights Grab’s high-performance risk rule engine that automates the creation of rules to detect fraudulent activities with minimal effort from engineers. -
Engineering
Using Grab’s Trust Counter Service to Detect Fraud Successfully
This blog introduces Grab’s Trust Counter service for detecting fraud. It explains how the solution was designed so that different stakeholders like data analysts and data scientists can use the Counter service without any manual intervention from engineers. The Counter service provides a reliable data feed to the data science world. -
Engineering
Being a Principal Engineer at Grab
Curious about what a Principal Engineer role at Grab entails? Our Principal Engineers' responsibilities include solving complex problems, taking care of the system-level architecture, collaborating with cross-functional teams, providing mentorship, and more. -
Data Science · Engineering
Data First, SLA Always
Introducing Trailblazer, the Data Engineering team’s solution for implementing change data capture of all upstream databases. In this article, we explain why we needed to move away from periodic batch ingestion towards a real-time solution and show how we achieved this through an end-to-end streaming pipeline. -
Engineering
How We Built a Logging Stack at Grab
This blog post explains what we did to solve our in-house logging problem around the lack of visualizations and metrics for our service logs. -
Engineering · Data Science
Catwalk: Serving Machine Learning Models at Scale
This blog post explains why and how we came up with a machine learning model serving platform to accelerate the use of machine learning in Grab. -
Engineering
React Native in GrabPay
This blog post describes how we used React Native to optimize the Grab PAX app. -
Engineering
Preventing Pipeline Calls from Crashing Redis Clusters
This blog post describes Grab’s post-mortem findings for the outage caused by the Redis Cluster failure. -
Engineering
Loki, a Dynamic Mock Server for HTTP/TCP Testing
Read our blog to know how Loki, a dynamic mock server, makes local box testing of mobile apps easy, repeatable, and exhaustive. It supports both HTTP and TCP protocols and can provide dynamic runtime responses. -
Engineering
Designing Resilient Systems Beyond Retries (Part 3): Architecture Patterns and Chaos Engineering
This post is the third of a three-part series on going beyond retries and circuit breakers to improve system resiliency. The series covers techniques and architectures that can be used as part of a strategy to improve resiliency. In this article, we focus on architecture patterns and chaos engineering to reduce and prevent failures and to test resiliency. -
Engineering
Designing Resilient Systems Beyond Retries (Part 2): Bulkheading, Load Balancing, and Fallbacks
This post is the second of a three-part series on going beyond retries to improve system resiliency. We previously discussed rate-limiting as a strategy to improve resiliency. In this article, we will cover these techniques: bulkheading, load balancing, and fallbacks. -
Engineering
Designing Resilient Systems Beyond Retries (Part 1): Rate-Limiting
This post is the first of a three-part series on going beyond retries to improve system resiliency. In this series, we will discuss other techniques and architectures that can be used as part of a strategy to improve resiliency. To start off the series, we will cover rate-limiting. -
Engineering
Context Deadlines and How to Set Them
This blog post explains, from the ground up, a strategy for configuring timeouts and using context deadlines correctly, drawing from our experience developing microservices in a large-scale and often turbulent network environment. -
Data Science · Engineering · Product · Design
Recipe for Building a Widget: How We Helped to “Peak-Shift” Demand by Helping Passengers Understand Travel Trends
We help to “peak-shift” demand by helping passengers understand travel trends with Grab’s data. Curious to know how we empower our passengers to make better travel decisions? Read on! -
Engineering
Structured Logging: The Best Friend You’ll Want When Things Go Wrong
This blog post describes how we built a structured logging framework that integrates well with our existing Elastic stack-based logging backend, allowing us to do logging better and more efficiently. -
Engineering
How We Simplified Our Data Ingestion & Transformation Process
This blog post describes how Grab built a scalable data ingestion system and how we went from prototyping with Spark Streaming to running a production-grade data processing cluster written in Golang. -
Engineering
A Lean and Scalable Data Pipeline to Capture Large Scale Events and Support Experimentation Platform
This blog post focuses on the lessons we learned while building our batch data pipeline. -
Engineering
Designing Resilient Systems: Circuit Breakers or Retries? (Part 2)
Grab designs fault-tolerant systems that can withstand failures, allowing us to continuously provide our consumers with the many services they expect from us. -
Engineering
Querying Big Data in Real-time with Presto & Grab's TalariaDB
In this article, we focus on TalariaDB, a distributed, highly available, and low latency time-series database that stores real-time data, such as logs, metrics, and click streams generated by mobile apps and backend services that use Grab's Experimentation Platform SDK. It "stalks" the real-time data feed and only keeps the last one hour of data. -
Engineering
Designing Resilient Systems: Circuit Breakers or Retries? (Part 1)
Grab designs fault-tolerant systems that can withstand failures, allowing us to continuously provide our consumers with the many services they expect from us. -
Engineering
Orchestrating Chaos Using Grab's Experimentation Platform
At Grab, we practice chaos engineering by intentionally introducing failures in a service or component in the overall business flow. But the ‘failed’ service is not the experiment’s focus. We’re interested in testing the services dependent on that failed service. -
Engineering
Reliable and Scalable Feature Toggles and A/B Testing SDK at Grab
Grab’s feature toggle SDK provides a dynamic feature toggle capability to our engineering, data, product, and even business teams. Feature toggles also let teams modify system behaviour without changing code. Developers use the feature flags to keep new features hidden until product and marketing teams are ready to share and to run experiments (A/B tests) by dynamically changing feature toggles for specific users, rides, etc. -
Engineering
Mockers - Overcoming Testing Challenges at Grab
Sustaining quality in fast-paced development is a challenge. At Grab, we use Mockers - a tool to expand the scope of local box testing. It helps us overcome testing challenges in a microservice architecture. -
Engineering
How We Designed the Quotas Microservice to Prevent Resource Abuse
Reliable, scalable, and high-performing solutions for common system-level issues are essential for microservice success, and there is a Grab-wide initiative to provide those common solutions. As an important component of the initiative, we wrote a microservice called Quotas, a highly scalable API request rate limiting solution to mitigate the problems of service abuse and cascading service failures. -
Engineering
Building Grab’s Experimentation Platform
At Grab, we continuously strive to improve the user experience of our app for both our passengers and driver-partners. To do that, we’re constantly experimenting, and in fact, many of the improvements we roll out to the Grab app are a direct result of successful experiments. -
Engineering
Introducing Grab-Kit: Distributed Service Design at Grab
As we evolved from a single monolithic application to a microservices-based architecture, we were faced with a new challenge. How do we support exponential growth while maintaining consistency, coordination, and quality? -
Engineering
Deep Dive into Database Timeouts in Rails
Disaster strikes when you do not configure timeout values properly. In this post, we dive into the details of how timeouts work with Ruby on Rails and Databases. -
Engineering
Dealing with the Meltdown Patch at Grab
The Meltdown attack reported recently had far-reaching implications in terms of security as well as performance. This post is a quick rundown of what performance impacts we noted as well as how we went on to mitigate them. -
Engineering
The Art of Hiring Good Engineers
Hiring the first five good engineers in your team requires a different approach to hiring the first twenty good engineers. The approach to designing this process will differ even more when you want to scale up to 100 engineers... or even to 300. -
Engineering
Migrating Existing Datastores
At Grab, we take pride in creating solutions that impact millions of people in Southeast Asia and, as they say, with great power comes great responsibility. As an app with 55 million downloads and 1.2 million drivers, it's our responsibility to keep our systems up and running. Any downtime causes drivers to miss earnings and passengers to miss their appointments. -
Engineering
So You Need to Hire Good Engineers
If you are in a fast growing tech startup, you're probably actively interviewing and hiring engineers to scale teams. My question to you is, what hiring strategy are you using when interviewing engineering warriors? -
Engineering
Come and #hackallthethings at Grab
For the longest time, security has been at the center of our priorities. There’s nothing more self-evident about the trust our millions of driving partners and consumers put in Grab. We strive every day to build the best tools available to ensure their data stays secure. -
Engineering
How We Scaled Our Cache and Got a Good Night's Sleep
Caching is arguably the most important and widely used technique in the computer industry. From CPUs to Facebook live videos, caches are everywhere. -
Engineering
Grab's Front End Study Guide
Grab is Southeast Asia (SEA)’s leading transportation platform and our mission is to drive SEA forward, leveraging the latest technology and the talented people we have in the company. As of May 2017, we handle 2.3 million rides daily and we are growing and hiring at a rapid scale. To keep up with Grab’s phenomenal growth, our web team and web platforms have to grow as well. Fortunately, or unfortunately, at Grab, the web team has been keeping up with the latest best practices and has incorporated the modern JavaScript ecosystem in our web apps. -
Engineering
DNS Resolution in Go and Cgo
This article is part two of a two-part series. In this article, we will talk about RFC 6724 (3484), how DNS resolution works in Go and Cgo, and finally explain why disabling IPv6 also disables the sorting of IP addresses. -
Engineering
Driving Southeast Asia Forward with AWS
My name is Arul Kumaravel, VP of Engineering at Grab. Grab's mission is to drive Southeast Asia (SEA) forward. Today I would like to share with you how AWS is helping us with this mission. -
Engineering
Troubleshooting Unusual AWS ELB 5XX Error
This article is part one of a two-part series. In this article we explain the ELB 5XX errors which we experience without an apparent reason. We walk you through our investigative process and show you our immediate solution to this production issue. In the second article, we will explain why the non-intuitive immediate solution works and how we eventually found a more permanent solution. -
Engineering
Scaling Like a Boss with Presto
A year ago, the data volumes at Grab were much lower than the volume we currently use for data-driven analytics. We had a simple and robust infrastructure in place to gather, process and store data to be consumed by numerous downstream applications, while supporting the requirements for data science and analytics. -
Engineering
Deep Dive into iOS Automation at Grab - Continuous Delivery
This is the second part of our series "Deep Dive into iOS Automation at Grab", where we will cover how we manage continuous delivery. As a common solution to Apple developer account device whitelist limitation, we use an enterprise account to distribute beta apps internally. There are 4 build configurations per target. -
Engineering
Deep Dive into iOS Automation at Grab - Integration Testing
This is the first part of our series "Deep Dive Into iOS Automation At Grab", where we will cover testing automation in the iOS team. Over the past two years at Grab, the iOS passenger app team has grown from 3 engineers in Singapore to 20 globally. Back then, each one of us was busy shipping features and had no time to set up a proper automation process. -
Engineering
A Key Expired in Redis, You Won't Believe What Happened Next
One of Grab's more popular caching solutions is Redis (often in the flavour of the misleadingly named ElastiCache), and for most cases, it works. Except for that time it didn't. Follow our story as we investigate how Redis deals with consistency on key expiration. -
Engineering
How Grab Hires Engineers in Singapore
Working at Grab will be the “most challenging yet rewarding opportunity” any employee will ever encounter. -
Engineering
Battling with Tech Giants for the World's Best Talent
Grab steadily attracts a diverse set of engineers from around the world in its three R&D centres: Singapore, Seattle, and Beijing. Right now, half of Grab’s top leadership team is made up of women and we have attracted people from five continents to work together on solving the biggest challenges for Southeast Asia. -
Engineering
This Rocket Ain't Stopping - Achieving Zero Downtime for Rails to Golang API Migration
Grab has been transitioning from a Rails + NodeJS stack to a full Golang Service Oriented Architecture. To contribute to a single common code base, we wanted to transfer engineers working on the Rails server powering our passenger app APIs to other Go teams. -
Engineering
Round-robin in Distributed Systems
While working on Grab's Common Data Service (CDS), I needed to implement client-side load balancing between CDS clients and servers. However, I kept encountering persistent connection issues with Elastic Load Balancing (ELB). -
Engineering
Programmers Beware - UX is Not Just for Designers
Perhaps one of the biggest missed opportunities in Tech in recent history is UX. Somehow, UX became the domain of Product Designers and User Interface Designers. While they definitely are the right people to be thinking about web pages, mobile app screens and so on, we've missed a huge part of what we engineers work on every day: SDKs and APIs. -
Engineering
Grab You Some Post-Mortem Reports
Grab adopts a Service-Oriented Architecture (SOA) to rapidly develop and deploy new feature services. One of the drawbacks of such a design is that team members find it hard to help with debugging production issues that inevitably arise in services belonging to other stakeholders. -
Engineering
The Curious Case of the Phantom Instance
Here at the Grab Engineering team, we have built our entire backend stack on top of Amazon Web Services (AWS). Over time, it was inevitable that some habits started to form when perceiving our backend monitoring statistics.