-
Engineering · Data Science
LLM-assisted vector similarity search
Vector similarity search has revolutionised data retrieval, particularly in the context of Retrieval-Augmented Generation in conjunction with advanced Large Language Models (LLMs). However, it sometimes falls short when dealing with complex or nuanced queries. In this post, we explore our experimentation with a simple yet effective approach to mitigate this shortcoming by combining the efficiency of vector similarity search with the contextual understanding of LLMs. -
Engineering · Analytics · Data Science
Leveraging RAG-powered LLMs for Analytical Tasks
The emergence of Retrieval-Augmented Generation (RAG) has significantly revolutionised Large Language Models (LLMs), propelling them to unprecedented heights. This development prompts us to consider its integration into the field of Analytics. Explore how Grab harnesses this technology to optimise our analytics processes. -
Engineering · Data Science
Evolution of Catwalk: Model serving platform at Grab
Read about the evolution of Catwalk, Grab's model serving platform, from its inception to its current state. Discover how it has evolved to meet the needs of Grab's growing machine learning model serving requirements. -
Engineering · Data Science
LLM-powered data classification for data entities at scale
With the advent of the Large Language Model (LLM), new possibilities dawned for metadata generation and sensitive data identification at Grab. This prompted the inception of our project to integrate LLM classification into our existing data management service. Read on to find out how we transformed a tedious and painstaking process into a highly efficient system, and how it has empowered teams across the organisation. -
Data Science · Engineering · Security
Ensuring data reliability and observability in risk systems
As the amount of data Grab handles grows, there is an increased need to detect data anomalies (incompleteness or inaccuracy) quickly while keeping the data secure. Read this to learn how the Risk Data team utilised Flink and Datadog to enhance data observability within Grab’s services. -
Engineering · Data Science
Grab Experiment Decision Engine - a Unified Toolkit for Experimentation
Explore how the GrabX Decision Engine, an integral part of Grab's Experimentation platform, streamlines the testing of thousands of experimental variants weekly. This blog delves into how this internally open-sourced package institutionalises best practices in experimental efficiency and analytics, thereby ensuring accurate and reliable conclusions from each experiment. -
Engineering · Data Science
Iris - Turning observations into actionable insights for enhanced decision making
With cross-platform monitoring, a common problem is the difficulty in getting comprehensive and in-depth views on metrics, making it tough to see the big picture. Read to find out how the Data Tech team ideated Iris to turn observations into actionable insights for enhanced decision-making. -
Engineering · Data Science
Enabling near real-time data analytics on the data lake
As the data lake landscape matures over the years, it presents opportunities to unlock more business value from the data. This correlates with the increased demand for flexible ad-hoc usage of fresh data. This article explores how we implemented data ingestion in Hudi table formats using Flink to meet this business demand. -
Engineering · Data Science
Kafka on Kubernetes: Reloaded for fault tolerance
Dive into this insightful post to explore how Coban, Grab's real-time data streaming platform, has drastically enhanced the fault tolerance on its Kafka on Kubernetes design, to ensure seamless operation even amid unexpected disruptions. -
Engineering · Data Science · Product
An elegant platform
Supporting real-time data streaming enables our internal users to build intelligent applications and services, a crucial aspect of continuously out-serving our community. Read this article to understand our journey of building a real-time data streaming platform from pure Infrastructure-as-Code towards a more sophisticated control plane, and the benefits of this solution. -
Engineering · Data Science · Product
Road localisation in GrabMaps
With GrabMaps powering the Grab superapp, we have the opportunity to improve our services and enhance our map with hyperlocal data. No matter the use case, road localisation plays an important role in Grab’s map-making process. However, road localisation entails handling a substantial volume of data, making it a costly and time-consuming endeavour. In this article, we explore the strategies we have implemented to drive down costs and reduce processing times associated with road localisation. -
Engineering · Data Science
Scaling marketing for merchants with targeted and intelligent promos
Apart from ensuring advertisements reach the right audience, it is also important to make promos by merchants more targeted and intelligent to help scale their marketing. With Grab’s innovative AI tool, merchants can boost sales while cutting costs. Dive into this game-changing tool that’s reshaping the future of marketing and find out how the Data Science team at Grab used automation and made promo assignments a more seamless and intelligent process. -
Engineering · Data Science
Stepping up marketing for advertisers: Scalable lookalike audience
A key challenge in advertising is reaching the right audience who are most likely to use your product. Read this article to find out how the Data Science team improved advertising effectiveness by using lookalike audiences to identify individuals who share similar characteristics with an existing consumer base. -
Engineering · Data Science · Product
Building hyperlocal GrabMaps
Being hyperlocal is a key advantage for GrabMaps. In this article we will explain what being hyperlocal means and how it helps GrabMaps bring value to our driver-partners and passengers through the Grab platform. -
Data Science · Security
Unsupervised graph anomaly detection - Catching new fraudulent behaviours
As fraudsters continue to evolve, it becomes more challenging to automatically detect new fraudulent behaviours. At Grab, we are committed to continuously improving our security measures and ensuring our users are protected from fraudsters. Find out how Grab’s Data Science team designed a machine learning model that has the ability to discover new fraud patterns without the need for label supervision. -
Engineering · Security · Data Science
Graph service platform
Graphs are powerful data representations that detect relationships and data linkages between devices and help reveal fraudulent or malicious users. Learn how GrabDefence built the graph service platform to help discover potentially malicious data linkages. -
Engineering · Data Science · Security
Graph for fraud detection
Fraud detection has become increasingly important in a fast growing business as new fraud patterns arise when a business product is introduced. We need a sustainable framework to combat different types of fraud and prevent fraud from happening. Read and find out how we use graph-based models to protect our business from various known and unknown fraud risks. -
Engineering · Data Science
Query expansion based on user behaviour
User behaviour data is a gold mine to gain insights about users and help us improve user experience. In this blog, we explore a query expansion framework based on user rewrite behaviour and how it improves user search experience and conversion. -
Engineering · Data Science · Security
Using mobile sensor data to encourage safer driving
Telematics is most commonly used to monitor vehicle movements and track driving safety, profiling, fleet optimisation and possible productivity improvements. Read this to find out more about how Grab uses telematics to encourage safer driving across our driver and delivery partner fleet. -
Engineering · Data Science
Automatic rule backtesting with large quantities of data
At Grab, real-time fraud detection is built on a rule engine. As data scientists and analysts, we need to analyse and simulate a rule on historical data to check the performance and accuracy of the rule. Backtesting, also known as Replay, enables analysts to simulate newly invented rules or evaluate the performance of existing rules using past events ranging from days to months, significantly improving rule creation efficiency. -
Engineering · Data Science
How we store and process millions of orders daily
The Grab Order Platform is a distributed system that processes millions of GrabFood or GrabMart orders every day. Learn about how the Grab order platform stores food order data to serve transactional (OLTP) and analytical (OLAP) queries. -
Engineering · Security · Data Science
Graph Networks - 10X investigation with Graph Visualisations
As fraud schemes get more complex, we need to stay one step ahead by improving fraud investigation methods. Read to find out more about graph visualisation, why we need it and how it helps with uncovering patterns and relationships. -
Engineering · Security · Data Science
How facial recognition technology keeps you safe
Facial recognition technology has grown tremendously in recent years due to the rise of deep learning techniques and accelerated digital transformation. Read to find out more about facial recognition technology in Grab and the components that help keep you safe. -
Engineering · Data Science
Automated Experiment Analysis - Making experimental analysis scalable
Analysts and data scientists invest lots of time into creating trustworthy experiments, which are key to making sound decisions. Read to find out how Automated Experiment Analysis helps make experimental analysis more scalable. -
Engineering · Data Science
How telematics helps Grab to improve safety
Coupled with data science, telematics can help to detect traffic events such as harsh braking and unsafe lane changes so we can provide a safer experience for our users. Read on to find out more about the challenges faced and how we addressed them with telematics. -
Engineering · Data Science
Real-time data ingestion in Grab
When it comes to data ingestion, there are several prevailing issues that come to mind: data inconsistency, integrity and maintenance. Find out how the Caspian team leveraged real-time data ingestion to help address these pain points. -
Data Science
Using real-world patterns to improve matching in theory and practice
Find out how real-world patterns can be used to improve algorithm performance when performing bipartite matching for passengers and driver-partners. -
Engineering · Data Science
Securing and Managing Multi-cloud Presto Clusters with Grab’s DataGateway
This blog post discusses how Grab's DataGateway plays a key role in supporting hundreds of users across our entire Presto ecosystem, from managing user access to cluster selection, workload distribution, and more. -
Engineering · Data Science
The Journey of Deploying Apache Airflow at Grab
This blog post shares how we designed and implemented an Apache Airflow-based scheduling and orchestration platform for teams across Grab. -
Data Science
Does Southeast Asia Run on Coffee?
This blog post shares insights on GrabFood data around how much our fellow Southeast Asians love coffee. -
Data Science
GrabChat Much? Talk Data to Me!
This blog post uncovers some interesting insights from our GrabChat data in Singapore, Malaysia, and Indonesia. -
Data Science
7 Fun Facts about Grab’s Driver-Partners in Singapore
This blog post shares the most interesting data points from 2019 about our Singapore driver-partners. -
Data Science · Engineering
Data First, SLA Always
Introducing Trailblazer, the Data Engineering team’s solution for implementing change data capture of all upstream databases. In this article, we explain why we needed to move away from periodic batch ingestion towards a real-time solution and show how we achieved this through an end-to-end streaming pipeline. -
Data Science · Product
Making Grab’s Everyday App Super
To excel in a heavily diversified market like Southeast Asia, we leverage the depth of our data to understand what sorts of information users want to see on our Feed and when they should see it. In this article, we discuss Grab Feed’s recommendation logic and strategies, as well as its future roadmap. -
Engineering · Data Science
Catwalk: Serving Machine Learning Models at Scale
This blog post explains why and how we came up with a machine learning model serving platform to accelerate the use of machine learning in Grab. -
Data Science
Tourists on GrabChat!
Just over two years ago we introduced GrabChat, Southeast Asia’s first-of-its-kind in-app messaging platform. Since then we’ve added all sorts of useful features to it, such as auto-translated messages, the ability to send photos, and even voice messages! It’s been a great tool to facilitate smoother communication between our driver-partners and our passengers, and one group in particular has found it incredibly useful: tourists! -
Data Science
Bubble Tea Craze on GrabFood!
Bubble Tea’s popularity on GrabFood has captured our attention and we want to celebrate its fascinating growth with you! We have deep-dived into Grab’s Big Data to find the most interesting bubbles of treasures that can excite your palate. Hopefully these insights can help you understand what’s behind the Bubble Tea craze on GrabFood in Southeast Asia! -
Data Science
How We Harnessed the Wisdom of Crowds to Improve Restaurant Location Accuracy
We questioned some of the estimates that our algorithm for calculating restaurant wait times was making, and found that the "errors" were actually useful to discover restaurants whose locations had been incorrectly registered in our system. By combining such error signals across multiple orders, we were able to identify correct restaurant locations and amend them to improve the experience for our consumers. -
Data Science · Engineering · Product · Design
Recipe for Building a Widget: How We Helped to “Peak-Shift” Demand by Helping Passengers Understand Travel Trends
We help to “peak-shift” demand by helping passengers understand travel trends with Grab’s data. Curious to know how we empower our passengers to make better travel decisions? Read on! -
Data Science
Understanding Supply & Demand in Ride-hailing Through the Lens of Data
Grab aims to ensure that our passengers can get a ride conveniently while providing our drivers with a better livelihood. To achieve this, balancing demand and supply is crucial. This article gives you a glimpse of one of our analytics initiatives: how to measure the supply and demand ratio in any given area and time. -
Data Science
Journey of a Tourist via Grab
Grab's services to tourists are an integral part of connecting tourists to various destinations and attractions. Do tourists travel on Grab to outlandishly fancy places like those you see in the movie "Crazy Rich Asians"? What are their favourite local places? Did you know that Grab's data reveals that medical tourism is growing in Singapore? Here are some exciting travel patterns that we found about our tourists' Grab rides in Singapore! -
Data Science
Grab Senior Data Scientist Liuqin Yang Wins Beale-Orchard-Hays Prize
Grab Senior Data Scientist Dr. Liuqin Yang wins the 2018 Beale-Orchard-Hays Prize, the highest honor in Computational Mathematical Optimization. He has been recognised for his paper and the corresponding software SDPNAL+. -
Data Science
GrabShare at the Intelligent Transportation Engineering Conference
We're excited to share the publication of our paper GrabShare: The Construction of a Realtime Ridesharing Service, which was Grab's contribution to the Intelligent Transportation Engineering Conference in Singapore last month. -
Data Science
The Data and Science Behind GrabShare Part I: Verifying Potential and Developing the Algorithm
Launching GrabShare was no easy feat. After reviewing the academic literature, we decided to take a different approach and build a new matching algorithm from the ground up. -
Data Science · Product
How to Go from a Quick Idea to an Essential Feature in Four Steps
How do you work within a startup team and build a quick idea into a key feature for an app that impacts millions of people? It's one of those things that is hard to understand when you just graduate as an engineer.