Creating a Routing Algorithm for Singapore's Public Transport
Oh dear, how ambitious was I?
Overview
So about 2 years ago, while thinking of a topic for my IB Computer Science project, I came up with a crazy idea to write a simplified Google Maps specifically for Singapore’s public transport system. This was shortly after I competed in the 2019 National Olympiad in Informatics in which I attained a Bronze award. I must’ve been high or something as I decided to try this insane seemingly impossible project as my final graded project. I coded a terribly unoptimized (but still relatively decent) version in Java, pulling off what I thought was entirely impossible.
After my exams, the inevitability of conscription to the Singapore army struck me and I decided that I wouldn’t let my brain rot in the military. Thus, my half drunk brain somehow found the motivation to redevelop the same project from scratch without the front end. I named the project SGRouter
Disclaimer: I assume you already have some knowledge on Dijkstra’s SSSP algorithm and a pretty good knowledge on coding to embark on reading this post which probably leads to insanity.
A lot of the same content and more info can be found in the GitHub Wiki. The Wiki also contains how I combined data to create the graph in the form of a SQLite Database.
Background
Initially, I converted my C++ Dijkstra template to Java. This template was adapted from a GeeksForGeeks post which implemented using STL PriorityQueue.
#define INF 1e9 //Infinity
const int sz = 100001; //Maximum possible number of vertices
vector<pair<int, int>> a[sz]; //Adjacency list
uint64_t dis[sz]; //Stores shortest distance
bool vis[sz] = {0}; //Determines whether the node has been visited or not
void Dijkstra(int source, int n) //Algorithm for SSSP (Source, Num vertices)
{
for (int i = 0; i < n; i++)
{ //Set initial distances to Infinity
dis[i] = INF;
vis[i] = false;
}
//Comparator for priority queue
class prioritize
{
public:
bool operator()(pair<int, uint64_t> &p1, pair<int, uint64_t> &p2)
{
return p1.second > p2.second;
}
};
//Priority queue to store vertex,weight pairs
priority_queue<pair<int, uint64_t>, vector<pair<int, uint64_t>>, prioritize> pq;
//Pushing the source with distance from itself as 0
pq.push(make_pair(source, dis[source] = 0));
while (!pq.empty())
{
pair<int, uint64_t> curr = pq.top(); //Current vertex
pq.pop();
int cv = curr.first;
uint64_t cw = curr.second; //'cw' the final dist for vertex
if (vis[cv]) //If vertex visited, continue
continue;
vis[cv] = true;
//cout << "Adjacent List Size: " << a[cv].size() << endl;
//Iterating through all adjacent vertices
for (int i = 0; i < a[cv].size(); i++)
{
//If not visited & distance to node through parent < actl distance,
//Update node
if (!vis[a[cv][i].first] && a[cv][i].second + cw < dis[a[cv][i].first])
{
//Set the new distance and add to priority queue
pq.push(make_pair(a[cv][i].first, (dis[a[cv][i].first] = a[cv][i].second + cw)));
//cout << "Set " << a[cv][i].first << " as " << a[cv][i].second+cw << endl;
}
}
}
}
Differences in Problems
The major difference is the fact that it is a K-Shortest path problem. However, there are other variables that need to be considered:
- Each node can only be identified by an ID (5-digit numbers for bus stops and 4-digit alphanumeric codes for train stations)
- Adjacency list is stored in SQLite database
- Different services have to be accounted for
Changes to Dijkstra’s SSSP Algorithm
Due to the above restrictions, the following modifications were made
Adjacency List
Adjacency lists were retrieved using an SQL statement:
SELECT * FROM edge WHERE src=current_node AND service<>previous_service
Visited
The visited array was changed into a HashMap<String,VisitedState>
The VisitedState
class stores: int walks
, HashSet<String> services
. walks
counts the number of times “walking” is used to pass through that node, while services
ensures each service only passes through the node once. Using a helper bool visited()
method, a node is considered visited when:
- The sum of distinct public transit services and number of times walked >= 3
- The non-walking service passing has already visited the node and, thus, is already added to the
services
set
End Condition
Since the Dijkstra algorithm explores the shortest path, once the current node is the destination node, the loop may terminate, adding a new route. Once this occurs K=3 times (or if the priority queue is empty), the entire algorithm may return.
K-Shortest and States
As detailed in the K-Shortest pseudo-code on Wikipedia, the entire path of nodes should be stored in the state which is stored in the priority queue. Thus, the priority queue stores instances of the RouteState
class which has the following parameters.
String src; //Used for cv (Current Vertex)
String prevService; //For adj list querying
double time; //Store total time for easy access
HashSet<String> traversedNodes; //Discussed in Optimizations
HashSet<String> walked; //Discussed in Optimizations
ArrayList<SubRoute> path; //Storage of path; Required for k-shortest
Optimizations
Walking
Walking is exempted from the previous service exclusion in visited. To prevent infinite loops, the program ensures only certain “types” of walking are taken in succession. The graph_builder_service stores walking vertices with the service Walk (<type>)
. Types include: Interchange
, Bus-Train
, Train-Bus
, Train-Exit
and Exit-Train
. These services are stored in the walked
HashSet in RouteState
. When querying for the adjacency list, these types of walking services stored int he set are excluded.
Traversed Nodes
To further enhance the visited method, each RouteState
stores a set of each node traversed, acting as the original visited array in Dijkstra’s algorithm. Though memory inefficient, it ensures infinite loops do not occur and reduces the runtime to search.
Exit Condition
An extra exit condition for a RouteState
is done if the current time taken on a path is more than the 3rd fastest time among all threads. This is described in detail below.
Multithreading
To ensure the routing is fast, for each src and des pair, a thread is created. The total execution time is also limited using Java’s ExecutorService
. When a path is found, it is added synchronously to a static array of Route
. This allows for further optimization in the original Dijkstra code. A sub-route may now be cancelled if the current time is more than the 3rd fastest route.
Conclusion
So that was my explanation on my modified routing algorithm. Obviously there were many points which could’ve optimized it further. One of which is to use a spatial index to find the nearest nodes instead of a brute force search.
The GitHub repository contains the above code.