11 Dec 2021

Creating a Routing Algorithm for Singapore's Public Transport

Oh dear, how ambitious was I?

Overview

So about 2 years ago, while thinking of a topic for my IB Computer Science project, I came up with a crazy idea to write a simplified Google Maps specifically for Singapore’s public transport system. This was shortly after I competed in the 2019 National Olympiad in Informatics in which I attained a Bronze award. I must’ve been high or something as I decided to try this insane seemingly impossible project as my final graded project. I coded a terribly unoptimized (but still relatively decent) version in Java, pulling off what I thought was entirely impossible.

After my exams, the inevitability of conscription to the Singapore army struck me and I decided that I wouldn’t let my brain rot in the military. Thus, my half drunk brain somehow found the motivation to redevelop the same project from scratch without the front end. I named the project SGRouter

Disclaimer: I assume you already have some knowledge on Dijkstra’s SSSP algorithm and a pretty good knowledge on coding to embark on reading this post which probably leads to insanity.

A lot of the same content and more info can be found in the GitHub Wiki. The Wiki also contains how I combined data to create the graph in the form of a SQLite Database.

Background

Initially, I converted my C++ Dijkstra template to Java. This template was adapted from a GeeksForGeeks post which implemented using STL PriorityQueue.

#define INF 1e9                  //Infinity
const int sz = 100001;           //Maximum possible number of vertices
vector<pair<int, int>> a[sz];    //Adjacency list
uint64_t dis[sz];                //Stores shortest distance
bool vis[sz] = {0};              //Determines whether the node has been visited or not
void Dijkstra(int source, int n) //Algorithm for SSSP (Source, Num vertices)
{
  for (int i = 0; i < n; i++)
  { //Set initial distances to Infinity
    dis[i] = INF;
    vis[i] = false;
  }
  //Comparator for priority queue
  class prioritize
  {
  public:
    bool operator()(pair<int, uint64_t> &p1, pair<int, uint64_t> &p2)
    {
      return p1.second > p2.second;
    }
  };
  //Priority queue to store vertex,weight pairs
  priority_queue<pair<int, uint64_t>, vector<pair<int, uint64_t>>, prioritize> pq;
  //Pushing the source with distance from itself as 0
  pq.push(make_pair(source, dis[source] = 0));
  while (!pq.empty())
  {
    pair<int, uint64_t> curr = pq.top(); //Current vertex
    pq.pop();
    int cv = curr.first;
    uint64_t cw = curr.second; //'cw' the final dist for vertex
    if (vis[cv])               //If vertex visited, continue
      continue;
    vis[cv] = true;
    //cout << "Adjacent List Size: " << a[cv].size() << endl;
    //Iterating through all adjacent vertices
    for (int i = 0; i < a[cv].size(); i++)
    {
      //If not visited & distance to node through parent < actl distance,
      //Update node
      if (!vis[a[cv][i].first] && a[cv][i].second + cw < dis[a[cv][i].first])
      {
        //Set the new distance and add to priority queue
        pq.push(make_pair(a[cv][i].first, (dis[a[cv][i].first] = a[cv][i].second + cw)));
        //cout << "Set " << a[cv][i].first << " as " << a[cv][i].second+cw << endl;
      }
    }
  }
}

Differences in Problems

The major difference is the fact that it is a K-Shortest path problem. However, there are other variables that need to be considered:

Each node can only be identified by an ID (5-digit numbers for bus stops and 4-digit alphanumeric codes for train stations)
Adjacency list is stored in SQLite database
Different services have to be accounted for

Changes to Dijkstra’s SSSP Algorithm

Due to the above restrictions, the following modifications were made

Adjacency List

Adjacency lists were retrieved using an SQL statement:

SELECT * FROM edge WHERE src=current_node AND service<>previous_service

Visited

The visited array was changed into a HashMap<String,VisitedState>

The VisitedState class stores: int walks, HashSet<String> services. walks counts the number of times “walking” is used to pass through that node, while services ensures each service only passes through the node once. Using a helper bool visited() method, a node is considered visited when:

The sum of distinct public transit services and number of times walked >= 3
The non-walking service passing has already visited the node and, thus, is already added to the services set

End Condition

Since the Dijkstra algorithm explores the shortest path, once the current node is the destination node, the loop may terminate, adding a new route. Once this occurs K=3 times (or if the priority queue is empty), the entire algorithm may return.

K-Shortest and States

As detailed in the K-Shortest pseudo-code on Wikipedia, the entire path of nodes should be stored in the state which is stored in the priority queue. Thus, the priority queue stores instances of the RouteState class which has the following parameters.

String src; //Used for cv (Current Vertex)
String prevService; //For adj list querying
double time; //Store total time for easy access
HashSet<String> traversedNodes; //Discussed in Optimizations
HashSet<String> walked; //Discussed in Optimizations
ArrayList<SubRoute> path; //Storage of path; Required for k-shortest

Optimizations

Walking

Walking is exempted from the previous service exclusion in visited. To prevent infinite loops, the program ensures only certain “types” of walking are taken in succession. The graph_builder_service stores walking vertices with the service Walk (<type>). Types include: Interchange, Bus-Train, Train-Bus, Train-Exit and Exit-Train. These services are stored in the walked HashSet in RouteState. When querying for the adjacency list, these types of walking services stored int he set are excluded.

Traversed Nodes

To further enhance the visited method, each RouteState stores a set of each node traversed, acting as the original visited array in Dijkstra’s algorithm. Though memory inefficient, it ensures infinite loops do not occur and reduces the runtime to search.

Exit Condition

An extra exit condition for a RouteState is done if the current time taken on a path is more than the 3rd fastest time among all threads. This is described in detail below.

Multithreading

To ensure the routing is fast, for each src and des pair, a thread is created. The total execution time is also limited using Java’s ExecutorService. When a path is found, it is added synchronously to a static array of Route. This allows for further optimization in the original Dijkstra code. A sub-route may now be cancelled if the current time is more than the 3rd fastest route.

Conclusion

So that was my explanation on my modified routing algorithm. Obviously there were many points which could’ve optimized it further. One of which is to use a spatial index to find the nearest nodes instead of a brute force search.

The GitHub repository contains the above code.