QANet Implementation

2018

An open-source implementation of QANet, the neural network that was then the top-ranked model for question answering.
[Images: cover image, demo UI GIF, input screen, prediction screen]

QANet was a top-performing question answering network on the SQuAD dataset until it was surpassed by BERT. I built my own version from the original paper, as no open-source implementation was available. My implementation matched the paper's results and even exceeded them with the addition of contextual embeddings and part-of-speech tags. Features include pre-trained checkpoints, options for resource-efficient training, and a live demo.

What I Did

- Built a full replication of the original QANet paper which, when trained, matches the paper's performance in some respects and exceeds it in others.

- Designed and built a fully functioning demo that visualises the network's decision process by highlighting the regions of the source text it focuses on while working out the answer to a question.

- Extended my replication of the paper to include contextual embeddings from models like ELMo, further increasing performance on the SQuAD dataset to the point that a single-head attention model surpasses the results of an 8-head model.
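
As a rough illustration of the highlighting idea behind the demo, the sketch below marks the tokens of a passage whose weight is high relative to the strongest one. It is not the project's actual demo code: the function names, the bracket markers, the threshold, and the example weights are all assumptions made for illustration, and a real demo would take its weights from the trained network.

```python
from typing import List, Sequence


def normalise(weights: Sequence[float]) -> List[float]:
    """Scale non-negative weights so that the largest becomes 1.0."""
    peak = max(weights, default=0.0)
    if peak <= 0.0:
        return [0.0 for _ in weights]
    return [w / peak for w in weights]


def highlight(tokens: Sequence[str], weights: Sequence[float],
              threshold: float = 0.5) -> str:
    """Wrap tokens whose scaled weight reaches the threshold in brackets.

    The brackets stand in for whatever styling a UI would apply
    (hypothetical choice; a web demo might use a background colour).
    """
    if len(tokens) != len(weights):
        raise ValueError("tokens and weights must have the same length")
    scaled = normalise(weights)
    marked = [f"[{tok}]" if s >= threshold else tok
              for tok, s in zip(tokens, scaled)]
    return " ".join(marked)


# Made-up weights for a five-token passage.
passage = ["QANet", "was", "surpassed", "by", "BERT"]
attention = [0.1, 0.05, 0.2, 0.1, 0.8]
print(highlight(passage, attention))  # QANet was surpassed by [BERT]
```

Scaling by the largest weight rather than using raw values keeps the threshold meaningful for passages of different lengths, since raw attention weights shrink as the passage grows.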

Results

While building this project, there were points where I thought "I've bitten off more than I can chew", but after revisiting the source code and fixing bugs in the demo, I'm proud of what I built. My focus was on making the project user-friendly, since many research models are hard to use because their core functions exist only in Jupyter notebooks or are scattered across undocumented parts of the code.

To address this, I added a comprehensive set of command-line options so that all parameters can be modified from the terminal. I also expanded the original implementation by including my own experiments, such as contextual embeddings, and a full-fledged demo UI, which presented additional challenges but ultimately improved the overall project.
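
A minimal sketch of what exposing parameters on the command line can look like, using Python's standard `argparse` module. The flag names and defaults here are hypothetical and chosen only to mirror things mentioned above (number of attention heads, contextual embeddings, a demo mode); they are not the project's actual options.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a few illustrative (hypothetical) options."""
    parser = argparse.ArgumentParser(
        description="Illustrative options for a QANet-style model.")
    # Hypothetical flag: which entry point to run.
    parser.add_argument("--mode", choices=["train", "test", "demo"],
                        default="train")
    # Hypothetical flag: number of attention heads (e.g. 1 or 8).
    parser.add_argument("--num-heads", type=int, default=8)
    # Hypothetical flag: a smaller batch size is one way to train
    # with fewer resources.
    parser.add_argument("--batch-size", type=int, default=32)
    # Hypothetical flag: switch contextual embeddings on.
    parser.add_argument("--use-contextual-embeddings", action="store_true")
    return parser


# Parse an explicit argument list instead of sys.argv so the example
# behaves the same wherever it is run.
args = build_parser().parse_args(["--num-heads", "1",
                                  "--use-contextual-embeddings"])
print(args.mode, args.num_heads, args.use_contextual_embeddings)
```

Keeping every tunable value behind a flag with a default means an experiment can be reproduced from the command that launched it, without editing source files.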