Weekly updates of the project - Transformer and BERT in mlpack
Author: mrityunjay-tripathi
Mon, June 29
Hello, everyone! This week I worked mostly on PR #2404, trying out different things to fix memory errors. The same memory errors also exist upstream, and I learned that there is already an open issue about them.
This week I also implemented the BLEU score metric (PR #2477), which is used for evaluating the quality of generated text. This PR is ready from my side.
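For readers who have not met BLEU before, here is a rough standalone sketch of the idea: clipped n-gram precisions combined by a geometric mean and scaled by a brevity penalty. This is not the code in PR #2477 and does not use mlpack's API. The names `Tokens`, `CountNGrams` and `SentenceBleu` are made up for illustration, and the sketch assumes a single reference sentence, uniform weights and no smoothing.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical alias for this sketch: a tokenized sentence.
using Tokens = std::vector<std::string>;

// Count every n-gram of length n in a token sequence.
std::map<Tokens, std::size_t> CountNGrams(const Tokens& tokens, std::size_t n)
{
  std::map<Tokens, std::size_t> counts;
  for (std::size_t i = 0; i + n <= tokens.size(); ++i)
    ++counts[Tokens(tokens.begin() + i, tokens.begin() + i + n)];
  return counts;
}

// Sentence-level BLEU against one reference (assumption: uniform weights,
// no smoothing, so any order with zero matches gives a score of 0).
double SentenceBleu(const Tokens& reference,
                    const Tokens& candidate,
                    std::size_t maxOrder = 4)
{
  if (candidate.empty() || maxOrder == 0)
    return 0.0;

  double logPrecisionSum = 0.0;
  for (std::size_t n = 1; n <= maxOrder; ++n)
  {
    const auto refCounts = CountNGrams(reference, n);
    const auto candCounts = CountNGrams(candidate, n);

    std::size_t matched = 0, total = 0;
    for (const auto& entry : candCounts)
    {
      total += entry.second;
      const auto it = refCounts.find(entry.first);
      // Clip: an n-gram is credited at most as often as the reference has it.
      if (it != refCounts.end())
        matched += std::min(entry.second, it->second);
    }

    // Also covers the case where the candidate is shorter than n tokens.
    if (matched == 0)
      return 0.0;
    logPrecisionSum += std::log(static_cast<double>(matched) / total);
  }

  // Brevity penalty: penalize candidates shorter than the reference.
  const double c = static_cast<double>(candidate.size());
  const double r = static_cast<double>(reference.size());
  const double brevityPenalty = (c >= r) ? 1.0 : std::exp(1.0 - r / c);

  // Geometric mean of the n-gram precisions, scaled by the penalty.
  return brevityPenalty * std::exp(logPrecisionSum / maxOrder);
}
```

A candidate identical to the reference scores 1, a candidate that shares no words with it scores 0, and a candidate that is a correct but truncated prefix of the reference is pulled down only by the brevity penalty.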
The MultiheadAttention class (PR #2375) was expected to be complete by now, but some complexities involved in it are acting as a bottleneck. There are two things I can do: either implement it for batchSize = 1, as was done for the Lookup layer, or implement single-headed attention for now.
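To make the simpler fallback concrete, here is a minimal sketch of single-headed scaled dot-product attention for one sequence, i.e. softmax(Q K^T / sqrt(d)) V with batchSize = 1. It is only an illustration under my own assumptions and not the code in PR #2375: it uses plain `std::vector` instead of Armadillo, skips the learned projection weights and masking, and the names `Matrix` and `SingleHeadAttention` are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical row-major matrix type for this sketch (rows x cols).
using Matrix = std::vector<std::vector<double>>;

// Single-head attention for one sequence (batchSize = 1).
// query: tgtLen x d, key: srcLen x d, value: srcLen x dv.
// Returns tgtLen x dv, or an empty matrix if the shapes are unusable.
Matrix SingleHeadAttention(const Matrix& query,
                           const Matrix& key,
                           const Matrix& value)
{
  Matrix output;
  if (query.empty() || key.empty() || key.size() != value.size())
    return output;

  const std::size_t d = query[0].size();
  const std::size_t dv = value[0].size();
  if (d == 0)
    return output;
  const double scale = 1.0 / std::sqrt(static_cast<double>(d));

  for (const auto& q : query)
  {
    // Scaled dot product of this query with every key.
    std::vector<double> scores(key.size());
    for (std::size_t j = 0; j < key.size(); ++j)
    {
      double dot = 0.0;
      for (std::size_t k = 0; k < d; ++k)
        dot += q[k] * key[j][k];
      scores[j] = dot * scale;
    }

    // Numerically stable softmax over the source positions.
    const double maxScore = *std::max_element(scores.begin(), scores.end());
    double denom = 0.0;
    for (double& s : scores)
    {
      s = std::exp(s - maxScore);
      denom += s;
    }

    // Attention-weighted sum of the value rows.
    std::vector<double> row(dv, 0.0);
    for (std::size_t j = 0; j < key.size(); ++j)
    {
      const double weight = scores[j] / denom;
      for (std::size_t k = 0; k < dv; ++k)
        row[k] += weight * value[j][k];
    }
    output.push_back(row);
  }
  return output;
}
```

A multi-head version would run this several times on differently projected slices of the inputs and concatenate the results, and a batched version would repeat it per sequence, which is where the extra bookkeeping comes from.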
Next week I plan to devote my time to getting the MultiheadAttention class working, even if only in a simpler form. I have my online end-semester exams this week, but fortunately it's just a two-hour open-book test.
See you next time. Stay healthy! Stay safe!