Climbing the Summit

Stories and Insights about Building Grammarly

  • How do you know if your proofreading algorithm is doing a good job? So far, the NLP community has relied on the standard of “minimal edit corrections,” i.e., the smallest number of edits needed to make a sentence grammatically correct. The problem with this approach is that a grammatically correct sentence doesn’t always sound natural to a native speaker. For the past two years, we (Joel Tetreault, Courtney Napoles, and Keisuke Sakaguchi) have been tackling this problem. Joel is Grammarly’s Director of Research, and Courtney and Keisuke are both Ph.D. students at the Johns Hopkins Center for Language and Speech Processing.

    Joel Tetreault, Courtney Napoles, Keisuke Sakaguchi March 31, 2017 nlp, machine learning, corpus, proofreading
  • As discussed in the first part of this series, we were very excited when we figured out how to properly build Docker images, until we realized that we had no idea how to run them in production. You might have already guessed that we were pondering building our own tool.

    Yuriy Bogdanov September 9, 2015 infrastructure, platform, open source
  • Today, the industry is saturated with discussions about containers. Many companies are looking for ways they can benefit from running an immutable infrastructure or simply boost development performance by making repeatable builds between environments simpler. However, sometimes by simplifying the user experience we end up complicating the implementation. On our journey to a usable, containerized infrastructure, we faced a number of daunting challenges, the solutions to which are the subject of this post. Welcome to the bleeding edge!

    Yuriy Bogdanov September 7, 2015 infrastructure, platform, open source
  • At Grammarly, our core grammar engine, the foundation of our business, is written in Common Lisp. It currently processes more than a thousand sentences per second, is horizontally scalable, and has served reliably in production for almost three years.

    We noticed that there are very few, if any, accounts of how to deploy Lisp software to modern cloud infrastructure, so we thought it would be a good idea to share our experience. The Lisp runtime and programming environment provide several unique, albeit obscure, capabilities to support production systems (for the impatient, they are described in the final chapter).

    Vsevolod Dyomkin June 26, 2015 lisp, infrastructure, debugging
  • In this post, we discuss a common evolution of server-side architecture that many growing companies face: the now-legendary transition from a monolithic application to a microservices architecture. Although decoupling is a sound software development concept, there are a number of risks and pain points associated with it. This writeup covers some of the issues we faced while scaling Grammarly’s server backend, along with the solutions and insights we arrived at in the process.

    Stas Kravets April 24, 2015 introductory, architecture
  • Comparing constituency parsers is not a trivial task. Parsers vary in the types of mistakes they make, the types of texts they are good at parsing, their speed, and all kinds of interesting features and quirks within each implementation. We set out to understand what lies behind the vague F-measure numbers hovering around 90% and what kinds of issues to expect from different parsers, regardless of their overall quality.

    Mariana Romanyshyn, Vsevolod Dyomkin November 3, 2014 nlp, open source
  • At Grammarly, we use a lot of off-the-shelf core NLP technologies to help us make a little bit of sense of the mess that is natural language text (English in particular). The issue with all these technologies is that even small errors in their output are often multiplied by the downstream algorithms. So, when a sophisticated mistake-detection algorithm is supposed to work on individual sentences but instead receives a fragment of a sentence, or several sentences merged together, it may find all sorts of funny things inside.

    Oleksii Sliusarenko, Vsevolod Dyomkin April 22, 2014 nlp