Automating Duolingo by reverse engineering the Duolingo Android app: Introducing Duobot

Oct 14, 2024 · mc51

I automated Duolingo by reverse engineering the Duolingo Android app. The result is the open-source Duobot. It is a complete and working command line automation for Duolingo written in Python.

Context

Duolingo is really good at keeping its users engaged through elaborate gamification. And I am one of those users. My Spanish hasn’t improved much, but who cares? Check out my impressive streak instead:

Obviously, keeping that streak alive is now one of my main life goals. But there have been too many instances of me doing exactly this:

Figure 2. It's 23:55 and I forgot to do my Duolingo lessons. Again

So, this is not really about learning a language. It’s more about not getting murdered by Duo for skipping my lessons. Hence, the question becomes: Can I automate the Duolingo app to beat the system?
Answer: Yes.
And I’m nice enough to share how it’s done with you. Therefore, let’s dive into how to automate the Duolingo Android app by reverse engineering it!

Solution and Demo

Let’s put the cart before the horse and start with the solution. The result of my efforts to automate Duolingo is Duobot:

Figure 3. Duobot is not AI powered

Duobot is a complete and working command line automation for the Duolingo app written in Python. It is open-source, so go ahead and check out the repo (and show some love by starring it). Here’s a quick demo of its capabilities:

Figure 4. Duobot at work

Duobot travels the learning path of your currently active language course for you. Field by field, it solves lessons and stories and opens chests. Along the way, it collects XP and gems, improves your league position, contributes to daily quests and keeps your streak alive.

Approach

For those of you still here and interested in how I built Duobot, here’s an overview of the approach.

In order to automate the Duolingo app, we need to understand how it works under the hood. Like most modern apps, it is merely a client. It communicates with a server, which does most of the computation and holds the main program logic. If we can understand the underlying API of the server, we might be able to replicate and automate what the client (i.e. the app) is doing. Consequently, the first step is to reverse engineer the API. To do so, we first need to be able to read the traffic between client and server. Luckily, there are methods for achieving that even for secured Android apps, and I’ve written about that before. Here’s some of the traffic of the Duolingo app captured by HTTPToolkit:

Figure 5. Captured traffic from Duolingo Android app

There’s a lot going on here. Data is being shared with numerous external servers. To keep it somewhat manageable, we need to filter for hosts under the duolingo.com domain. But even then, there’s a plethora of requests. Therefore, we need to further differentiate between relevant and distracting requests. Fortunately, the endpoint names are quite helpful in doing that. In addition, the JSON responses use a well-structured schema and mostly self-explanatory naming.
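
If your capture tool can export the session as a HAR file (an assumption about your setup; the snippet only relies on the standard HAR layout of `log.entries[].request`), that first filtering pass could look roughly like this. The URLs in the sample are made up and the helper names are mine, not Duobot’s:

```python
from collections import Counter
from urllib.parse import urlsplit


def is_duolingo_host(url: str) -> bool:
    """Return True if the URL's host is duolingo.com or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    # Compare on a label boundary so "notduolingo.com" does not slip through.
    return host == "duolingo.com" or host.endswith(".duolingo.com")


def duolingo_entries(har: dict) -> list:
    """Keep only the HAR entries whose request went to a duolingo.com host."""
    entries = har.get("log", {}).get("entries", [])
    return [e for e in entries if is_duolingo_host(e["request"]["url"])]


def count_endpoints(entries: list) -> Counter:
    """Count requests per (method, path) to see which endpoints dominate."""
    return Counter(
        (e["request"]["method"], urlsplit(e["request"]["url"]).path)
        for e in entries
    )


# Made-up sample in HAR layout; a real capture would be loaded with json.load().
sample_har = {
    "log": {
        "entries": [
            {"request": {"method": "GET", "url": "https://sub.duolingo.com/hypothetical/status"}},
            {"request": {"method": "GET", "url": "https://sub.duolingo.com/hypothetical/status"}},
            {"request": {"method": "POST", "url": "https://tracker.example.com/collect"}},
            {"request": {"method": "GET", "url": "https://notduolingo.com/lookalike"}},
        ]
    }
}

for (method, path), hits in count_endpoints(duolingo_entries(sample_har)).most_common():
    print(method, path, hits)
```

Sorting by frequency is just one way to get a first impression. Rare endpoints can be the interesting ones, so it’s worth reading the whole list at least once.
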
To build an automation for the app, we need to understand and then replicate at least the following:

Login and Authorization of requests
Getting the current status of:
- The user (XP, gems, league, etc.)
- The current language course (languages, position, etc.)
- The challenges in the next lesson
Understanding how:
- Lessons and stories are started and finished
- Chests are opened
- XP, gems and progress are awarded
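
To make that list a bit more concrete, here’s a rough sketch of how those capabilities could fit together in a loop that walks the learning path. This is not Duobot’s actual structure: the `Client` interface, its method names and the `PathField` shape are all hypothetical, and `FakeClient` is an in-memory stand-in, so nothing here talks to a real server.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class PathField:
    """One field on the learning path (hypothetical shape)."""
    kind: str      # "lesson", "story" or "chest"
    field_id: str


class Client(Protocol):
    """The capabilities from the list above; all method names are made up."""
    def login(self) -> None: ...
    def next_field(self) -> Optional[PathField]: ...
    def complete_session(self, field: PathField) -> int: ...  # returns XP awarded
    def open_chest(self, field: PathField) -> None: ...


def walk_path(client: Client, max_fields: int) -> int:
    """Work through up to max_fields fields and return the XP collected."""
    client.login()
    total_xp = 0
    for _ in range(max_fields):
        field = client.next_field()
        if field is None:
            break  # nothing left on the path
        if field.kind in ("lesson", "story"):
            total_xp += client.complete_session(field)
        elif field.kind == "chest":
            client.open_chest(field)
        else:
            raise ValueError(f"unknown field kind: {field.kind}")
    return total_xp


class FakeClient:
    """In-memory stand-in so the loop can run without any network access."""
    def __init__(self, fields):
        self.fields = list(fields)
        self.logged_in = False
        self.opened = []

    def login(self):
        self.logged_in = True

    def next_field(self):
        return self.fields.pop(0) if self.fields else None

    def complete_session(self, field):
        return 10  # arbitrary XP value for the demo

    def open_chest(self, field):
        self.opened.append(field.field_id)


demo = FakeClient([PathField("lesson", "a"), PathField("chest", "b"), PathField("story", "c")])
print(walk_path(demo, max_fields=10), demo.opened)
```

Keeping the loop separate from the client means the hard part (building requests the server accepts) lives in one place, and the control flow can be tested without touching a real account.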

Now, we need to experiment quite a bit to understand how these work. A good strategy is to break actions down as much as possible and then observe how they map to requests and responses of the API. It turns out that while quite a few actions are trivial (e.g. getting a user status), others are quite complex (e.g. solving the numerous challenge types in a lesson). In the end, it’s a lot of trial and error. Fortunately, the Duolingo API is quite logical, so many aspects can be deduced. Still, there will always be uncertainty around details. For some requests, e.g. posting the answers to the lesson challenges, there’s abundant data to be computed and returned. Does the endpoint expect all of it? Can we get every single value right? What about the order and timing of requests? We can’t really know. But if we don’t properly replicate the original requests, it simply won’t work. Or even worse, our traffic might be flagged as suspicious and our account banned.
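
One way to support that kind of experimentation is to capture the same status response before and after a single action and diff the two. Here’s a minimal sketch of the idea. It’s not taken from Duobot, and the field names in the sample data (`xp`, `gems`, `streak.length`) are invented for illustration:

```python
MISSING = "<missing>"  # marker for a path that exists on only one side


def flatten(value, prefix=""):
    """Flatten nested dicts/lists into a {dotted.path: leaf_value} mapping."""
    if isinstance(value, dict) and value:
        items = value.items()
    elif isinstance(value, list) and value:
        items = enumerate(value)
    else:
        # Scalars and empty containers are treated as leaves.
        return {prefix: value}
    flat = {}
    for key, child in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, path))
    return flat


def diff_states(before, after):
    """Return {path: (old, new)} for every leaf that differs between the two."""
    old, new = flatten(before), flatten(after)
    changes = {}
    for path in sorted(old.keys() | new.keys()):
        a, b = old.get(path, MISSING), new.get(path, MISSING)
        if a != b:
            changes[path] = (a, b)
    return changes


# Invented snapshots of a status response before and after one lesson.
before = {"xp": 100, "gems": 50, "streak": {"length": 10}}
after = {"xp": 115, "gems": 50, "streak": {"length": 11}, "chests": [{"opened": False}]}

for path, (old_value, new_value) in diff_states(before, after).items():
    print(f"{path}: {old_value} -> {new_value}")
```

Doing one action per capture keeps the diff small enough to read. It won’t tell you which request caused a change, only that something did, so you still need the traffic log next to it.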

If you are interested in the details of the approach check out the Python code itself.