An online code compiler emulates a terminal on the web, letting you type terminal commands and code and compile them there and then, as you’d normally do on your computer. Basically, there’s a terminal backend on a virtual machine or container that exposes the terminal on the web using one of several libraries. Under the hood, both sides communicate over WebSocket. The traditional request-response model of web communication does not work in this situation, since you need something that allows two-way interactive communication (mostly to handle standard input, ‘stdin’).
Have you wondered why most online compilers do not let you give input? I mean, there are online compilers that let you type your code and run it, but they don’t accept input after that. Suppose you wrote a C program that reads input from the user. You can compile and run it online, but you won’t be able to pass it input when it asks for some, because the online compiler doesn’t let you.
Why? Because they don’t use WebSocket. Without WebSocket, the only way to achieve something like this is to regularly poll the program to see if it’s waiting on standard input. If it is, you send a response to the web user asking for the input the program is waiting for. But that’s easier said than done: polling is not at all efficient, and it’s complex to write. The other reason online compilers do not allow input has to do with security.
WebSocket is meant for interactive, full-duplex communication, and there are already libraries that help you emulate a whole terminal on the web over WebSocket. That means you just show the whole terminal to the user, and the Python script, C code, or whatever the user wanted to compile is compiled and run in that same terminal. The user can interact with it and enter input as well. Problem solved!
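To make the two-way part concrete, here is a minimal sketch in Python (not code from the project) of what the backend has to do: start the program, wait for it to ask, and only then send the input. The child program and the `run_interactively` helper are made up for illustration. A real setup would carry the same back-and-forth over a WebSocket connection to the browser instead of doing it all in one process.

```python
import subprocess
import sys

# A stand-in for the user's program: it prompts, then waits on stdin.
CHILD = r'''
import sys
print("Name?", flush=True)
name = sys.stdin.readline().strip()
print("Hello, " + name, flush=True)
'''

def run_interactively(answer):
    """Start the child, read its prompt, then feed it input and read the reply."""
    with subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    ) as proc:
        prompt = proc.stdout.readline().strip()   # program -> user
        proc.stdin.write(answer + "\n")           # user -> program
        proc.stdin.flush()
        reply = proc.stdout.readline().strip()    # program -> user again
    return prompt, reply

if __name__ == "__main__":
    print(run_interactively("Ada"))
```

The point is that the input can only be sent after the program has started and asked for it, which is exactly what a one-shot "submit code, get output" request cannot do.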
What about security? Wouldn’t users have access to a terminal session that could possibly bring down the VPS/virtual machine and wreak havoc? (Remember, we’re just exposing the terminal on the web, but it IS actually RUNNING on the VPS/virtual machine, which means users have access to a terminal session on that machine.) Yes, that is what happens if you don’t get smart about it.
A simple way to avoid the problem is to use a sandbox. That is, you don’t expose the actual terminal; you expose a terminal that runs inside a sandbox. Whatever the user does in the exposed terminal should then stay inside the sandbox instead of affecting the system outside it, and better yet, sandbox programs already restrict the user from doing a lot of things in the terminal. For example, the sandbox program I used, Firejail, not only comes with a lot of restrictions in place but also lets you add your own.
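As a rough sketch of the idea (not the exact configuration from the project), building the command that starts a shell under Firejail could look like this in Python. The `build_sandbox_command` helper and the `/tmp/user42` directory are hypothetical. The flags are standard Firejail options, but which ones you want depends on your setup.

```python
def build_sandbox_command(home_dir, shell="/bin/bash"):
    """Return the argv for starting `shell` inside a Firejail sandbox.

    Illustrative only: the project's real options may differ.
    """
    return [
        "firejail",
        "--quiet",                 # suppress Firejail's own startup messages
        f"--private={home_dir}",   # use home_dir as the sandboxed home directory
        "--net=none",              # no network access inside the sandbox
        "--seccomp",               # enable the default seccomp syscall filter
        "--caps.drop=all",         # drop all Linux capabilities
        shell,                     # the program to run inside the sandbox
    ]

if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(" ".join(build_sandbox_command("/tmp/user42")))
```

The resulting list is what you would hand to whatever starts the terminal session, so the shell the user sees is already running inside the sandbox.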
So here’s how I’ve structured the project:
- Using the Ace editor, the user types their code.
- When they click RUN, the code is saved on the virtual machine. I’ve used a REST API for this, but using WebSocket would be better, since it is comparatively faster than REST in this case.
- Then, a terminal session is initiated inside a sandbox (firejail). It is exposed to the web via terminado.
- The terminal session is not plain bash but restricted bash, aka rbash. Rbash already has many built-in restrictions. For example, it does not let you run commands whose names contain ‘/’ or change directory (‘cd’). I did this just to make it extra secure: a restricted shell inside a sandbox that is already restricted.
- There’s a unique terminal for every user, and when a user quits, their sandbox gets shut down too.
- Since the connection goes through nginx, the user is automatically disconnected if there’s no activity for ~30 seconds, i.e. the terminal exits after 30 seconds of inactivity.
- With the help of a cron job, I’ve added two further restrictions: the terminal session is closed after 10 minutes, and any program running longer than 1 minute in the terminal is killed automatically (to handle programs taking too much memory because of infinite loops etc.).
- I’ve used monit, a system monitoring utility, to monitor the application’s resource/memory usage and restart the API, nginx, and the main application if usage exceeds a set threshold. This is needed in case resources get hogged when several users are compiling online at the same time.
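The save-then-run part of the list above, together with the one-minute limit, can be sketched in a few lines of Python. This is an illustration only: in the project the limits are enforced by cron and the code runs in the sandboxed terminal, whereas here `subprocess.run` with a `timeout` kills the program directly. The helper names (`save_code`, `run_with_time_limit`) and the directory prefix are made up.

```python
import pathlib
import subprocess
import sys
import tempfile

def save_code(source, filename="main.py"):
    """Write the submitted code into a fresh directory and return its path."""
    workdir = pathlib.Path(tempfile.mkdtemp(prefix="webcompiler-"))
    path = workdir / filename
    path.write_text(source)
    return path

def run_with_time_limit(cmd, limit_seconds=60):
    """Run cmd and return (exit_code, stdout).

    Returns (None, "") if the program ran past the limit and was killed.
    """
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=limit_seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return None, ""
    return done.returncode, done.stdout
```

For example, `run_with_time_limit([sys.executable, str(save_code("print('hi')"))])` runs the saved script normally, while a script containing an infinite loop comes back as `(None, "")` once the limit is hit.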
I am thinking of writing a more detailed step-by-step guide on how to make an online code compiler/code judge. I may eventually write an eBook on it, since there is a lot to cover for a blog post. I’ll of course share the news here if I ever get around to it. :)
For now, you can find the code for the program here: https://github.com/sknepal/webcompiler.
PS. I wrote this post in a single sitting, so please try to ignore any shortcomings you see. :P