
I'm honored that Q1K3, among many other entries to js13kGames 2021, is compressed with Roadroller [1], which is why its (compressed) source code is basically a random-looking string plus a small bit of JS code.

[1] https://github.com/lifthrasiir/roadroller/



Roadroller was the reason I was able to fit the second level in. Thank you so much! :)


Do you mind commenting on what makes a “heavyweight javascript demo packer” different from any JS minifier?

Are there clever demo specific tricks it uses?


As others have commented, it is actually a compression algorithm in disguise, but it does have some tricks of its own.

- js13kGames measures the file size by its ZIP archive size, where the ZIP file itself is provided by the participant. In the past, people used ZIP recompressors like advzip or ECT to minimize the ZIP file. Roadroller needs to store a large amount of binary data, but JS string literals are inefficient in terms of information entropy (~6.96 bits per byte [1]). Therefore Roadroller actually stores a 6-bit code, which the recompressor can encode with a very efficient Huffman tree. The 6-bit code is ASCII-only, so it is also much easier to handle.

- Roadroller tunes a lot of compression parameters during its optimization phase. It is normally a curse that you have to bring your own decoder, but Roadroller generates a decoder optimized specifically for given compressed data.

- Roadroller is allowed to transform the JS code input, so it strategically inserts whitespace around tokens for better compression (see the comments in Packer.prepareJs for more information). It doesn't necessarily produce the most minified code (Roadroller currently doesn't care about JS syntax and only does tokenization), but it will produce the most compressible code.

[1] Assuming UTF-8. If the file is encoded in ISO-8859-1 or similar this can be greatly improved: there are 4 or 5 forbidden bytes [2] out of 256 possible octets so ~7.977 bits per byte is possible.

[2] $, `, \, and \r. When the code is embedded in a <script> element, \0 is also forbidden.
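The entropy figures in those footnotes can be checked with a quick back-of-the-envelope calculation. The alphabet sizes below are assumptions derived from the forbidden bytes listed in [2]; the exact ASCII figure depends on which characters you count, which is why it lands slightly under the ~6.96 quoted above.

```javascript
// Rough entropy check: log2 of the usable alphabet size per stored byte.
const asciiAlphabet = 128 - 4;   // ASCII minus $, `, \, \r (assumption)
const latin1Alphabet = 256 - 4;  // all octets minus the same four bytes

console.log(Math.log2(asciiAlphabet).toFixed(3));   // "6.954" bits per byte
console.log(Math.log2(latin1Alphabet).toFixed(3));  // "7.977" bits per byte
```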


That's wild. Would there ever be a use case for this in a normal web app? I.e., could decoding take less time than downloading the extra data?


Absolutely not. Roadroller currently targets a maximum latency of 1 second for "typical" inputs (~50 KB), so that translates to a decompression rate of ~50 KB/s. For decoding to be a net win, the communication channel would have to be much slower than that (~5 KB/s), and at that point you would just avoid heavy JS anyway.

It is possible to make it an order of magnitude faster (the decoder is size-optimized, not speed-optimized), but its usefulness would still be limited. The algorithm (context mixing) runs a large number of context models for both compression and decompression, so its run time is proportional to the number of bytes being compressed. In other words, if the input size is 1 MB, the run time is proportional to that 1 MB, no matter whether it compresses to 200 KB or 10 KB. The only way to make decompression faster is to reduce the input size via preprocessing; to be effective, however, that preprocessor would itself have to run the context models.
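To make the cost model concrete, here is a toy sketch of the context-mixing idea (this is NOT Roadroller's actual model set, just the general shape of the technique): several bit-level context models each predict the next bit, the predictions are mixed in the logit domain, and every model is updated on every input bit, which is why run time scales with the input size rather than the output size.

```javascript
// Toy context mixing: two context models mixed by online logistic regression.
const stretch = p => Math.log(p / (1 - p));   // probability -> logit
const squash = x => 1 / (1 + Math.exp(-x));   // logit -> probability

class BitModel {
  constructor() { this.counts = new Map(); }  // context -> [zeros, ones]
  predict(ctx) {
    const [c0, c1] = this.counts.get(ctx) || [1, 1];  // Laplace smoothing
    return c1 / (c0 + c1);
  }
  update(ctx, bit) {
    const c = this.counts.get(ctx) || [1, 1];
    c[bit] += 1;
    this.counts.set(ctx, c);
  }
}

// Ideal coded size in bits (cross-entropy) of the mixed prediction.
function estimateBits(data) {
  const models = [new BitModel(), new BitModel()];
  let weights = [0, 0], history = 0, total = 0;
  for (const byte of data) {
    for (let i = 7; i >= 0; i--) {
      const bit = (byte >> i) & 1;
      const ctxs = [history & 0xf, history & 0xff];   // two context orders
      const st = models.map((m, j) => stretch(m.predict(ctxs[j])));
      let p = squash(st.reduce((sum, v, j) => sum + weights[j] * v, 0));
      p = Math.min(Math.max(p, 1e-6), 1 - 1e-6);
      total -= Math.log2(bit ? p : 1 - p);
      const err = bit - p;                            // train the mixer
      weights = weights.map((w, j) => w + 0.02 * err * st[j]);
      models.forEach((m, j) => m.update(ctxs[j], bit));
      history = ((history << 1) | bit) & 0xff;        // rolling bit context
    }
  }
  return total;
}

const data = Buffer.from('abcabcabc'.repeat(50));
// Repetitive input codes in far fewer bits than its raw 8 * length:
console.log(estimateBits(data) < 8 * data.length);    // true
```

Note that `estimateBits` does work for every input bit regardless of how well it compresses, mirroring the "run time proportional to input size" point above.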

I did hear of a semi-serious use case where an embedded device has to ship a JS front end to its API and its ROM is limited and can't be expanded. In that case it would make some sense to use a custom compression algorithm. However, a more reasonable answer would be to force `Content-Encoding: br` on the client and store a Brotli stream (which would be at most ~10% larger than Roadroller's output anyway).


Really appreciate these details. Thank you.


It appears to actually compress your JS, rather than make it "shorter but still valid".

The code ends up being something like:

    eval(decompress("hfkfkvj..."));
    function decompress(data) {...}


Interesting. So then it’s just better lossless compression than gzip?


Not necessarily better than gzip, as you have to add the size of the unpacker to the payload.


Note that while you are correct in principle (for example, it will perform poorly on inputs that are LZ77 best cases), Roadroller reaches a very high compression ratio on typical minified JS input, even compared with DEFLATE recompressors.

I have kept an internal benchmark using some of the js13kGames 2020 entries [1], and for those samples Roadroller universally beat gzip, and in most cases even Brotli, by a large margin (~8%) even with the embedded decoder included. Of course, Roadroller is much more computationally expensive.

[1] https://docs.google.com/spreadsheets/d/1dKJWpLPY9fQ2J_COffOJ...


Thanks for the detailed reply.

Your name looks familiar. I think I've seen you around some code golf hangouts somewhere on the internet.


Was that Anarchy Golf? :-) Otherwise, I've been here and there, so I might not remember all the places I've ever been.


Yes I think so. I used to be moderately active there a few years ago, although not a "regular".


I also used Roadroller for my js13k game this year! It's a massive help to the community.


Looks great. Could you highlight why it’s much faster in Node 16 as opposed to 14?


No idea. Low-level JS performance is very sensitive to small changes, and I haven't tried to find out why. For example, I've seen a situation where a simple, well-predicted conditional deoptimized the entire method, but I couldn't see exactly why (I've since rewritten that bit of code anyway). It was one of the reasons that Roadroller now uses a WebAssembly implementation by default.



