Some packet types will be sent thousands of times, so it is quite likely that lz4 (or zstandard) would perform a lot better if we trained it with a dictionary first. We could bencode / rencode a bunch of common strings and use that as the training data. We could then send the dictionary to the other end as part of the handshake.
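To get a rough feel for what a shared dictionary buys on small packets, here is a minimal sketch. It uses the preset-dictionary support (`zdict`) in Python's stdlib `zlib` as a stand-in for lz4 / zstandard dictionaries, so the numbers will differ from what those compressors achieve. The strings in `COMMON_STRINGS` and the sample packet are made up for illustration and are not taken from the real packet encoder.

```python
import zlib

# Hypothetical strings that recur in packet metadata (assumed, for illustration only).
COMMON_STRINGS = [
    b"window-metadata", b"damage-sequence", b"rowstride",
    b"encoding", b"rgb32", b"draw",
]
# Both ends must hold the exact same dictionary bytes,
# e.g. after exchanging them during the handshake.
DICTIONARY = b"".join(COMMON_STRINGS)


def compress_with_dict(data: bytes, zdict: bytes, level: int = 6) -> bytes:
    """Compress one packet using a preset dictionary."""
    compressor = zlib.compressobj(level=level, zdict=zdict)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    """Decompress one packet; zdict must match the one used to compress."""
    decompressor = zlib.decompressobj(zdict=zdict)
    return decompressor.decompress(blob) + decompressor.flush()


# A small bencode-like packet built mostly from the common strings.
packet = b"4:draw15:window-metadata8:encoding5:rgb329:rowstride15:damage-sequence"

plain = zlib.compress(packet, 6)
primed = compress_with_dict(packet, DICTIONARY)
restored = decompress_with_dict(primed, DICTIONARY)

print(len(packet), len(plain), len(primed))
```

On a packet this small, the plain output has almost nothing to back-reference, whereas the primed one can point into the dictionary from the first byte. Whether the same holds for lz4 or zstandard on real traffic would need measuring.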
Another good training data set is pixel data: we use lz4 for very small areas (e.g. cursors in terminals) and these don't compress well at the moment. Training might help here (e.g. with repeated 0xffffff00 for white pixels).
There are Python bindings for zstandard, including dictionary access. See zstandard on PyPI, which notes: "When using dictionary data and compress() is called multiple times, the ZstdCompressionParameters derived from an integer compression level and the first compressed data’s size will be reused for all subsequent operations. This may not be desirable if source data size varies significantly."
So maybe use two different contexts? One for packet metadata and one for pixel data?
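A minimal sketch of what two separate contexts could look like, again using stdlib `zlib` with a preset dictionary as a stand-in for zstandard (with the real bindings, each context would presumably get its own compressor and dictionary object instead). The `DictCodec` class, the `CODECS` table, the dictionary contents and the pixel byte order are all assumptions made for illustration.

```python
import zlib


class DictCodec:
    """Compressor / decompressor pair bound to one preset dictionary."""

    def __init__(self, zdict: bytes, level: int = 1):
        self.zdict = zdict
        self.level = level

    def compress(self, data: bytes) -> bytes:
        compressor = zlib.compressobj(level=self.level, zdict=self.zdict)
        return compressor.compress(data) + compressor.flush()

    def decompress(self, blob: bytes) -> bytes:
        decompressor = zlib.decompressobj(zdict=self.zdict)
        return decompressor.decompress(blob) + decompressor.flush()


# Assumed byte layout for a white pixel (0xffffff00) and a fully zeroed pixel.
WHITE = b"\xff\xff\xff\x00"
BLANK = b"\x00\x00\x00\x00"

# One context per kind of payload, so the very different sizes and contents
# of metadata and pixel data never share dictionaries or parameters.
CODECS = {
    "metadata": DictCodec(b"window-metadataencodingrowstridergb32draw"),
    "pixels": DictCodec(BLANK * 64 + WHITE * 64),
}


def encode(kind: str, payload: bytes) -> bytes:
    """Compress payload with the context registered for this kind."""
    return CODECS[kind].compress(payload)


def decode(kind: str, blob: bytes) -> bytes:
    """Decompress blob with the context registered for this kind."""
    return CODECS[kind].decompress(blob)


metadata = b"4:draw8:encoding5:rgb32"
cursor = WHITE * (8 * 16)  # e.g. a small white 8x16 terminal cursor

metadata_blob = encode("metadata", metadata)
cursor_blob = encode("pixels", cursor)
```

The receiver has to know which context a packet used, so the kind would need to travel with the packet (or be implied by the packet type).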
This ticket has been moved to: https://github.com/Xpra-org/xpra/issues/2055