Include brotli #34

Open
v-python opened this issue Jun 6, 2020 · 38 comments

@v-python

v-python commented Jun 6, 2020

It would be good to support all the compression, or at least decompression, schemes supported by servers and browsers. This looks like a great step toward not having to write compression and decompression in JavaScript.

My use case would be to download a bunch of brotli files in an archive (somewhat like tar), split them into individual files (unarchived, but not yet decompressed) in the service worker cache, and return them to the browser as requested.
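
A minimal sketch of that flow, assuming a future 'br' token in DecompressionStream (the cache name is a placeholder, and the archive is assumed to already be split into individually cached, still-compressed responses):

```js
// Hypothetical service worker: serve individually cached,
// still-brotli-compressed responses, decompressing natively on the
// way out. Assumes DecompressionStream accepts a 'br' token.
self.addEventListener('fetch', (event) => {
  event.respondWith((async () => {
    const cache = await caches.open('br-assets'); // placeholder cache name
    const cached = await cache.match(event.request);
    if (!cached) return fetch(event.request);
    const body = cached.body.pipeThrough(new DecompressionStream('br'));
    return new Response(body, cached); // reuse status/headers of the cached entry
  })());
});
```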

@eustas

eustas commented Jul 31, 2020

Mentioned in https://github.com/WICG/compression/blob/master/explainer.md#future-work

@prajaybasu

This needs to be higher up the priority list.

In my case, I need to download a large binary/JSON data file, and Brotli offers significant savings compared to Zopfli. I cannot use a CDN because the file is either too large or the CDN's on-the-fly compression is not suitable for it.

Also, Brotli decoding already ships in most browsers for Content-Encoding: br.

@ricea
Collaborator

ricea commented Jan 22, 2021

@prajaybasu As far as I know, browsers are only shipping the decompression dictionary. Brotli has a separate dictionary needed for compression, which would significantly increase the size of the browser.

It's possible to add Brotli only to DecompressionStream and not to CompressionStream, but I feel it would be confusing.

So currently I am adopting a wait-and-see approach.

@v-python
Author

It will be more confusing to realize that one browser-supported decompression algorithm is not available, and to have to locate a JavaScript decompressor for it and figure out how to use it to work around the fact that it isn't built in. Especially when it is the best compression for the type of data being sent from the server.

@lucacasonato
Member

@ricea Deno would be interested in a brotli option for DecompressionStream and CompressionStream. Even if it is not immediately standardized due to browser pushback, could we at least agree on the label that brotli will use? That way an addition in Deno will not be incompatible with a future browser implementation.

The two real contenders would be br and brotli, I think. The former matches the value brotli has in Content-Encoding, but the latter is clearer to novices.

@ricea
Collaborator

ricea commented Feb 4, 2022

I've been matching the content-encoding header as a rule, so I'm inclined to choose br. Maybe we could have brotli as an alias? I'm not sure how I feel about that.

IIRC, Brotli has a number of customisable settings which we would ultimately want to expose through the API, but as with DEFLATE we can add them later.
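
For illustration, a sketch of the token in use, assuming br is adopted (the source stream is just a placeholder for real brotli bytes):

```js
// Sketch: the format token mirrors the Content-Encoding value 'br'.
const source = new Blob(['placeholder']).stream(); // stand-in for brotli data
const decompressed = source.pipeThrough(new DecompressionStream('br'));
// If a 'brotli' alias were added, it would behave identically:
// source.pipeThrough(new DecompressionStream('brotli'));
```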

@lucacasonato
Member

Sounds good. I'll throw up a tentative spec PR so we have something a little more formal.

@jasnell

jasnell commented Feb 8, 2022

Both Node.js and Cloudflare Workers would also be interested in a brotli option for these. Looking forward to the spec PR.

@devongovett

This would be great. I currently maintain brotli.js, a JavaScript-based brotli decompressor. Personally, I use it to decode WOFF2 fonts as part of fontkit, but it is very widely used by many others as well (1.8M downloads per week on npm). It works well, but the library itself is quite heavy in terms of JS payload size, so it would be much better if I could use the native brotli decoders built into browsers instead.
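
As a sketch of that swap, a helper could prefer the native decoder and fall back to the JS one. The 'br' token is assumed, and decodeBrotli is a hypothetical stand-in for brotli.js's decode function:

```js
// Sketch: prefer a native brotli decoder, fall back to a JS implementation.
// decodeBrotli is a hypothetical stand-in for brotli.js's decode function.
async function decompressBrotli(bytes) {
  try {
    const native = new Blob([bytes])
      .stream()
      .pipeThrough(new DecompressionStream('br')); // assumed future token
    return new Uint8Array(await new Response(native).arrayBuffer());
  } catch {
    // Constructor throws for unsupported format tokens, so we land here.
    return decodeBrotli(bytes); // JS fallback, e.g. from brotli.js
  }
}
```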

@eustas

eustas commented Jun 12, 2022

Browser-wise, it is better to add it as a wrapper around the native library, as is done for gzip. That way its footprint will be nearly zero, since the decoder library is already linked into the browser.
For a pure JS decoder (to enable polyfills) we also have a Java-to-JS-transpiled decoder. For the encoder, it would be nice to use the native library; if not, I agree that WASM is a viable option.

@stwrt
Contributor

stwrt commented Sep 8, 2022

I discussed with some colleagues (who also work on WebKit), and we generally support the idea of adding Brotli to the Compression spec.

@paralin

paralin commented May 10, 2024

I am decompressing some files using DecompressionStream, and being able to decompress brotli would save a lot of space. The browser already decodes Content-Encoding: br, so it should be possible to expose that decoder through DecompressionStream as well.
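
A sketch of what that could look like, assuming a 'br' token is added (the path is a placeholder; the server stores the file raw and sends no Content-Encoding header, so the browser does not also decode it at the network layer):

```js
// Sketch: decode a pre-compressed file in the page, assuming a future
// 'br' format token in DecompressionStream.
const res = await fetch('/data/records.json.br'); // hypothetical asset
const data = await new Response(
  res.body.pipeThrough(new DecompressionStream('br'))
).json();
```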

@annevk
Member

annevk commented Aug 14, 2024

@ricea @saschanaz @smaug---- what do you all think about adding Brotli now? WebKit would like to go for it. @lucacasonato still interested in writing a PR?

@ricea
Collaborator

ricea commented Aug 14, 2024

I'm coming around to the idea of shipping decompression only. Yes, it will confuse people but it provides value.

@annevk
Member

annevk commented Aug 14, 2024

@ricea we would like to add both compression and decompression.

@ricea
Collaborator

ricea commented Aug 15, 2024

Enabling brotli for compression is difficult for Blink because we don't currently ship the compression side of the library and it has a 190KB binary size cost just for the built-in dictionary. Adding anything over 16KB to Chromium requires a good justification.

It's been suggested that we could ship with the built-in dictionary disabled, but that would provide an unsatisfactory experience for our users.

If we can ship decompression first we can hopefully demonstrate demand that will justify the large binary size increase.

@Pomax

Pomax commented Sep 4, 2024

It should be fairly reasonable to argue that those 190KB are fine though? Even if it's just to enable client-side font generation: right now any website that needs a brotli encoder has to load half a megabyte of library, and thanks to per-site caching, users will download such a library at least once for each website that uses one. Realistically, thanks to modern bundling practices, they will download it multiple times for the same website, because folks end up bundling the encoder into their app bundles and then push out updated bundles.

Saving many megabytes in both bandwidth and JS load/parse time for all Blink and Blink-derivative users, by having Blink add 190KB to the binary, sounds like a fairly big win for everyone.

@paralin

paralin commented Sep 4, 2024

190KB seems well worth it if not entirely negligible.

Edit: I personally only care about decompression :)

@Timmmm

Timmmm commented Sep 7, 2024

it has a 190KB binary size cost just for the built-in dictionary

Surely the dictionary has to be present for decompression already?

@pkasting

pkasting commented Sep 7, 2024

it has a 190KB binary size cost just for the built-in dictionary

Surely the dictionary has to be present for decompression already?

#34 (comment)

@pkasting

pkasting commented Sep 7, 2024

190KB seems well worth it if not entirely negligible.

As noted, we define anything under 16KB as negligible. I spent several months working on changes to improve internal APIs and then did not land them because their binary size cost was 20KB. 190KB is a very large chunk. It's not so large that it could never be paid, but it's not something to be given away on a nice-to-have.

Shipping decompression first and re-evaluating would make the tradeoffs clearer.

@lxgr

lxgr commented Sep 8, 2024

Brotli has a separate dictionary needed for compression, which would significantly increase the size of the browser.

Isn't it the same dictionary used for both compression and decompression? If so, how would it make sense to include only decompression, but not compression?

The only large-ish encoder-only data I could find is the dictionary hash table (https://github.com/google/brotli/blob/master/c/enc/dictionary_hash.c), but that looks like it can be generated pretty easily from the dictionary itself, and otherwise also compresses pretty well itself (to <30 KB with gzip, for example).

So where is this 190KB number coming from? Or is decompressing/building the dictionary at install time or at runtime considered infeasible too?

@Timmmm

Timmmm commented Sep 8, 2024

Yeah, a dictionary is a dictionary, surely? Presumably the extra size for compression is because the dictionary is indexed in some way to improve compression speed? Couldn't that index generation be done at runtime, on demand?

@pmarreck

pmarreck commented Sep 8, 2024

@ricea :

Adding anything over 16KB to Chromium requires a good justification.

"There are likely petabytes of waste going across the wire today b/c someone was worried about < 200kb install size while also insisting that compression must be symmetrical lest it confuse ppl. Admittedly I'm reading this issue blind so I might be missing other context, but this feels very penny-wise, pound-foolish. " (from a Reddit comment in a thread about this)

@saschanaz
Member

Mozilla is open to adding decompression support only. There certainly can be developer confusion, but as long as it's feature-detectable we think it's manageable.

(let supportsBrotli = true; try { new DecompressionStream("br") } catch { supportsBrotli = false } is a bit too verbose; I wonder if we should have a .supports() function.)
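
A sketch of that detection pattern today, wrapped in a helper (the .supports() method mentioned above remains hypothetical):

```js
// Sketch: feature-detect a format by constructing a stream; the
// constructor throws a TypeError for format tokens it doesn't support.
function supportsDecompression(format) {
  try {
    new DecompressionStream(format);
    return true;
  } catch {
    return false;
  }
}

if (supportsDecompression('br')) {
  // use the native decoder
} else {
  // fall back to a JS or WASM polyfill
}
```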

@lxgr

lxgr commented Sep 10, 2024

Another thought: Is the encoding dictionary even mandatory for compression? As far as I can tell from the spec, only decoders need to include it.

If that's really the point of contention, presumably it should be possible to just ignore it, at the expense of slightly worse compression performance for some small files.

@ricea
Collaborator

ricea commented Sep 11, 2024

The built-in compression dictionary is not mandatory, but it makes a huge difference for inputs of a few kilobytes or less. Predictability is very important to developers, and if two browsers both claim to support Brotli, developers should be able to expect comparable levels of compression from them.

@pmeenan

pmeenan commented Sep 11, 2024

FWIW, the compression dictionary is mostly seeded with web-facing content that was expected to be compressed in the content-encoding use case (HTML, JS, CSS, etc.). It's not immediately obvious that the use cases for CompressionStream (at least within a browser) would benefit as much from having the default dictionary available (or, optionally, each browser could choose whether to include the dictionary based on the device it is being installed on).

It wouldn't necessarily need to be detectable or something the caller cares about; they'd just get (possibly) slightly less compression in some cases than in others, but the resulting stream would still be smaller than gzip and compatible with existing decoders.

Bonus points if the API is considered for adding compression/decompression options like compression level, large window (at which point brotli would be more appropriate than br, since br implies a specific window-size config), and user-provided dictionaries.

@ricea
Collaborator

ricea commented Sep 12, 2024

Bonus points if the API is considered for adding compression/decompression options like compression level, large window (at which point brotli would be more appropriate than br, since br implies a specific window-size config), and user-provided dictionaries.

I want to design the option bag separately, as many of the options will be common between different compression algorithms.
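
For concreteness, one hypothetical shape such a shared option bag might take; nothing below is specified, and every option name is illustrative only:

```js
// Purely illustrative: no option bag exists in the spec yet, and every
// option name below is hypothetical.
const dict = new Uint8Array(0); // placeholder user-provided dictionary
const cs = new CompressionStream('br', {
  level: 5,          // hypothetical: a knob common to all algorithms
  largeWindow: true, // hypothetical: brotli-specific setting
  dictionary: dict,  // hypothetical: custom dictionary bytes
});
```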

@pmeenan

pmeenan commented Sep 12, 2024

I want to design the option bag separately, as many of the options will be common between different compression algorithms.

Sorry, yes. I just wouldn't want to add brotli without also addressing the option bag at the same time, at a minimum for compression level; it doesn't make much sense to provide algorithm control without also providing some level of control over the settings.

@jesup

jesup commented Sep 27, 2024

Personally I'm in favor of including compression. However, we do have other forms of compression available. It did sound in the breakout session like there are some use cases (see above as well) that explicitly need brotli compression, and without it they will need to download a polyfill, which is expensive.

@jesup

jesup commented Sep 27, 2024

Reading over the notes, I think my position is now: let's get decompression for brotli (and zstd) in, and have a separate issue for adding encoder support, since there isn't consensus yet that it's needed and worth the cost. Google clearly has issues with the cost of including it currently. Perhaps we can get the cost down, as mentioned in the meeting.
https://www.w3.org/2024/09/25-brotli-irc

@nektro

nektro commented Sep 27, 2024

Bun would be interested in supporting this, since we already ship brotli support internally for other APIs.

@Timmmm

Timmmm commented Sep 28, 2024

I think it's still unclear what the costs are. That IRC chat mentioned 190KB for the dictionary hash (the dictionary itself is already present for decompression), and also maybe 400KB for code (but was that about zstd? not super clear). It would be worth answering:

  1. What is the code size increase due to the extra compression code (not including any dictionary hashes)?
  2. Can the dictionary hash be generated from the decompression dictionary at runtime? Is that acceptable?
  3. Can the dictionary hash be stored on disk and loaded on demand? @ricea said "Adding anything over 16KB to Chromium requires a good justification." but I'm not sure if he meant download size, disk space, code size or RAM usage.

I did have a look at the dictionary hash (which is only 64KB, by the way) to see how long it takes to generate at runtime. Unfortunately, it seems the code that generated it in the first place was never open-sourced; at least I couldn't find it. It's possibly the same as this, but I'm not sure: that code is only used if BROTLI_EXPERIMENTAL is defined.

@ricea
Collaborator

ricea commented Sep 28, 2024

You can see the breakdown here (drill into third_party/brotli): https://chrome-supersize.firebaseapp.com/viewer.html?load_url=https%3A%2F%2Fstorage.googleapis.com%2Fchromium-binary-size-trybot-results%2Fandroid-binary-size%2F2024%2F09%2F26%2F2036235%2Fsupersize_diff.sizediff#focus=60

It appears to contain about 280KB of static data and 220KB of code.

Creating the dictionary hash at runtime might be acceptable, as long as it doesn't take a ridiculously long time.

It's mainly disk space we're worried about, so loading on demand doesn't help.

To summarise the discussion at TPAC 2024, we are going to add Brotli to the standard, but Chrome still won't be able to ship it for a while. It appears that Safari is ready to ship and Firefox is also positive on shipping.

@Pomax

Pomax commented Sep 28, 2024

I have to admit I'm a little surprised that Google, who invented Brotli, would have reservations about including it. They invented it for a reason, and being compression for the web is literally it =)

(Not to mention the current Chrome footprint: adding 500KB to a 100+MB product, to save many times that over the lifetime of the browser, using your own technology that you designed for exactly this, seems like a product-manager no-brainer.)

@AlexanderOMara

Maybe a silly question, but: What if the Brotli compression dictionary was compressed with Brotli, which could be decompressed as needed using the decompression dictionary?

@ricea
Collaborator

ricea commented Oct 22, 2024

Maybe a silly question, but: What if the Brotli compression dictionary was compressed with Brotli, which could be decompressed as needed using the decompression dictionary?

I tried this once a long time ago, but the results were disappointing. I don't know why, as it does seem like it ought to work.
