Better font-finding heuristics, with a shortcut and a few caches. #84
There was some concern raised about using the file-sorting based shortcut. I've just had a look at instead baking the fontfile sort-info into the precompilation step, and that seems to work a treat.
2d911ef to dcaab0e
Some fonts may have multiple styles that count as "regular", in which case the chosen variant will just be the first seen (whichever that is). We might as well be deliberate about this, and go for "more regular" (like regular) styles over "less regular" ones (like book and medium).
It's entirely possible that multiple forms of a font may be present on one system, for instance an OTF and also a PFB from texlive. We care about this because different formats have different levels of capability, and we don't want to choose a worse one simply because it's the first format considered. As such, a format component is added to the font scoring heuristic.
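The two tie-breaks described in these commit messages could be sketched roughly as follows. This is an illustration in Python, not the package's code: the names `STYLE_PREFERENCE`, `FORMAT_PREFERENCE`, `rank_bonus`, and `tiebreak_key` are hypothetical, and the relative order of `book` and `medium` is an arbitrary assumption (the text only says both are "less regular" than `regular`).

```python
# Hypothetical preference tables: earlier entries are preferred.
# Only "regular before book/medium" and "otf before pfb" come from the text;
# the order of "book" versus "medium" is an arbitrary assumption.
STYLE_PREFERENCE = ["regular", "book", "medium"]
FORMAT_PREFERENCE = ["otf", "pfb"]


def rank_bonus(value, preference):
    """Return a larger bonus the earlier `value` appears in `preference`."""
    value = value.lower()
    if value in preference:
        return len(preference) - preference.index(value)
    return 0  # unknown values get no bonus


def tiebreak_key(style, extension):
    """Compare by style regularity first, then by file format."""
    return (
        rank_bonus(style, STYLE_PREFERENCE),
        rank_bonus(extension, FORMAT_PREFERENCE),
    )


# Candidates that all count as a "regular" match for the same family.
candidates = [
    ("Medium", "otf"),
    ("Regular", "pfb"),
    ("Regular", "otf"),
    ("Book", "otf"),
]
best = max(candidates, key=lambda c: tiebreak_key(*c))
```

With these tables the choice no longer depends on which variant happens to be seen first: `("Regular", "otf")` wins regardless of the order of `candidates`.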
Codecov Report

| | master | #84 | +/- |
|---|---|---|---|
| Coverage | 95.26% | 95.53% | +0.26% |
| Files | 6 | 6 | |
| Lines | 317 | 336 | +19 |
| Hits | 302 | 321 | +19 |
| Misses | 15 | 15 | |
dcaab0e to 3ce7edf
If we're caching paths at precompilation time, I think that will pose issues for relocatability. Maybe this should be done at init rather than at precompilation. It could be made async so that it doesn't delay package loading, but then some locking mechanism would need to be in place for usage.
Relocatability does make this interesting, but I think we need to be clearer on what the potential "issues" might be. In this case, the primary consequence of a precompiled value that doesn't match the actual system is stale cache entries which will be replaced/ignored at runtime. I.e. this doesn't affect correctness, but might affect some performance considerations. Do you know of any precompile result relocation happening in practice? It could help to have a more concrete example to discuss this in the context of.
Yes, the issue would be that the fonts baked into the dict probably do not transfer to another system, so you wouldn't gain much speed there.
At work, we use sysimages compiled on CI runners and transferred to other runners. There we run into relocatability issues all the time.
Right. So it sounds like what we really want is a per-system font cache file? Would
@jkrumbiegel any further thoughts on this?
Not really further, I also think using a scratch space might be the reasonable way to go. Store a file with all the names and paths there and only repopulate the info when files change.
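To make the scratch-space idea concrete, here is a rough Python sketch of a per-system index file that only re-reads fonts whose modification time has changed. Everything in it is an assumption for illustration: the function name `refresh_font_index`, the JSON layout, the use of mtimes as the change signal, and the `describe` callback, which stands in for the expensive step of opening a font to read its names.

```python
import json
import os
import tempfile
from pathlib import Path

FONT_EXTENSIONS = {".otf", ".ttf", ".pfb"}  # assumed set, for illustration


def refresh_font_index(font_dirs, cache_file, describe):
    """Return (index, reparsed): font info per path, and how many files
    had to be re-read because they were new or had a different mtime."""
    cache_file = Path(cache_file)
    try:
        cached = json.loads(cache_file.read_text())
    except (OSError, json.JSONDecodeError):
        cached = {}  # missing or corrupt cache file: start from scratch
    if not isinstance(cached, dict):
        cached = {}

    index, reparsed = {}, 0
    for font_dir in font_dirs:
        for root, _dirs, files in os.walk(font_dir):
            for name in files:
                path = Path(root) / name
                if path.suffix.lower() not in FONT_EXTENSIONS:
                    continue
                mtime = path.stat().st_mtime
                entry = cached.get(str(path))
                if entry is None or entry["mtime"] != mtime:
                    # Only here do we pay for the expensive read.
                    entry = {"mtime": mtime, "info": describe(path)}
                    reparsed += 1
                index[str(path)] = entry

    if index != cached:  # also drops entries for deleted files
        cache_file.parent.mkdir(parents=True, exist_ok=True)
        cache_file.write_text(json.dumps(index))
    return index, reparsed


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    fonts = Path(tmp) / "fonts"
    fonts.mkdir()
    font_file = fonts / "Example-Regular.otf"
    font_file.write_bytes(b"")
    cache = Path(tmp) / "scratch" / "font_index.json"

    def describe(path):
        return {"stem": path.stem}  # stand-in for reading family/style names

    _, first = refresh_font_index([fonts], cache, describe)
    _, second = refresh_font_index([fonts], cache, describe)
    os.utime(font_file, (0, 0))  # simulate the font file changing
    _, third = refresh_font_index([fonts], cache, describe)
```

Because the file lives on the machine that uses it, a relocated precompile cache or sysimage would not carry stale paths with it; the first run on a new system simply rebuilds the index.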
I initially started looking at this package when I found that it doesn't always look in the right directories for fonts (see #82). However, I recently found that the heuristic for font finding itself is both slow (#67) and a bit dodgy (#83).
So, I've gone through the font-finding code and made a few changes:
- (… `plex` in `ibm plex sans italic` is worth more than `sans`)
- More regular styles are preferred (`regular` over `medium`)
- A format component is added to the score (`otf` over `pfb`)
- Introduced a considered shortcut when scoring each font file:
  - The list of matching font files is pre-sorted according to matches of the search string in the font file name
  - We calculate the maximum possible family and style score given the search string
  - We return the current best font early when:
    - We have found a font that attains the maximum possible score
    - We have seen more than twice as many fonts as the position of the last font with the best score seen so far

From the tests that I've performed locally, this batch of changes results in faster, better initial lookups, and faster (again) subsequent lookups.
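The shortcut can be sketched roughly as below. This is a simplified Python illustration under stated assumptions, not the package's implementation: `filename_hits`, `toy_score`, `find_best`, and `lookup` are hypothetical names, the font list is made up, and the toy score just counts query words found in the family and style names rather than using the real family/style/format heuristic.

```python
def filename_hits(query, filename):
    """Number of query words that occur in the file name (used for pre-sorting)."""
    name = filename.lower()
    return sum(word in name for word in query.lower().split())


def toy_score(query, font):
    """Toy stand-in for the real score: query words found in family + style."""
    names = f"{font['family']} {font['style']}".lower().split()
    return sum(word in names for word in query.lower().split())


def find_best(candidates, score, max_possible):
    """Scan pre-sorted candidates, returning (best, number_scored).

    Stops early when a candidate reaches `max_possible`, or when more than
    twice as many candidates have been seen as the position of the best one.
    """
    best, best_score, best_pos = None, None, 0
    scored = 0
    for pos, candidate in enumerate(candidates, start=1):
        current = score(candidate)
        scored += 1
        if best_score is None or current > best_score:
            best, best_score, best_pos = candidate, current, pos
        if best_score >= max_possible:
            break  # nothing can beat a perfect score
        if pos > 2 * best_pos:
            break  # no improvement for a long stretch of likely candidates
    return best, scored


def lookup(query, fonts):
    # Most promising file names first; sorted() is stable, so ties keep order.
    ordered = sorted(fonts, key=lambda f: filename_hits(query, f["file"]), reverse=True)
    max_possible = len(query.split())  # every query word matched
    return find_best(ordered, lambda f: toy_score(query, f), max_possible)


# Made-up font list for illustration.
FONTS = [
    {"file": "DejaVuSans.ttf", "family": "DejaVu Sans", "style": "Book"},
    {"file": "IBMPlexSans-Italic.otf", "family": "IBM Plex Sans", "style": "Italic"},
    {"file": "IBMPlexSans-Regular.otf", "family": "IBM Plex Sans", "style": "Regular"},
    {"file": "NotoSerif-Italic.ttf", "family": "Noto Serif", "style": "Italic"},
]

exact, exact_scored = lookup("ibm plex sans italic", FONTS)
partial, partial_scored = lookup("plex mono bold", FONTS)
```

In the first lookup the pre-sort puts the exact match first and it reaches the maximum score, so only one font is scored. In the second, no font can reach the maximum, and the "twice as many" rule stops the scan before the last file. The second rule trades completeness for speed, which is why adversarial testing of the ordering matters.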
Here are some test results on my machine:
Subsequent identical searches take ~0.0002s.
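The near-instant repeat lookups are what one would expect from memoising results per search string. A minimal Python sketch of that idea follows; the source does not say how the caches are implemented, so `slow_lookup`, `cached_lookup`, and the returned path are all made up for illustration.

```python
from functools import lru_cache

CALLS = {"slow": 0}  # counts how often the expensive search actually runs


def slow_lookup(query):
    """Stand-in for the full font search; the returned path is made up."""
    CALLS["slow"] += 1
    return "/fonts/" + query.replace(" ", "-") + ".otf"


@lru_cache(maxsize=None)
def cached_lookup(query):
    """Memoised wrapper: identical queries are answered from the cache."""
    return slow_lookup(query)


first = cached_lookup("ibm plex sans")
second = cached_lookup("ibm plex sans")  # served from the cache
```

A cache like this needs invalidating if fonts are installed or removed while the process is running.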
If more people could perform adversarial (but not pathological) tests with this scheme, that would be much appreciated.