feat(singlejar): Add Log4j plugins cache combiner #22581
@sgowroji what question do I need to answer for "awaiting-user-response" to get removed? @softprops @jwilliams-ocient from #7330; does this help the issue for you?
@stevebarrau Could you please take a look at the failing checks?
Force-pushed from 659623f to d8130cb.
@sgowroji PTAL.
cc @cushon
Ping @cushon
@cushon Can you please take a look?
@hvadehra @meteorcloudy for Blaze we don't need to support log4j, so supporting this isn't a priority for me. I also don't want to stand in the way of progress if you want to support this in Bazel. I left a note in #7330 about the idea of trying to make this kind of thing more pluggable, but there may not be great alternatives to doing it directly in singlejar. If that's something you want to pursue, perhaps the new 'combiner' here could be factored into a separate file and only wired up for Bazel.
src/tools/singlejar/combiners.cc
Outdated
  return nullptr;
}

void *outputEntryFromBuffer(const std::string filename,
Instead of extracting this helper, would it be possible to share implementation with `Concatenator` with composition, similar to what `ManifestCombiner` does?
Composed in e416b51
src/tools/singlejar/combiners.cc
Outdated
@@ -284,3 +295,152 @@ void *ManifestCombiner::OutputEntry(bool compress) {
  concatenator_->Append("\r\n");
  return concatenator_->OutputEntry(compress);
}

template<typename T>
Would it be possible to move `Log4J2PluginDatCombiner` into a separate file?
Extracted in 8e901f0
@cushon I would prefer to author this in Java to reuse the log4j merger logic from the existing implementation. Ideally, I agree with you: if we could provide a set of custom combiners as configuration instead of adding complexity and features to singlejar, that would be very nice. Where could we start to pry open singlejar and allow external combiners to be configured? In the short term, if I address the comments, is this something that could live in singlejar until we roll out a configurability mechanism?
@shs96c if this ends up being authorable in Java, would https://github.com/bazel-contrib/rules_jvm be a good home for this log4j jar combiner extension?
Force-pushed from 3a95585 to 6ae3fb8.
@cushon PTAL.
Friendly ping @cushon. Ideally I would like this to make it into Bazel 8 and a 7 point release. |
Friendly ping @cushon. |
Sorry for the delay. Thanks for refactoring the combiner into a separate file, I'm fine with the current approach. (Internally we use a separate singlejar entry point with additional combiners, we may not wire up the new combiner there, which is fine.) @hvadehra do you want to review for Bazel? |
My c++ is rusty but overall LGTM
the test data was generated by a python script, happy to commit it as well
Please do, and it would be great if we could generate the jar/dat outputs on the fly for the tests (perhaps in a genrule) rather than check them in.
I am not sure I understand the ask here. If the ask is for the two input jars to be generated by a genrule: these jar files are not stable because of zip data, IIUC, and they would be cached. Having them on disk circumvents this. If the ask is that Bazel does not want to suffer an xz kind of scenario, we can audit the Python code and generate those jars in a trusted context (or directly inspect the JARs; each contains a single file).
The PR LGTM as is, not having to check in semi-inscrutable inputs would have just been nicer.
I'd like to get @pzembrod's input on the c++ code before importing.
}

uint32_t readInt(std::istringstream &stream) {
  uint32_t values;
Curious nit: why plural "values"?
No particular reason, updated in c80889e.
std::string readUTFString(std::istringstream &stream) {
  uint16_t length;
  stream.read(reinterpret_cast<char *>(&length), sizeof(length));
  length = swapByteOrder(length);  // Convert to host byte order
Since I totally lack context here - and a future code maintainer may do so, too: Could you add a sentence or two more as to why the byte order needs to be swapped? And maybe the best place for that comment would be at the swapByteOrder() function itself?
I'm guessing it is because JVM class files are big-endian, and x86 and most ARMs are little-endian?
I guess there is not much big-endian hardware around these days, but I may be wrong, it's a long time since last I encountered endianness as a topic.
I added a comment to that effect in c80889e.
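To illustrate the byte-order point raised in this thread, here is a minimal sketch of reading Java-style big-endian values in C++. It is not the code from this PR: instead of reading raw bytes and calling swapByteOrder(), it assembles each value byte by byte, which gives the same result on little-endian and big-endian hosts. The helper names (ReadBigEndian16, ReadBigEndian32, ReadLengthPrefixedString) are hypothetical, and the sketch treats the string payload as plain bytes, which matches Java's modified UTF-8 only for ASCII content.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads an unsigned 16-bit big-endian value.
// Building the value from individual bytes is independent of host endianness.
inline uint16_t ReadBigEndian16(std::istream &in) {
  unsigned char bytes[2] = {0, 0};
  in.read(reinterpret_cast<char *>(bytes), sizeof(bytes));
  return static_cast<uint16_t>((bytes[0] << 8) | bytes[1]);
}

// Hypothetical helper: reads an unsigned 32-bit big-endian value.
inline uint32_t ReadBigEndian32(std::istream &in) {
  unsigned char bytes[4] = {0, 0, 0, 0};
  in.read(reinterpret_cast<char *>(bytes), sizeof(bytes));
  return (static_cast<uint32_t>(bytes[0]) << 24) |
         (static_cast<uint32_t>(bytes[1]) << 16) |
         (static_cast<uint32_t>(bytes[2]) << 8) |
         static_cast<uint32_t>(bytes[3]);
}

// Hypothetical helper: reads a 2-byte big-endian length followed by that
// many bytes of string data.
inline std::string ReadLengthPrefixedString(std::istream &in) {
  const uint16_t length = ReadBigEndian16(in);
  std::string out(length, '\0');
  in.read(&out[0], length);
  // Sketch only: a real combiner would report truncated input as an error.
  assert(in.gcount() == static_cast<std::streamsize>(length));
  return out;
}
```

A caller could wrap the bytes of a cache file in a std::istringstream and pass it to these helpers, much as the readInt and readUTFString functions under review do.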
…uires compiler flag '/std:c++17'
Force-pushed from 6021715 to c80889e.
Ping @cushon; IIUC we got approval from the Bazel and C++ side for this change.
Friendly ping @cushon
It appears that importing this will need a bit of work, I'll do that myself. |
Can this be merged into 7.5.0? (Or whichever 7.x release it can go into.)
That shouldn't be necessary. Once we have a java_tools / rules_java release with this change, one can just update those deps and stay on Bazel 7. |
@hvadehra I'm looking at java_tools but it's not clear how to map the release back to a specific Bazel commit? |
The java_tools releases have their provenance information embedded in the archives. For example, if you look at https://github.com/bazelbuild/java_tools/releases/tag/java_v13.9, download any one of the zip archives and look at the top-level
The fix from this PR has not made it into a java_tools / rules_java release just yet. You can check the state at #24696 / bazelbuild/java_tools#93
Cool, thanks.
In singlejar, add support for combining Log4j2 plugin cache files.
Log4j2 plugins are Java annotations collected by a compiler plugin into .dat files so that the Log4j2 runtime can find them quickly. With a correct dependency on a `java_plugin`, Bazel already runs the Java plugin compiler correctly. The behavior is correct on the `bazel run` JAR, but not on `_deploy.jar` fat jars.
The silent clobbering of files in general, and the example of these Log4j2 .dat files in particular, is discussed in #7330.
I tested this fix in my project using Bazel 7.0.2, compiling `//src:java_tools_prebuilt.zip`, and overriding `@remote_java_tools_darwin_arm64` and `@remote_java_tools_linux`.