Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(singlejar): Add Log4j plugins cache combiner #22581

Conversation

stevebarrau
Copy link
Contributor

@stevebarrau stevebarrau commented May 29, 2024

In singlejar, add support for combining Log4j2 plugins cache file.

Log4j2 plugins are Java annotations collected by a compiler plugin into .dat files for Log4j2 runtime to find them fast. With correct dependency on a java_plugin, Bazel already runs the java plugin compiler correctly. The behavior is correct on bazel run JAR, but not on _deploy.jar fat jars.

The silent clobbering of files in general, and the example of these Log4j2 .dat files in particular, is discussed in #7330.

I tested this fix in my project using Bazel 7.0.2, compiling //src:java_tools_prebuilt.zip, and overriding @remote_java_tools_darwin_arm64 and @remote_java_tools_linux.

@stevebarrau stevebarrau changed the title feat(singlejar): Add Log4j2Plugins combiner feat(singlejar): Add Log4j plugins cache combiner May 29, 2024
@github-actions github-actions bot added team-Rules-Java Issues for Java rules awaiting-review PR is awaiting review from an assigned reviewer labels May 29, 2024
@sgowroji sgowroji added awaiting-user-response Awaiting a response from the author and removed awaiting-review PR is awaiting review from an assigned reviewer labels May 29, 2024
@stevebarrau
Copy link
Contributor Author

@sgowroji what question do I need to answer for "awaiting-user-response" to get removed?

@softprops @jwilliams-ocient from #7330; does this help the issue for you?

@sgowroji
Copy link
Member

@stevebarrau Could you please take a look at the failing checks?

@stevebarrau stevebarrau force-pushed the singlejar-add-log4j-plugin-cache-combiner branch from 659623f to d8130cb Compare May 30, 2024 09:06
@stevebarrau
Copy link
Contributor Author

@sgowroji PTAL.

@sgowroji sgowroji added awaiting-review PR is awaiting review from an assigned reviewer and removed awaiting-user-response Awaiting a response from the author labels May 30, 2024
@hvadehra
Copy link
Member

cc @cushon

@stevebarrau
Copy link
Contributor Author

Ping @cushon

@meteorcloudy meteorcloudy requested a review from cushon August 20, 2024 17:48
@meteorcloudy
Copy link
Member

@cushon Can you please take a look?

@cushon
Copy link
Contributor

cushon commented Aug 20, 2024

@hvadehra @meteorcloudy for Blaze we don't need to support log4j, so supporting this isn't a priority for me. I also don't want to stand in the way of progress if you want to support this in Bazel. I left a note in #7330 about the idea of trying to make this kind of thing more pluggable, but there may not be great alternatives to doing it directly in singlejar. If that's something you want to pursue, perhaps the new 'combiner' here could be factored into a separate file and only wired up for Bazel.

return nullptr;
}

void *outputEntryFromBuffer(const std::string filename,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of extracting this helper, would it be possible to share implementation with Concatenator with composition, similar to what ManifestCombiner does?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Composed in e416b51

@@ -284,3 +295,152 @@ void *ManifestCombiner::OutputEntry(bool compress) {
concatenator_->Append("\r\n");
return concatenator_->OutputEntry(compress);
}

template<typename T>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to move Log4J2PluginDatCombiner into a separate file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted in 8e901f0

@stevebarrau
Copy link
Contributor Author

@cushon I would prefer to author this in Java to reuse the log4j merger logic from existing implementation. Ideally I agree with you if we could provide a set of custom combiners as configuration instead of having to add complexity/features in singlejar this would be very nice.

Where could we start to pry open singlejar and allow external combiners to be configured?

In the short term, if I address the comments, is this something that could live in singlejar until we rollout a configurability mechanism?

@stevebarrau
Copy link
Contributor Author

@shs96c if this ends up being authorable in Java, would https://github.com/bazel-contrib/rules_jvm be a good home for this log4j jar combiner extension?

@shs96c
Copy link
Contributor

shs96c commented Aug 21, 2024

contrib_rules_jvm is too high a level. You might want this in rules_jvm_external so maven artifacts we create are correct, but the only way for this to work with a deploy jar is to have this in rules_Java or wherever singlejar lives.

@stevebarrau stevebarrau force-pushed the singlejar-add-log4j-plugin-cache-combiner branch 2 times, most recently from 3a95585 to 6ae3fb8 Compare October 1, 2024 13:19
@stevebarrau stevebarrau requested a review from cushon October 1, 2024 15:43
@stevebarrau
Copy link
Contributor Author

@cushon PTAL.

@stevebarrau
Copy link
Contributor Author

stevebarrau commented Oct 10, 2024

Friendly ping @cushon. Ideally I would like this to make it into Bazel 8 and a 7 point release.

@stevebarrau
Copy link
Contributor Author

Friendly ping @cushon.

@cushon cushon requested a review from hvadehra November 12, 2024 18:10
@cushon
Copy link
Contributor

cushon commented Nov 12, 2024

Sorry for the delay.

Thanks for refactoring the combiner into a separate file, I'm fine with the current approach. (Internally we use a separate singlejar entry point with additional combiners, we may not wire up the new combiner there, which is fine.)

@hvadehra do you want to review for Bazel?

Copy link
Member

@hvadehra hvadehra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My c++ is rusty but overall LGTM

the test data was generated by a python script, happy to commit it as well

Please do, and it would be great if we could generate the jar/dat outputs on the fly for the tests (perhaps in a genrule) rather than check them in.

@hvadehra hvadehra requested a review from pzembrod November 13, 2024 08:19
@stevebarrau
Copy link
Contributor Author

I am not sure I understand the ask here.

If the ask is for the 2 input jar to we generated by a genrule: these jar files are not stable because of zip data IIUC and they would be cached. Having them on disk circumvents this.

If the ask is Bazel does not want to suffer a zx kind of scenario, we can audit the python code and generate those jars in a trusted context (or directly inspect the JARs, they are a single file).

Copy link
Member

@hvadehra hvadehra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR LGTM as is, not having to check in semi-inscrutable inputs would have just been nicer.

I'd like to get @pzembrod's input on the c++ code before importing.

}

uint32_t readInt(std::istringstream &stream) {
uint32_t values;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious nit: why plural "values"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular reason, updated in c80889e.

std::string readUTFString(std::istringstream &stream) {
uint16_t length;
stream.read(reinterpret_cast<char *>(&length), sizeof(length));
length = swapByteOrder(length); // Convert to host byte order
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I totally lack context here - and a future code maintainer may do so, too: Could you add a sentence or two more as to why the byte order needs to be swapped? And maybe the best place for that comment would be at the swapByteOrder() function itself?
I'm guessing it is because JVM class files are big-endian, and x86 and most ARMs are little-endian?
I guess there is not much big-endian hardware around these days, but I may be wrong, it's a long time since last I encountered endianness as a topic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment to that effect in c80889e.

@stevebarrau stevebarrau force-pushed the singlejar-add-log4j-plugin-cache-combiner branch from 6021715 to c80889e Compare November 20, 2024 11:21
@stevebarrau
Copy link
Contributor Author

Ping @cushon; IIUC we got approval from the Bazel and C++ side for this change.

@stevebarrau
Copy link
Contributor Author

Friendly ping @cushon

@cushon
Copy link
Contributor

cushon commented Dec 2, 2024

Friendly ping @cushon

I'm deferring to @hvadehra to review

@cushon cushon removed their request for review December 2, 2024 17:40
@hvadehra hvadehra added awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally and removed awaiting-review PR is awaiting review from an assigned reviewer awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally labels Dec 3, 2024
@hvadehra hvadehra self-assigned this Dec 3, 2024
@hvadehra
Copy link
Member

hvadehra commented Dec 3, 2024

It appears that importing this will need a bit of work, I'll do that myself.

@copybara-service copybara-service bot closed this in cd4580f Dec 9, 2024
@fzakaria
Copy link
Contributor

fzakaria commented Dec 9, 2024

Can this be merged into 7.5.0 ? (or whatever a 7.X release it can go into)

@hvadehra
Copy link
Member

Can this be merged into 7.5.0 ? (or whatever a 7.X release it can go into)

That shouldn't be necessary. Once we have a java_tools / rules_java release with this change, one can just update those deps and stay on Bazel 7.

@fzakaria
Copy link
Contributor

fzakaria commented Jan 8, 2025

@hvadehra I'm looking at java_tools but it's not clear how to map the release back to a specific Bazel commit?
I"m trying to prove to myself what version of rules_java has this fix.

@hvadehra
Copy link
Member

hvadehra commented Jan 9, 2025

The java_tools releases have their provenance information embedded in the archives. For eg: if you look at https://github.com/bazelbuild/java_tools/releases/tag/java_v13.9 , download any one of the zip archives and look at the top-level README file inside.

The fix from this PR has not made it into a java_tools / rules_java release just yet. You can check the state at #24696 / bazelbuild/java_tools#93

@fzakaria
Copy link
Contributor

fzakaria commented Jan 9, 2025

Cool thanks;
I saw there are tags for the releases -- maybe I'l open a suggestion if it's possible to commit to the repo the provenance so that it's browsable in the GitHub repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-Rules-Java Issues for Java rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants