[SPARK-40516] Add Apache Spark 3.3.0 Dockerfile #2
After this patch is merged, we can update the URL to
Meanwhile, we will add more CI and a generation script in follow-up PRs.
Review notes: you can review each group of changes separately:
- SparkR-specific change
- PySpark-specific change
- PySpark and SparkR-specific change
with:
  spark: 3.3.0
  scala: 2.12
  java: 11
Will we have images for java: 8?
In the initial PR (this PR), there won't be. The main considerations are as follows:
- https://hub.docker.com/r/apache/spark currently only has Java 11.
- DOI's review of PRs for new images is relatively slow. Our top priority is to get the first image's Dockerfile through review. After that, reviews of updates should be quick, only 2-3 days.
- As planned, we will add scripts in a follow-up to automatically generate Dockerfiles for different versions (such as Java/Scala/Spark versions).
In the future, we will consider adding all Java versions that Spark supports (depending, of course, on community demand).
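To illustrate the kind of generation script mentioned above, here is a minimal sketch, not the script planned for the follow-up PRs. The template text, the `render_variants` helper, and the `<spark>/scala<scala>-java<java>` naming scheme are all hypothetical and chosen only for illustration; the actual Dockerfile from this PR is not reproduced.

```python
from itertools import product
from string import Template

# Hypothetical placeholder template; this is NOT the Dockerfile added in this PR.
DOCKERFILE_TEMPLATE = Template(
    "# Illustrative placeholder, not the Dockerfile from this PR\n"
    "ARG JAVA_VERSION=$java\n"
    "ENV SPARK_VERSION=$spark\n"
    "ENV SCALA_VERSION=$scala\n"
)


def render_variants(spark_versions, scala_versions, java_versions):
    """Return {variant_name: dockerfile_text} for every version combination."""
    variants = {}
    for spark, scala, java in product(spark_versions, scala_versions, java_versions):
        # Hypothetical naming scheme for the output location of each variant.
        name = f"{spark}/scala{scala}-java{java}"
        variants[name] = DOCKERFILE_TEMPLATE.substitute(
            spark=spark, scala=scala, java=java
        )
    return variants


if __name__ == "__main__":
    # Print the variant names for the single combination used in this PR.
    for name in render_variants(["3.3.0"], ["2.12"], ["11"]):
        print(name)
```

Adding Java 8 later would then only mean passing `["8", "11"]` as the Java versions, assuming the template itself needs no per-version changes.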
Got it. I asked only because Java 8 seems faster than 11/17 for now.
@HyukjinKwon @zhengruifeng Thanks, I will merge this PR soon!
A somewhat random question: what are the Apache requirements for the LICENSE file and copyright notices for Dockerfiles, especially if we are going to actually release the images? Sorry if I missed it in the mailing list discussion.
@tgravescs Thanks Tom, it's a very important reminder! Just like apache/spark, all Dockerfiles are under
What we need to do is just follow https://www.apache.org/licenses/LICENSE-2.0#apply:
OK, I'm mostly curious about the actual Docker image publishing: do we need special NOTICE-binary files or anything else to be able to publish properly?
@tgravescs Judging from the existing Apache repos, it doesn't include
BTW, "the image" might have two meanings:
### What changes were proposed in this pull request?
This patch adds LICENSE and NOTICE:
- LICENSE: https://www.apache.org/licenses/LICENSE-2.0.txt
- NOTICE: https://github.com/apache/spark/blob/master/NOTICE

### Why are the changes needed?
https://www.apache.org/licenses/LICENSE-2.0#apply
See also: #2 (comment)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
No need

Closes #6 from Yikun/SPARK-40754.

Authored-by: Yikun Jiang <yikunkero@gmail.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
What changes were proposed in this pull request?
This patch adds Apache Spark 3.3.0 Dockerfile:
Why are the changes needed?
This is needed for the Docker Official Image.
See also: https://docs.google.com/document/d/1nN-pKuvt-amUcrkTvYAQ-bJBgtsWb9nAkNoVNRM2S2o
Does this PR introduce any user-facing change?
No
How was this patch tested?
The action won't be triggered until the workflow is merged to the default branch, so I can only test it in my local repo: