Push images to GHCR for release-1.9 #2491

saileshd1402 · 2025-03-09T20:12:07Z

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in Fixes #<issue number>, #<issue number>, ... format, will close the issue(s) when PR gets merged):
Fixes #

Checklist:

Docs included if any changes are user facing

juliusvonkohout · 2025-03-10T16:10:40Z

/ok-to-test
you can use /retest to trigger the ci/cd again

juliusvonkohout · 2025-03-10T16:13:09Z

Please also sign your commits according to the DCO

# Sometimes we forget to sign commits and have to rebase
# sign the latest X commits with HEAD~X
git rebase --signoff HEAD~1
# You have to force push afterwards
git push -f

And please make sure that you sign with the email that is in the signing key that you have uploaded to GitHub.

juliusvonkohout · 2025-03-10T16:13:44Z

Should this PR not be against the master branch @andreyvelich and then cherry picked to the release branch?

andreyvelich · 2025-03-10T16:48:47Z

> Should this PR not be against the master branch @andreyvelich and then cherry picked to the release branch?

You are right. We removed only the Training Operator code from the master branch.

@saileshd1402 Please update the images for Kubeflow Trainer in the master branch, but for Training Operator images you need to create a separate PR to update the release-1.9 branch.

coveralls · 2025-03-10T16:52:23Z

Pull Request Test Coverage Report for Build 13823612117

Details

0 of 0 changed or added relevant lines in 0 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage remained the same at 100.0%

Totals
Change from base Build 13635027331:	0.0%
Covered Lines:	85
Relevant Lines:	85

💛 - Coveralls

saileshd1402 · 2025-03-10T17:07:40Z

> > Should this PR not be against the master branch @andreyvelich and then cherry picked to the release branch?
>
> You are right. We removed only the Training Operator code from the master branch.
>
> @saileshd1402 Please update the images for Kubeflow Trainer in the master branch, but for Training Operator images you need to create a separate PR to update the release-1.9 branch.

That's what this PR is for, correct? I thought this PR would be for 1.9 only. Should I do the same in master as well?

andreyvelich

Thank you for this @saileshd1402!

andreyvelich · 2025-03-11T11:41:28Z

.github/workflows/build-and-publish-images.yaml

+        id: publish
+        uses: ./.github/workflows/template-publish-image
+        with:
+          image: ghcr.io/kubeflow/${{ inputs.component-name }}


I think we discussed this here: #2455 (comment). Can we keep the /trainer/ prefix in the image name?

For v1.9, isn't it a good idea to keep training-operator as the prefix instead of trainer?

Actually, great point!
Yes, let's use ghcr.io/kubeflow/training-operator/...

From the documentation, it looks like GHCR only allows pushing to a path that matches the repository name (i.e. ghcr.io/kubeflow/trainer). But I think there is no way to test publishing to GHCR with either training-operator or trainer until the PR is merged. Should I keep it as trainer?

andreyvelich · 2025-03-11T11:43:14Z

.github/workflows/build-and-publish-images.yaml

@@ -48,7 +48,18 @@ jobs:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

-      - name: Publish Component ${{ inputs.component-name }}
+      - name: GHCR Login


Can you also update the image in the manifests: https://github.com/kubeflow/trainer/blob/release-1.9/manifests/overlays/standalone/kustomization.yaml

Should we update this in separate PR? I don't think I'm able to push the images from CI without it being merged first

Don't we publish images only when the PR is merged?
I thought on pull requests we only verify the build.

Or do you mean the integration tests are failing due to invalid images in the manifests?

Yes, we publish only on merge; that's why I can't tell the exact image tag that will be published. So what should I put in the manifest?
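The behaviour described in this thread (the build is verified on pull requests, images are pushed only after merge) is commonly expressed by gating the push on the triggering event. The following is a minimal sketch, assuming the widely used docker/login-action and docker/build-push-action. The workflow name, the Dockerfile path, and the image reference are illustrative placeholders and are not taken from this repository's template-publish-image action.

```yaml
name: build-on-pr-push-on-merge   # hypothetical workflow name
on:
  pull_request:
    branches: [release-1.9]
  push:
    branches: [release-1.9]

permissions:
  contents: read
  packages: write   # needed for GITHUB_TOKEN to push to ghcr.io

jobs:
  image:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3

      # Log in only for post-merge runs; PR runs never push.
      - name: GHCR login (merge builds only)
        if: github.event_name == 'push'
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build, and push only after merge
        uses: docker/build-push-action@v6
        with:
          context: .
          file: path/to/Dockerfile   # placeholder path
          # Evaluates to false on pull_request, so the image is built but not pushed.
          push: ${{ github.event_name == 'push' }}
          tags: ghcr.io/kubeflow/training/training-operator:${{ github.sha }}
```

With this shape, a pull request can only show that the image builds. Whether the registry accepts the push is visible only in the post-merge run, which is the limitation raised above.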

tenzen-y · 2025-03-12T22:29:38Z

.github/workflows/publish-core-images.yaml

-          - component-name: training-operator-v2
-            dockerfile: cmd/training-operator.v2alpha1/Dockerfile
-            platforms: linux/amd64,linux/arm64,linux/ppc64le
-            tag-prefix: v2alpha1
-          - component-name: model-initializer-v2
-            dockerfile: cmd/initializer_v2/model/Dockerfile
-            platforms: linux/amd64,linux/arm64
-            tag-prefix: v2
-          - component-name: dataset-initializer-v2
-            dockerfile: cmd/initializer_v2/dataset/Dockerfile
-            platforms: linux/amd64,linux/arm64
-            tag-prefix: v2


I might be missing something. Why do we remove these?

I thought the idea was to not include v2 images in the 1.9 branch

But we are publishing the v2 manager as part of the KF release, as we can see here: https://github.com/kubeflow/manifests/tree/master/apps/training-operator/upstream/v2

@andreyvelich Do you want to remove v2 manifests from KF v1.9 release?

Yes, I think since they are not stable, I prefer to remove them.
We don't have e2e tests or a working example in the release branch.

SGTM
@juliusvonkohout we are considering removing https://github.com/kubeflow/manifests/tree/master/apps/training-operator/upstream/v2 from KF 1.9

V2 is not yet installed by default in Kubeflow Platform 1.9.1 or 1.10.0, we are still on V1. So there is no harm in having them there. Especially once I add integration tests for v2 before enabling v2 by default, we need the v2 manifests there anyway.

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

andreyvelich · 2025-03-13T12:41:45Z

.github/workflows/build-and-publish-images.yaml

+        id: publish-ghcr
+        uses: ./.github/workflows/template-publish-image
+        with:
+          image: ghcr.io/kubeflow/trainer/${{ inputs.component-name }}


We can't use the training-operator prefix since we renamed the repo, right?

What about this image, can we re-use it?
https://github.com/orgs/kubeflow/packages/container/package/training%2Ftraining-operator

> We can't use the training-operator prefix since we renamed the repo, right?

Yes, from what I understand, but I think we can't check until the PR is merged.

> What about this image, can we re-use it?

I can make the changes, but I'm not sure how to test whether it will work.

We can try to merge it and see if the post-merge job succeeds.
We skip the push action in the PR.

Let's try the existing repo and see if that works:

ghcr.io/kubeflow/training/training-operator:<SHA>
ghcr.io/kubeflow/training/storage-initializer:<SHA>

Let's also use v1.9 as the tag prefix, since that makes it easier for users to understand which version of Training Operator it is.

@kubeflow/wg-training-leads @thesuperzapper @juliusvonkohout Any objections ?

ghcr.io/kubeflow/training/training-operator:v1.9.1
ghcr.io/kubeflow/training/storage-initializer:v1.9.1

or

ghcr.io/kubeflow/training/training-operator:SHA
ghcr.io/kubeflow/training/storage-initializer:SHA

Makes sense, but I am not sure how you want to mix them.

And yes, speed is more important if I am supposed to release RC.3 tomorrow.

Apologies for the delay. I have updated the prefix as suggested. PTAL.
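To make the two tagging options from this thread concrete, here is a small sketch that only prints the candidate references for each component. It assumes the `ghcr.io/kubeflow/training/` prefix proposed above. The workflow name and job layout are hypothetical, and `v1.9.1` is used purely because it appears as an example in the discussion.

```yaml
name: image-ref-sketch   # hypothetical workflow name
on: workflow_dispatch

jobs:
  show:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        component-name: [training-operator, storage-initializer]
    env:
      # Prefix proposed in the thread; one image path per component.
      IMAGE: ghcr.io/kubeflow/training/${{ matrix.component-name }}
    steps:
      - name: Print both candidate references
        run: |
          # Option 1: a release-style tag that tells users the version.
          echo "release tag: ${IMAGE}:v1.9.1"
          # Option 2: a tag derived from the commit being built.
          echo "commit tag:  ${IMAGE}:${GITHUB_SHA}"
```

A release-style tag is easier for users to read, while a commit-derived tag identifies the exact source revision. The thread leaves open how the two would be combined.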

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

andreyvelich · 2025-03-17T20:35:37Z

.github/workflows/build-and-publish-images.yaml

+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Publish Component ${{ inputs.component-name }} to Dockerhub


Can we consolidate these two actions into one, similar to this PR: #2537?

Thanks for mentioning it, I've updated.
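One common way to consolidate the Docker Hub and GHCR publishing steps is to log in to both registries and then run a single build that pushes every tag at once. This is a hedged sketch of that idea, assuming docker/login-action and docker/build-push-action. It is not the contents of this repository's template-publish-image action or of #2537, and the workflow name, Dockerfile path, and Docker Hub image name are illustrative.

```yaml
name: publish-image-sketch   # hypothetical workflow name
on:
  push:
    branches: [release-1.9]

permissions:
  contents: read
  packages: write   # lets GITHUB_TOKEN push to ghcr.io

jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-qemu-action@v3     # emulation for non-native platforms
      - uses: docker/setup-buildx-action@v3

      - name: Docker Hub login
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: GHCR login
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      # One build, pushed under both names, replaces two separate publish steps.
      - name: Build once, push to both registries
        uses: docker/build-push-action@v6
        with:
          context: .
          file: path/to/Dockerfile   # placeholder path
          platforms: linux/amd64,linux/arm64
          push: true
          tags: |
            docker.io/kubeflow/training-operator:${{ github.sha }}
            ghcr.io/kubeflow/training/training-operator:${{ github.sha }}
```

Whether `GITHUB_TOKEN` is allowed to push to a given GHCR package path depends on that package's access settings, which is why the thread could only confirm the final path after merge.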

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

tenzen-y

Thank you
/approve

I will leave the final approval to @andreyvelich

andreyvelich

Thanks @saileshd1402!
/lgtm
/approve

google-oss-prow · 2025-03-18T17:49:43Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andreyvelich, tenzen-y

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [andreyvelich,tenzen-y]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

google-oss-prow bot added the do-not-merge/work-in-progress label Mar 9, 2025

google-oss-prow bot requested review from jinchihe and kuizhiqing March 9, 2025 20:12

google-oss-prow bot added the size/M label Mar 9, 2025

google-oss-prow bot added the ok-to-test label Mar 10, 2025

saileshd1402 force-pushed the ghcr-images-release-v1.9 branch from 38bf74c to ec3b1aa Compare March 10, 2025 22:45

saileshd1402 marked this pull request as ready for review March 10, 2025 22:45

google-oss-prow bot removed the do-not-merge/work-in-progress label Mar 10, 2025

andreyvelich reviewed Mar 11, 2025


andreyvelich mentioned this pull request Mar 12, 2025

Implemenet MPI Plugin for OpenMPI #2493

Merged

1 task

tenzen-y reviewed Mar 12, 2025


saileshd1402 added 6 commits March 12, 2025 22:46

push images to ghcr

6ad916d

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

typo

9f9f197

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

remove trainer images (training-operator-v2)

77dbb8a

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

test push image

935253f

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

test push image

3659955

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

test ref branch to release-1.9 instead of master

884bfcd

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

saileshd1402 force-pushed the ghcr-images-release-v1.9 branch from 873597d to 884bfcd Compare March 12, 2025 22:47

saileshd1402 added 5 commits March 12, 2025 22:52

update github actions identifier

22b44f5

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

update branch

81f6a97

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

fix typo

6219ec8

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

test pushing images

3439f2a

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

update repository name

666d5ae

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

saileshd1402 changed the title ~~Push images to GHCR~~ Push images to GHCR for release-1.9 Mar 12, 2025

saileshd1402 added 4 commits March 12, 2025 23:24

fix typo

628d7c1

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

test pushing images

619ffcd

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

test pushing images

a20ba57

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

update image path to trainer

c6a453d

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

andreyvelich reviewed Mar 13, 2025


This was referenced Mar 13, 2025

[feature] migrate images to ghcr #2455

Merged

Migrate images in Dockerhub to GHCR #2446

Closed

use kubeflow/training/ prefix for ghcr

105d17a

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

andreyvelich reviewed Mar 17, 2025


andreyvelich mentioned this pull request Mar 18, 2025

fix(ci): update test-go coverage ci config and replace trainer badge with new address. #2534

Merged

1 task

consolidate ghcr and dockerhub

7ca559d

Signed-off-by: sailesh duddupudi <saileshradar@gmail.com>

tenzen-y reviewed Mar 18, 2025


google-oss-prow bot added the approved label Mar 18, 2025

andreyvelich reviewed Mar 18, 2025


google-oss-prow bot assigned andreyvelich Mar 18, 2025

google-oss-prow bot added the lgtm label Mar 18, 2025

google-oss-prow bot merged commit f654b1e into kubeflow:release-1.9 Mar 18, 2025
53 checks passed

saileshd1402 mentioned this pull request Mar 18, 2025

Update Manifest Images to GHCR #2544

Merged

1 task

Electronic-Waste mentioned this pull request Mar 19, 2025

fix(ci): Change publish dir from training to trainer #2546

Merged

1 task
