8000 Move generated Python models into kubeflow_trainer_api package by kramaranya · Pull Request #2632 · kubeflow/trainer · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Move generated Python models into kubeflow_trainer_api package #2632

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

kramaranya
Copy link
Contributor
@kramaranya kramaranya commented May 2, 2025

What this PR does / why we need it:
Move generated Python models from sdk/kubeflow/trainer/models into a new api/kubeflow_trainer_api/models package.

Which issue(s) this PR fixes :
Fixes #2629

Checklist:

  • Docs included if any changes are user facing

@google-oss-prow google-oss-prow bot added t 8000 he size/XXL label May 2, 2025
@kramaranya kramaranya force-pushed the separate-models-from-sdk branch 2 times, most recently from b97d73e to ea46447 Compare May 2, 2025 10:46
Copy link
Contributor
@szaher szaher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kramaranya for the PR
can we use the swagger_config.json to do type mapping of the k8s native types to k8s native python client?

@szaher
Copy link
Contributor
szaher commented May 2, 2025

type mapping will reduce this pr massively and will make our models very lite. something like

{
  "packageName": "kubeflow.trainer",
  "typeMappings": {
    "io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta": "V1ObjectMeta",
    "io.k8s.apimachinery.pkg.apis.meta.v1.ListMeta": "V1ListMeta",
    "io.k8s.apimachinery.pkg.apis.meta.v1.LabelSelector": "V1LabelSelector",
    "io.k8s.apimachinery.pkg.apis.meta.v1.DeleteOptions": "V1DeleteOptions",
    "io.k8s.apimachinery.pkg.apis.meta.v1.OwnerReference": "V1OwnerReference",
    "io.k8s.apimachinery.pkg.apis.meta.v1.Status": "V1Status",
    "io.k8s.apimachinery.pkg.apis.meta.v1.WatchEvent": "V1WatchEvent",

    "io.k8s.api.core.v1.Pod": "V1Pod",
    "io.k8s.api.core.v1.PodSpec": "V1PodSpec",
    "io.k8s.api.core.v1.PodStatus": "V1PodStatus",
    "io.k8s.api.core.v1.Container": "V1Container",
    "io.k8s.api.core.v1.ContainerPort": "V1ContainerPort",
    "io.k8s.api.core.v1.EnvVar": "V1EnvVar",
    "io.k8s.api.core.v1.EnvVarSource": "V1EnvVarSource",
    "io.k8s.api.core.v1.Volume": "V1Volume",
    "io.k8s.api.core.v1.VolumeMount": "V1VolumeMount",
    "io.k8s.api.core.v1.ResourceRequirements": "V1ResourceRequirements",
    "io.k8s.api.core.v1.Service": "V1Service",
    "io.k8s.api.core.v1.ServiceSpec": "V1ServiceSpec",
    "io.k8s.api.core.v1.ServiceStatus": "V1ServiceStatus",

    "io.k8s.api.apps.v1.Deployment": "V1Deployment",
    "io.k8s.api.apps.v1.DeploymentSpec": "V1DeploymentSpec",
    "io.k8s.api.apps.v1.DeploymentStatus": "V1DeploymentStatus",
    "io.k8s.api.apps.v1.Scale": "V1Scale",
    "io.k8s.api.apps.v1.RollingUpdateDeployment": "V1RollingUpdateDeployment"
  },
  "importMappings": {
    "io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta": "kubernetes.client.models.v1_object_meta.V1ObjectMeta",
    "io.k8s.apimachinery.pkg.apis.meta.v1.ListMeta": "kubernetes.client.models.v1_list_meta.V1ListMeta",
    "io.k8s.apimachinery.pkg.apis.meta.v1.LabelSelector": "kubernetes.client.models.v1_label_selector.V1LabelSelector",
    "io.k8s.apimachinery.pkg.apis.meta.v1.DeleteOptions": "kubernetes.client.models.v1_delete_options.V1DeleteOptions",
    "io.k8s.apimachinery.pkg.apis.meta.v1.OwnerReference": "kubernetes.client.models.v1_owner_reference.V1OwnerReference",
    "io.k8s.apimachinery.pkg.apis.meta.v1.Status": "kubernetes.client.models.v1_status.V1Status",
    "io.k8s.apimachinery.pkg.apis.meta.v1.WatchEvent": "kubernetes.client.models.v1_watch_event.V1WatchEvent",

    "io.k8s.api.core.v1.Pod": "kubernetes.client.models.v1_pod.V1Pod",
    "io.k8s.api.core.v1.PodSpec": "kubernetes.client.models.v1_pod_spec.V1PodSpec",
    "io.k8s.api.core.v1.PodStatus": "kubernetes.client.models.v1_pod_status.V1PodStatus",
    "io.k8s.api.core.v1.Container": "kubernetes.client.models.v1_container.V1Container",
    "io.k8s.api.core.v1.ContainerPort": "kubernetes.client.models.v1_container_port.V1ContainerPort",
    "io.k8s.api.core.v1.EnvVar": "kubernetes.client.models.v1_env_var.V1EnvVar",
    "io.k8s.api.core.v1.EnvVarSource": "kubernetes.client.models.v1_env_var_source.V1EnvVarSource",
    "io.k8s.api.core.v1.Volume": "kubernetes.client.models.v1_volume.V1Volume",
    "io.k8s.api.core.v1.VolumeMount": "kubernetes.client.models.v1_volume_mount.V1VolumeMount",
    "io.k8s.api.core.v1.ResourceRequirements": "kubernetes.client.models.v1_resource_requirements.V1ResourceRequirements",
    "io.k8s.api.core.v1.Service": "kubernetes.client.models.v1_service.V1Service",
    "io.k8s.api.core.v1.ServiceSpec": "kubernetes.client.models.v1_service_spec.V1ServiceSpec",
    "io.k8s.api.core.v1.ServiceStatus": "kubernetes.client.models.v1_service_status.V1ServiceStatus",

    "io.k8s.api.apps.v1.Deployment": "kubernetes.client.models.v1_deployment.V1Deployment",
    "io.k8s.api.apps.v1.DeploymentSpec": "kubernetes.client.models.v1_deployment_spec.V1DeploymentSpec",
    "io.k8s.api.apps.v1.DeploymentStatus": "kubernetes.client.models.v1_deployment_status.V1DeploymentStatus",
    "io.k8s.api.apps.v1.Scale": "kubernetes.client.models.v1_scale.V1Scale",
    "io.k8s.api.apps.v1.RollingUpdateDeployment": "kubernetes.client.models.v1_rolling_update_deployment.V1RollingUpdateDeployment"
  }
}

it should generate only the models required by the trainer and leave the k8s native CRDs to its official python client https://github.com/kubernetes-client/python/tree/master/kubernetes/client/models

@andreyvelich
Copy link
Member

@szaher I am not sure if that is going to work with the newer version of OpenAPI generator and pydantic.
Please check this PR: #2466

Have you tried your mapping to see if serialization is going to work ?

@@ -0,0 +1,42 @@
[build-system]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move the pyproject and README to the subdirectory (e.g. /api/python_models or /api/python_api)?
Does this name sound good @kramaranya @szaher @astefanutti @Electronic-Waste @tenzen-y ?

Copy link
Contributor Author
@kramaranya kramaranya May 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andreyvelich sounds great to me! I've updated with /api/python_api, unless anyone else has objections

@kramaranya kramaranya force-pushed the separate-models-from-sdk branch from ea46447 to f283219 Compare May 3, 2025 01:29
Copy link
Member
@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I left a few comments.
/assign @astefanutti @Electronic-Waste @szaher @kubeflow/wg-training-leads

Comment on lines 51 to 55
# Remove empty api directory that shouldn't be generated
if [ -d "${PKG_ROOT}/api" ]; then
rm -rf "${PKG_ROOT}/api"
echo "Removing API directory"
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need it ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as here #2632 (comment)

echo "Removing unused files for the Python API"
git clean -f ${API_OUTPUT_PATH}/.openapi-generator
git clean -f ${API_OUTPUT_PATH}/.github
git clean -f ${API_OUTPUT_PATH}/test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need it if we set this property for OpenAPI generator ?
modelTests=false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modelTests=false stops generating test files, but some directories and files are still created by OpenAPI generator:

tree ./api/python_api -a -L 2
├── .github
│  └── workflows
├── .openapi-generator
│  ├── FILES
│  └── VERSION
├── README.md
├── kubeflow_trainer_api
│  ├── __init__.py
│  ├── __pycache__
│  ├── api
│  └── models
├── pyproject.toml
└── test
    └── __init__.py

so we clean those up afterward, but I’m fine leaving the directories in place if you prefer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, so even with modelTest=false OpenAPI generator creates these dirs.
Yes, let's remove them.

packages = ["kubeflow_trainer_api"]

[tool.hatch.version]
path = "kubeflow_trainer_api/__init__.py"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we add direct reference to allow this packages be direct referenced via git ?

[tool.hatch.metadata]
allow-direct-references = true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To my understanding, allow-direct-references = true allows your project to install dependencies via git, but it doesn't affect whether other projects can install this package from git, see https://hatch.pypa.io/1.13/config/metadata/#metadata-options
https://stackoverflow.com/questions/73316644/usage-of-direct-references-in-pyproject-toml-with-hatchling-backend
So since we don't have any direct references in our dependencies, I think we don't need to add it. Please correct me if I'm wrong

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see! Thanks for letting me know.

@@ -29,5 +29,7 @@ exclude: |
sdk/kubeflow/trainer/__init__.py|
sdk/kubeflow/trainer/api/__init__.py|
sdk/kubeflow/trainer/models/.*|
api/python_api/kubeflow_trainer_api/__init__.py|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we should exclude manually created init.py file from the pre-commit ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missed this one, thanks

Signed-off-by: kramaranya <kramaranya15@gmail.com>
@kramaranya kramaranya force-pushed the separate-models-from-sdk branch from f283219 to 26436a7 Compare May 3, 2025 16:36
@kramaranya
Copy link
Contributor Author

Thanks @kramaranya for the PR can we use the swagger_config.json to do type mapping of the k8s native types to k8s native python client?

@szaher I've tried using type mappings with the k8s client models, but they're incompatible with our OpenAPI-generated pydantic models, causing errors during serialization/deserialization

@kramaranya
Copy link
Contributor Author

Is there anything else needed from my side to get this merged? I'd appreciate your reviews!
@andreyvelich @szaher @astefanutti @tenzen-y @Electronic-Waste @franciscojavierarceo @kubeflow/wg-training-leads

git clean -f ${API_OUTPUT_PATH}/.openapi-generator
git clean -f ${API_OUTPUT_PATH}/.github
git clean -f ${API_OUTPUT_PATH}/test
git clean -f ${PKG_ROOT}/api
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kramaranya Since the output for generator is api/python-api, what files are generated under api folder ?

-o "local/${API_OUTPUT_PATH}" \

Copy link
Contributor Author
@kramaranya kramaranya May 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It generates an empty __init__.py:

tree ./api/python_api/kubeflow_trainer_api/api -a -L 2
./api/python_api/kubeflow_trainer_api/api
└── __init__.py

1 directory, 1 file

and It's the same as this one:

# flake8: noqa
# import apis into api package

Since trainer_client.py should remain part of sdk (am I right saying that?), we don't need an api folder, therefore I clean it up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I missed that you use the subpath for the PKG_ROOT.
Lgtm

@andreyvelich
Copy link
Member F438

Thank you for this @kramaranya!
/lgtm
/assign @Electronic-Waste @astefanutti @tenzen-y

Copy link
Contributor
@astefanutti astefanutti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @kramaranya!

"models" has really become an overloaded term :)

Copy link
Member
@Electronic-Waste Electronic-Waste left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kramaranya Thanks for this great contribution. LGTM!

/lgtm

Copy link
Member
@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andreyvelich

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 1fb792e into kubeflow:master May 7, 2025
17 checks passed
@google-oss-prow google-oss-prow bot added this to the v2.0 milestone May 7, 2025
@kramaranya kramaranya deleted the separate-models-from-sdk branch May 12, 2025 09:54
szaher pushed a commit to szaher/sdk that referenced this pull request Jun 4, 2025
szaher pushed a commit to szaher/sdk that referenced this pull request Jun 4, 2025
szaher pushed a commit to szaher/sdk that referenced this pull request Jun 4, 2025
szaher pushed a commit to szaher/sdk that referenced this pull request Jun 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Separate API Models from SDK Package
6 participants
0