Tags: aws/sagemaker-hyperpod-cli
Support recipes and scheduler in Hyperpod CLI (#41)

* add recipes feature for distributed training
* improve unit test coverage for recipes feature
* add support recipes along with command line args
* add recipes
* Crescendo helm chart for role and rolebinding (#17)
* update the helm chart to create team level roles and bindings
* revert unrelated changes
* Rename quotaAllocationTarget to computeQuotaTarget
* remove kueue related resources from helm chart
* Remove parameters of kueue from chart
* flip the team role creation to false
* Revise readme to add instructions to create the role and binding
* add changelog for distributed training
* change to public submodules
* QuotaAllocation support for Hyperpod CLI (#12)
* QuotaAllocation support for Hyperpod CLI

---------

Co-authored-by: Amazon GitHub Automation <54958958+amazon-auto@users.noreply.github.com>
Co-authored-by: Song Jiang <jiangsongbz@gmail.com>
Co-authored-by: Baiyang Li <baiyanl@amazon.com>
Co-authored-by: baiyli <105086653+baiyli@users.noreply.github.com>

* Remove custom_launcher folder
* sync with mainline

---------

Co-authored-by: cansun <80425164+can-sun@users.noreply.github.com>
Co-authored-by: Amazon GitHub Automation <54958958+amazon-auto@users.noreply.github.com>
Co-authored-by: Song Jiang <jiangsongbz@gmail.com>
Co-authored-by: Baiyang Li <baiyanl@amazon.com>
Co-authored-by: baiyli <105086653+baiyli@users.noreply.github.com>
Co-authored-by: Can Sun <sucan@amazon.com>