
QuickStart

In case you do not want to use our 1Click-HPC CloudFormation template, but you still want to build your cluster with all the components and modules available in this repository, you can follow the instructions below to prepare your AWS ParallelCluster configuration file. You can create a new cluster from your existing configuration file by simply adding the following parameters; everything will be installed and configured automatically.
If this is your first time using AWS ParallelCluster, either go back to the section above or follow all the steps of our Workshop, and include the following configuration:

[cluster yourcluster]
...
post_install = https://raw.githubusercontent.com/aws-samples/1click-hpc/main/scripts/post.install.sh
post_install_args = "05.install.ldap.server.headnode.sh 06.install.ldap.client.compute.sh 06.install.ldap.client.headnode.sh 10.install.enginframe.headnode.sh 11.install.ldap.enginframe.headnode.sh 20.install.dcv.slurm.headnode.sh 25.install.dcv-server.compute.sh 35.install.dcv.slurm.compute.sh"
extra_json = {"post_install":{"enginframe":{"ef_admin_pass":"Put_Your_Password_HERE"}}}
tags = {"EnginFrame" : "true"}
...
Note: You need to specify a custom security group (one that allows inbound connections on port 8443) as the `additional_sg` parameter in the `[vpc]` section of your AWS ParallelCluster config file.
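
For reference, a minimal sketch of the corresponding `[vpc]` section (the VPC, subnet, and security group IDs below are placeholders for your own values; the security group must allow inbound TCP on port 8443):

[vpc yourvpc]
vpc_id = vpc-xxxxxxxx
master_subnet_id = subnet-xxxxxxxx
additional_sg = sg-xxxxxxxx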

(Optional) QuickStart parameters customization

In addition to the Quickstart deployment, there are a few parameters that you can optionally define to customize the components installed.
These parameters are defined as part of the extra_json parameter in the cluster section of the AWS ParallelCluster configuration file. If the extra_json is not specified, all the components will be installed using the default values.
See the example below:

{   
  "post_install": {
    "enginframe": {
      "nice_root": "/fsx/nice",
      "ef_admin": "ec2-user",
      "ef_conf_root": "/fsx/nice/enginframe/conf",
      "ef_data_root": "/fsx/nice/enginframe/data",
      "ef_spooler": "/fsx/nice/enginframe/spoolers",
      "ef_repository": "/fsx/nice/enginframe/repository",
      "ef_admin_pass": "Change_this!"
    },
    "dcvsm": {
      "agent_broker_port": 8445,
      "broker_ca": "/home/ec2-user/dcvsmbroker_ca.pem",
      "client_broker_port": 8446
    },
    "dcv": {
      "dcv_queue_keyword": "dcv"
    }
  }
}
  • nice_root, by default ${SHARED_FS_DIR}/nice, is the base directory where EnginFrame is installed.
  • ef_admin, by default ec2-user, is the EnginFrame user with administrative rights.
  • ef_conf_root, by default ${NICE_ROOT}/enginframe/conf, is the path of the EnginFrame configuration directory.
  • ef_data_root, by default ${NICE_ROOT}/enginframe/data, is the path of the EnginFrame data directory.
  • ef_spooler, by default ${NICE_ROOT}/enginframe/spoolers, is the path of the EnginFrame Spoolers. Please note that the Spoolers are the location where your jobs are executed.
  • ef_repository, by default ${NICE_ROOT}/enginframe/repository, is the EnginFrame repository directory path.
  • ef_admin_pass, by default Change_this!, is the EnginFrame admin password. Use this user and password for your first login into EnginFrame.
  • agent_broker_port, by default 8445, is the DCV Session Manager Broker port.
  • broker_ca, by default /home/ec2-user/dcvsmbroker_ca.pem, is the location of the DCV Session Manager Broker certificate.
  • client_broker_port, by default 8446, is the DCV Session Manager Broker port used by the client.
  • dcv_queue_keyword, by default dcv, is a keyword that identifies the queues of your cluster where you want to enable DCV.
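
For example, if you only want to change the DCV queue keyword and rely on the defaults listed above for the remaining parameters, a minimal extra_json could look like this (the value remote-desktop is only illustrative):

{ "post_install": { "dcv": { "dcv_queue_keyword": "remote-desktop" } } }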

Note: Because extra_json is a parameter in an .ini file, you need to put your custom JSON on a single line. You can use the following command to convert your JSON into a one-line JSON:

tr -d '\n' < your_extra.json

See an example of the output below.

{ "post_install": { "enginframe": { "nice_root": "/fsx/nice", "ef_admin": "ec2-user", "ef_conf_root": "/fsx/nice/enginframe/conf", "ef_data_root": "/fsx/nice/enginframe/data", "ef_spooler": "/fsx/nice/enginframe/spoolers", "ef_repository": "/fsx/nice/enginframe/repository", "ef_admin_pass": "Change_this!" }, "dcvsm": { "agent_broker_port": 8445, "broker_ca": "/home/ec2-user/dcvsmbroker_ca.pem", "client_broker_port": 8446 }, "dcv": { "dcv_queue_keyword": "dcv" }}}

(Optional) Launch script customization

An additional way to further customize the installation and configuration of your components is to download the scripts locally, modify them, and put them back onto S3.
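
For example, assuming git is installed, you can fetch a local copy of the repository to edit the scripts before uploading them:

git clone https://github.com/aws-samples/1click-hpc.git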

export S3_BUCKET=<YOUR_S3_BUCKET>

aws s3 cp --quiet --recursive 1click-hpc/scripts/         s3://$S3_BUCKET/scripts/
aws s3 cp --quiet --recursive 1click-hpc/packages/        s3://$S3_BUCKET/packages/
aws s3 cp --quiet --recursive 1click-hpc/parallelcluster/ s3://$S3_BUCKET/parallelcluster/
aws s3 cp --quiet --recursive 1click-hpc/enginframe/      s3://$S3_BUCKET/enginframe/

In this case, your AWS ParallelCluster configuration file has the following parameters:

post_install = s3://<YOUR_S3_BUCKET>/scripts/post.install.sh
post_install_args = "01.install.enginframe.headnode.sh 03.install.dcv.slurm.headnode.sh 04.install.dcv-server.compute.sh 06.install.dcv.slurm.compute.sh"

The first parameter, post_install, specifies the S3 bucket you chose to store your post_install bash script. This is the main script that runs all the secondary scripts installing EnginFrame, DCV Session Manager, DCV Server, and the other components.
The second parameter, post_install_args, contains the scripts launched to install the selected components.
EnginFrame, DCV Session Manager Broker, and all the other secondary scripts are built independently, so you can potentially install just one of them.
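
Once your configuration file is updated, you can create the cluster with the ParallelCluster 2.x CLI; the cluster name and configuration path below are placeholders:

pcluster create yourcluster -c ~/.parallelcluster/config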

Note: This procedure has been tested with EnginFrame version 2020.0 and DCV Session Manager Broker version 2020.2. With minor modifications it can also work with previous versions; just remember to add the license management.

Requirements

To perform a successful installation of EnginFrame and DCV Session Manager Broker, you’ll need:
  • An S3 bucket, made accessible to ParallelCluster via its s3_read_resource or s3_read_write_resource [cluster] settings (see the sketch after this list). Refer to the ParallelCluster configuration documentation for details.
  • An EnginFrame efinstall.config file, containing the desired settings for the EnginFrame installation. This enables the post-install script to install EnginFrame in unattended mode. An example efinstall.config is provided with the code in this repository: you can review and modify it according to your preferences.
    Alternatively, you can generate your own by performing an EnginFrame installation: in that case, an efinstall.config containing all your choices will be generated in the folder where you ran the installation.
  • A security group allowing inbound connections on the EnginFrame port (8443 in this setup). By default ParallelCluster creates a new security group with only port 22 publicly open, so you can either use a replacement (via the ParallelCluster vpc_security_group_id setting) or add an additional security group (additional_sg setting). This guide uses an additional security group.
  • A ParallelCluster configuration including the post_install and post_install_args parameters described above.
  • (Optional) The EnginFrame and DCV Session Manager packages, available online from https://download.enginframe.com. Having them in the bucket avoids the need for outbound internet access from your ParallelCluster headnode to download them; in that case, copy them into your target S3 bucket and the scripts will copy them from S3 to the headnode.
Note: Neither EnginFrame 2020 nor DCV Session Manager Broker needs a license when running on EC2 instances. For more details, please refer to their documentation.
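
As a reference, a minimal sketch of the relevant [cluster] entries tying these requirements together (the bucket name is a placeholder, and the script list is the same as in the example above):

[cluster yourcluster]
...
s3_read_resource = arn:aws:s3:::<YOUR_S3_BUCKET>/*
post_install = s3://<YOUR_S3_BUCKET>/scripts/post.install.sh
post_install_args = "01.install.enginframe.headnode.sh 03.install.dcv.slurm.headnode.sh 04.install.dcv-server.compute.sh 06.install.dcv.slurm.compute.sh"
...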

Troubleshooting

A detailed output log is available on the headnode, in:
  • /var/log/cfn-init.log
  • /var/log/cfn-init-cmd.log
You can reach the headnode via SSH, after getting its IP address from the AWS Console → EC2 → Instances and looking for an instance named HeadNode.
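
For example, assuming the default ec2-user account of an Amazon Linux 2 based AMI and your own key pair, you can inspect the main log with:

ssh -i <YOUR_KEY.pem> ec2-user@<HEADNODE_IP>
sudo tail -f /var/log/cfn-init.log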

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.