8000 GitHub - starsflows/Index_EAB
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

starsflows/Index_EAB

{"props":{"initialPayload":{"allShortcutsEnabled":false,"path":"/","repo":{"id":965872863,"defaultBranch":"main","name":"Index_EAB","ownerLogin":"starsflows","currentUserCanPush":false,"isFork":true,"isEmpty":false,"createdAt":"2025-04-14T03:39:35.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/87289239?v=4","public":true,"private":false,"isOrgOwned":false},"currentUser":null,"refInfo":{"name":"main","listCacheKey":"v0:1744601981.979904","canEdit":false,"refType":"branch","currentOid":"d62b0f7f79fad92d48d10b79e66c8b6071a88f9f"},"tree":{"items":[{"name":"configuration_loader","path":"configuration_loader","contentType":"directory"},{"name":"index_advisor_selector","path":"index_advisor_selector","contentType":"directory"},{"name":"workload_generator","path":"workload_generator","contentType":"directory"},{"name":"README.md","path":"README.md","contentType":"file"},{"name":"pypi_package.png","path":"pypi_package.png","contentType":"file"},{"name":"requirements.txt","path":"requirements.txt","contentType":"file"},{"name":"test_eab_package.py","path":"test_eab_package.py","contentType":"file"}],"templateDirectorySuggestionUrl":null,"readme":null,"totalCount":7,"showBranchInfobar":true},"fileTree":null,"fileTreeProcessingTime":null,"foldersToFetch":[],"treeExpanded":false,"symbolsExpanded":false,"isOverview":true,"overview":{"banners":{"shouldRecommendReadme":false,"isPersonalRepo":false,"showUseActionBanner":false,"actionSlug":null,"actionId":null,"showProtectBranchBanner":false,"publishBannersInfo":{"dismissActionNoticePath":"/settings/dismiss-notice/publish_action_from_repo","releasePath":"/starsflows/Index_EAB/releases/new?marketplace=true","showPublishActionBanner":false},"interactionLimitBanner":null,"showInvitationBanner":false,"inviterName":null,"actionsMigrationBannerInfo":{"releaseTags":[],"showImmutableActionsMigrationBanner":false,"initialMigrationStatus":null}},"codeButton":{"contactPath":"/contact","isEnterprise":false,"local":{"protocolInfo":{"httpAvailable":true,"sshAvailable":null,"httpUrl":"https://github.com/starsflows/Index_EAB.git","showCloneWarning":null,"sshUrl":null,"sshCertificatesRequired":null,"sshCertificatesAvailable":null,"ghCliUrl":"gh repo clone starsflows/Index_EAB","defaultProtocol":"http","newSshKeyUrl":"/settings/ssh/new","setProtocolPath":"/users/set_protocol"},"platformInfo":{"cloneUrl":"https://desktop.github.com","showVisualStudioCloneButton":false,"visualStudioCloneUrl":"https://windows.github.com","showXcodeCloneButton":false,"xcodeCloneUrl":"xcode://clone?repo=https%3A%2F%2Fgithub.com%2Fstarsflows%2FIndex_EAB","zipballUrl":"/starsflows/Index_EAB/archive/refs/heads/main.zip"}},"newCodespacePath":"/codespaces/new?hide_repo_select=true\u0026repo=965872863"},"popovers":{"rename":null,"renamedParentRepo":null},"commitCount":"4","overviewFiles":[{"displayName":"README.md","repoName":"Index_EAB","refName":"main","path":"README.md","preferredFileType":"readme","tabName":"README","richText":"\u003carticle class=\"markdown-body entry-content container-lg\" itemprop=\"text\"\u003e\u003cdiv class=\"markdown-heading\" dir=\"auto\"\u003e\u003ch1 tabindex=\"-1\" class=\"heading-element\" dir=\"auto\"\u003eIndex Advisor [Experiment, Analysis \u0026amp; Benchmark]\u003c/h1\u003e\u003ca id=\"user-content-index-advisor-experiment-analysis--benchmark\" class=\"anchor\" aria-label=\"Permalink: Index Advisor [Experiment, Analysis \u0026amp; Benchmark]\" href=\"#index-advisor-experiment-analysis--benchmark\"\u003e\u003csvg class=\"octicon octicon-link\" viewBox=\"0 0 16 16\" version=\"1.1\" width=\"16\" height=\"16\" aria-hidden=\"true\"\u003e\u003cpath d=\"m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 1.998 0 0 0 0 2.83Z\"\u003e\u003c/path\u003e\u003c/svg\u003e\u003c/a\u003e\u003c/div\u003e\n\u003cp dir=\"auto\"\u003e\u003cstrong\u003eNote that the link of the new actively maintained repository is:\u003c/strong\u003e\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e\u003ca href=\"https://github.com/Beliefuture/Index_EAB/\"\u003ehttps://github.com/Beliefuture/Index_EAB/\u003c/a\u003e\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e\u003cstrong\u003ePlease \"STAR\" the new repository if you find this project useful!\u003c/strong\u003e\u003c/p\u003e\n\u003cp dir=\"auto\"\u003eThis is the code respository of the testbed proposed in the \u003cstrong\u003eIndex Advisor (EA\u0026amp;B)\u003c/strong\u003e paper, which conducts a comprehensive assessment of the heuristic-based and the learning-based index advisors.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e*\u003cstrong\u003eNote that we have released a python package about this testbed at the official open-source website: \u003cem\u003e\u003ca href=\"https://pypi.org/project/index-eab/0.1.0/\" rel=\"nofollow\"\u003eindex-eab · PyPI\u003c/a\u003e\u003c/em\u003e. Therefore, you can use the follwoing command to download it with little effort!\u003c/strong\u003e\u003c/p\u003e\n\u003cdiv class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"pip install index-eab==0.1.0\"\u003e\u003cpre\u003epip install index-eab==0.1.0\u003c/pre\u003e\u003c/div\u003e\n\u003cp dir=\"auto\"\u003e\u003ca target=\"_blank\" rel=\"noopener noreferrer\" href=\"/starsflows/Index_EAB/blob/main/pypi_package.png\"\u003e\u003cimg src=\"/starsflows/Index_EAB/raw/main/pypi_package.png\" alt=\"image-20240107023318952\" style=\"max-width: 100%;\"\u003e\u003c/a\u003e\u003c/p\u003e\n\u003cdiv class=\"markdown-heading\" dir=\"auto\"\u003e\u003ch2 tabindex=\"-1\" class=\"heading-element\" dir=\"auto\"\u003eProject Structure\u003c/h2\u003e\u003ca id=\"user-content-project-structure\" class=\"anchor\" aria-label=\"Permalink: Project Structure\" href=\"#project-structure\"\u003e\u003csvg class=\"octicon octicon-link\" viewBox=\"0 0 16 16\" version=\"1.1\" width=\"16\" height=\"16\" aria-hidden=\"true\"\u003e\u003cpath d=\"m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 1.998 0 0 0 0 2.83Z\"\u003e\u003c/path\u003e\u003c/svg\u003e\u003c/a\u003e\u003c/div\u003e\n\u003cp dir=\"auto\"\u003eSpecifically, the testbed is comprised of three modules.\u003c/p\u003e\n\u003cul dir=\"auto\"\u003e\n\u003cli\u003e\u003cstrong\u003e(1) Configuration Loader:\u003c/strong\u003e initializes a series of evaluation settings, including the benchmark, the index advisor, and the database;\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003e(2) Workload Generator:\u003c/strong\u003e supports three methods for generating workloads with diverse features (e.g., query changes due to typical workload drifts) to simulate the requirements posed by various scenarios;\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003e(3) Index Advisor Selector:\u003c/strong\u003e implements existing index advisors, including seven heuristic-based index advisors and ten learning-based index advisors.\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp dir=\"auto\"\u003eThe overall code structure of our \u003cstrong\u003eIndex Advisor (EA\u0026amp;B)\u003c/strong\u003e project, where the critical files are marked with additional comments.\u003c/p\u003e\n\u003cdiv class=\"snippet-clipboard-content notranslate position-relative overflow-auto\" data-snippet-clipboard-copy-content=\"Index_EAB/\n├── configuration_loader\t\t\t \t\t\t\t\t\t\t\t\t\t\t# Module 1: the evaluation settings \n│ ├── becnhmark\n│ ├── index_advisor\n│ │ ├── heu_run_conf\n│ │ │ ├──xxxx_config.json # configurations of heuristic-based index advisors\n│ │ │ └── ...\n│ │ └── rl_run_conf\n│ │ │ ├──xxxx_config.json # configurations of learning-based index advisors\n│ │ │ └── ...\n│ └── database\n│ │ ├── db_con.conf # configurations of database connection\n│ │ └── ...\n├── workload_generator\t\t\t\t \t\t\t\t\t\t\t\t\t\t\t# Module 2: the testing workloads\n│ ├── template_based # workload from template-based generation\n│ ├── perturbation_based # workload from perturbation-based generation\n│ │ ├── perturb_utils\n│ │ └── ...\n│ ├── random # workload from random generation\n│ └── gen_workload.py\n├── index_advisor_selector\t\t\t \t\t\t\t\t\t\t\t\t\t\t# Module 3: the implemented index advisors\n│ ├── index_candidate_generation\n│ │ └── distill_model\n│ │ ├── distill_utils\n│ │ │\t├── distill_com.py # parameters of learned filter model\n│ │ │\t└── ...\n│ │ └── ...\n│ ├── index_selection\n│ │ ├── heu_selection\n│ │ │ ├── heu_utils\n│ │ │ │ ├── heu_com.py # parameters of heuristic-based index advisors\n│ │ │ │ └── ... \n│ │ │ ├── heu_algos\n│ │ │ ├── heu_run.py # entrance of heuristic-based index advisors\n│ │ │ └── ...\n│ │ ├── dqn_selection\n│ │ │ └── dqn_utils\n│ │ │ ├── dqn_run.py # inference entrance of learning-based index advisors\n│ │ │ └── ...\n│ │ ├── swirl_selection\n│ │ │ ├── swirl_utils\n│ │ │ │ ├── swirl_com.py # parameters of learning-based index advisors\n│ │ │ │ └── ... \n│ │ │ ├── gym_db\n│ │ │ ├── stable_baselines\n│ │ │ ├── swirl_main.py # training entrance of learning-based index advisors\n│ │ │ ├── swirl_run.py # inference entrance of learning-based index advisors\n│ │ │ └── ...\n│ │ ├── mab_selection\n│ │ │ ├── bandits\n│ │ │ ├── simualtion\n│ │ │ ├── shared\n│ │ │ │ ├── mab_com.py # parameters of learning-based index advisors\n│ │ │ │ └── ...\n│ │ │ ├── database\n│ │ │ ├── mab_run.py # inference entrance of learning-based index advisors\n│ │ │ └── ... \n│ │ └── mcts_selection\n│ │ ├── mcts_utils\n│ │ │ ├── mcts_com.py # parameters of learning-based index advisors\n│ │ │ └── ... \n│ │ ├── mcts_run.py # inference entrance of learning-based index advisors\n│ │ └── ... \n└── ├── index_benefit_estimation\n └── ├── benefit_utils\n ├── optmizer_cost\n │ ├── optimizer_utils\n │ │ ├── optimizer_com.py # parameters of statistic-based method\n │ │ └── ... \n │ ├── optimizer_train.py # training entrance of statistic-based method\n │ ├── optimizer_infer.py # inference entrance of statistic-based method\n │ └── ... \n ├── tree_model\n │ ├── tree_cost_utils\n │ │ ├── tree_cost_com.py # parameters of learned estimation model\n │ │ └── ... \n │ ├── tree_cost_main.py # training entrance of learned estimation model\n │ ├── tree_cost_infer.py # inference entrance of learned estimation model\n │ └── ... \n ├── index_cost_lib\n │ ├── lib_train.py # training entrance of learned estimation model\n │ ├── lib_infer.py # inference entrance of learned estimation model\n │ └── ... \n └── query_former\n ├── former_train.py # training entrance of learned estimation model\n ├── former_infer.py # inference entrance of learned estimation model\n └── ... \"\u003e\u003cpre class=\"notranslate\"\u003e\u003ccode\u003eIndex_EAB/\n├── configuration_loader\t\t\t \t\t\t\t\t\t\t\t\t\t\t# Module 1: the evaluation settings \n│ ├── becnhmark\n│ ├── index_advisor\n│ │ ├── heu_run_conf\n│ │ │ ├──xxxx_config.json # configurations of heuristic-based index advisors\n│ │ │ └── ...\n│ │ └── rl_run_conf\n│ │ │ ├──xxxx_config.json # configurations of learning-based index advisors\n│ │ │ └── ...\n│ └── database\n│ │ ├── db_con.conf # configurations of database connection\n│ │ └── ...\n├── workload_generator\t\t\t\t \t\t\t\t\t\t\t\t\t\t\t# Module 2: the testing workloads\n│ ├── template_based # workload from template-based generation\n│ ├── perturbation_based # workload from perturbation-based generation\n│ │ ├── perturb_utils\n│ │ └── ...\n│ ├── random # workload from random generation\n│ └── gen_workload.py\n├── index_advisor_selector\t\t\t \t\t\t\t\t\t\t\t\t\t\t# Module 3: the implemented index advisors\n│ ├── index_candidate_generation\n│ │ └── distill_model\n│ │ ├── distill_utils\n│ │ │\t├── distill_com.py # parameters of learned filter model\n│ │ │\t└── ...\n│ │ └── ...\n│ ├── index_selection\n│ │ ├── heu_selection\n│ │ │ ├── heu_utils\n│ │ │ │ ├── heu_com.py # parameters of heuristic-based index advisors\n│ │ │ │ └── ... \n│ │ │ ├── heu_algos\n│ │ │ ├── heu_run.py # entrance of heuristic-based index advisors\n│ │ │ └── ...\n│ │ ├── dqn_selection\n│ │ │ └── dqn_utils\n│ │ │ ├── dqn_run.py # inference entrance of learning-based index advisors\n│ │ │ └── ...\n│ │ ├── swirl_selection\n│ │ │ ├── swirl_utils\n│ │ │ │ ├── swirl_com.py # parameters of learning-based index advisors\n│ │ │ │ └── ... \n│ │ │ ├── gym_db\n│ │ │ ├── stable_baselines\n│ │ │ ├── swirl_main.py # training entrance of learning-based index advisors\n│ │ │ ├── swirl_run.py # inference entrance of learning-based index advisors\n│ │ │ └── ...\n│ │ ├── mab_selection\n│ │ │ ├── bandits\n│ │ │ ├── simualtion\n│ │ │ ├── shared\n│ │ │ │ ├── mab_com.py # parameters of learning-based index advisors\n│ │ │ │ └── ...\n│ │ │ ├── database\n│ │ │ ├── mab_run.py # inference entrance of learning-based index advisors\n│ │ │ └── ... \n│ │ └── mcts_selection\n│ │ ├── mcts_utils\n│ │ │ ├── mcts_com.py # parameters of learning-based index advisors\n│ │ │ └── ... \n│ │ ├── mcts_run.py # inference entrance of learning-based index advisors\n│ │ └── ... \n└── ├── index_benefit_estimation\n └── ├── benefit_utils\n ├── optmizer_cost\n │ ├── optimizer_utils\n │ │ ├── optimizer_com.py # parameters of statistic-based method\n │ │ └── ... \n │ ├── optimizer_train.py # training entrance of statistic-based method\n │ ├── optimizer_infer.py # inference entrance of statistic-based method\n │ └── ... \n ├── tree_model\n │ ├── tree_cost_utils\n │ │ ├── tree_cost_com.py # parameters of learned estimation model\n │ │ └── ... \n │ ├── tree_cost_main.py # training entrance of learned estimation model\n │ ├── tree_cost_infer.py # inference entrance of learned estimation model\n │ └── ... \n ├── index_cost_lib\n │ ├── lib_train.py # training entrance of learned estimation model\n │ ├── lib_infer.py # inference entrance of learned estimation model\n │ └── ... \n └── query_former\n ├── former_train.py # training entrance of learned estimation model\n ├── former_infer.py # inference entrance of learned estimation model\n └── ... \n\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\n\u003cdiv class=\"markdown-heading\" dir=\"auto\"\u003e\u003ch2 tabindex=\"-1\" class=\"heading-element\" dir=\"auto\"\u003eSetup\u003c/h2\u003e\u003ca id=\"user-content-setup\" class=\"anchor\" aria-label=\"Permalink: Setup\" href=\"#setup\"\u003e\u003csvg class=\"octicon octicon-link\" viewBox=\"0 0 16 16\" version=\"1.1\" width=\"16\" height=\"16\" aria-hidden=\"true\"\u003e\u003cpath d=\"m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 1.998 0 0 0 0 2.83Z\"\u003e\u003c/path\u003e\u003c/svg\u003e\u003c/a\u003e\u003c/div\u003e\n\u003cp dir=\"auto\"\u003eWe introduce the indispensable step, i.e., experiment setup for the experiment evaluations, you should check the following things:\u003c/p\u003e\n\u003cul dir=\"auto\"\u003e\n\u003cli\u003eCreate \u003cstrong\u003ethe database instance\u003c/strong\u003e according to the provided toolkit;\u003c/li\u003e\n\u003cli\u003eCreate \u003cstrong\u003ethe HypoPG extension\u003c/strong\u003e on the database instance for the usage of hypothetical index according to \u003ca href=\"https://github.com/HypoPG/hypopg\"\u003eHypoPG/hypopg: Hypothetical Indexes for PostgreSQL (github.com)\u003c/a\u003e;\u003c/li\u003e\n\u003cli\u003eCreate \u003cstrong\u003ethe python virtual environment\u003c/strong\u003e. Specifically, you can utilize the following script and the corresponding file \u003ccode\u003erequirements.txt\u003c/code\u003e is provided under the main directory. Please check the packages required are properly installed.\u003c/li\u003e\n\u003c/ul\u003e\n\u003cdiv class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"# Create the virtualenv `TRAP`\nconda create -n Index_EAB python=3.7\t\t \t\n\n# Activate the virtualenv `TRAP`\nconda activate Index_EAB\t\t\t\t\n\n# Install requirements with pip\nwhile read requirement; do pip install $requirement; done \u0026lt; requirements.txt\t\"\u003e\u003cpre\u003e\u003cspan class=\"pl-c\"\u003e\u003cspan class=\"pl-c\"\u003e#\u003c/span\u003e Create the virtualenv `TRAP`\u003c/span\u003e\nconda create -n Index_EAB python=3.7\t\t \t\n\n\u003cspan class=\"pl-c\"\u003e\u003cspan class=\"pl-c\"\u003e#\u003c/span\u003e Activate the virtualenv `TRAP`\u003c/span\u003e\nconda activate Index_EAB\t\t\t\t\n\n\u003cspan class=\"pl-c\"\u003e\u003cspan class=\"pl-c\"\u003e#\u003c/span\u003e Install requirements with pip\u003c/span\u003e\n\u003cspan class=\"pl-k\"\u003ewhile\u003c/span\u003e \u003cspan class=\"pl-c1\"\u003eread\u003c/span\u003e requirement\u003cspan class=\"pl-k\"\u003e;\u003c/span\u003e \u003cspan class=\"pl-k\"\u003edo\u003c/span\u003e pip install \u003cspan class=\"pl-smi\"\u003e$requirement\u003c/span\u003e\u003cspan class=\"pl-k\"\u003e;\u003c/span\u003e \u003cspan class=\"pl-k\"\u003edone\u003c/span\u003e \u003cspan class=\"pl-k\"\u003e\u0026lt;\u003c/span\u003e requirements.txt\t\u003c/pre\u003e\u003c/div\u003e\n\u003cdiv class=\"markdown-heading\" dir=\"auto\"\u003e\u003ch2 tabindex=\"-1\" class=\"heading-element\" dir=\"auto\"\u003eTestbed Workflow\u003c/h2\u003e\u003ca id=\"user-content-testbed-workflow\" class=\"anchor\" aria-label=\"Permalink: Testbed Workflow\" href=\"#testbed-workflow\"\u003e\u003csvg class=\"octicon octicon-link\" viewBox=\"0 0 16 16\" version=\"1.1\" width=\"16\" height=\"16\" aria-hidden=\"true\"\u003e\u003cpath d=\"m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 1.998 0 0 0 0 2.83Z\"\u003e\u003c/path\u003e\u003c/svg\u003e\u003c/a\u003e\u003c/div\u003e\n\u003cdiv class=\"markdown-heading\" dir=\"auto\"\u003e\u003ch3 tabindex=\"-1\" class=\"heading-element\" dir=\"auto\"\u003e1. Configuration Setup\u003c/h3\u003e\u003ca id=\"user-content-1--configuration-setup\" class=\"anchor\" aria-label=\"Permalink: 1. Configuration Setup\" href=\"#1--configuration-setup\"\u003e\u003csvg class=\"octicon octicon-link\" viewBox=\"0 0 16 16\" version=\"1.1\" width=\"16\" height=\"16\" aria-hidden=\"true\"\u003e\u003cpath d=\"m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 1.998 0 0 0 0 2.83Z\"\u003e\u003c/path\u003e\u003c/svg\u003e\u003c/a\u003e\u003c/div\u003e\n\u003cp dir=\"auto\"\u003ePlease Specify the \u003cstrong\u003econfiguration\u003c/strong\u003e about the benchmark, the index advisor, and the database.\u003c/p\u003e\n\u003cul dir=\"auto\"\u003e\n\u003cli\u003e\n\u003cp dir=\"auto\"\u003e\u003cstrong\u003eBenchmark:\u003c/strong\u003e set the vocabulary (already provided) to generate workloads of \u003cem\u003eQuery Perturbation\u003c/em\u003e\u003c/p\u003e\n\u003c/li\u003e\n\u003cli\u003e\n\u003cp dir=\"auto\"\u003e\u003cstrong\u003eIndex Advisor:\u003c/strong\u003e\u003c/p\u003e\n\u003cul dir=\"auto\"\u003e\n\u003cli\u003eset the configurations for the \u003cstrong\u003eheuristic-based\u003c/strong\u003e index advisors at \u003ccode\u003e/configuration_loader/heu_run_conf\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003eset the configurations for the \u003cstrong\u003elearning-based\u003c/strong\u003e index advisors at \u003ccode\u003e/configuration_loader/rl_run_conf\u003c/code\u003e\u003c/li\u003e\n\u003c/ul\u003e\n\u003c/li\u003e\n\u003c/ul\u003e\n\u003cmarkdown-accessiblity-table\u003e\u003ctable\u003e\n\u003cthead\u003e\n\u003ctr\u003e\n\u003cth\u003eParameter\u003c/th\u003e\n\u003cth\u003eDescription\u003c/th\u003e\n\u003c/tr\u003e\n\u003c/thead\u003e\n\u003ctbody\u003e\n\u003ctr\u003e\n\u003ctd\u003econstraint\u003c/td\u003e\n\u003ctd\u003eThe constraint of the budget type (\u003ccode\u003estorage\u003c/code\u003e by default)\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003ebudget_MB\u003c/td\u003e\n\u003ctd\u003eThe constraint of the storage budget (MB) (\u003ccode\u003e500\u003c/code\u003e by default, valid when \u003ccode\u003econstraint = storage\u003c/code\u003e)\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003emax_indexes\u003c/td\u003e\n\u003ctd\u003eThe constraint of the maximum allowable number (\u003ccode\u003e5\u003c/code\u003e by default, valid when \u003ccode\u003econstraint = number\u003c/code\u003e)\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003emax_index_width\u003c/td\u003e\n\u003ctd\u003eThe constraint of the index width over the considered index candidates (\u003ccode\u003e2\u003c/code\u003e by default)\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/tbody\u003e\n\u003c/table\u003e\u003c/markdown-accessiblity-table\u003e\n\u003cp dir=\"auto\"\u003eThe parameters above are the basic configurations of index advisors. More illustrations about the fine-grained parameters (e.g., the method utilized in each underlying building block) are presented in the running script in \u003cem\u003e\u003cstrong\u003eStep 3. Index Advisor Evaluation\u003c/strong\u003e\u003c/em\u003e.\u003c/p\u003e\n\u003cul dir=\"auto\"\u003e\n\u003cli\u003e\u003cstrong\u003eDatabase:\u003c/strong\u003e set the configurations at \u003ccode\u003econfiguration_loader/databse/db_con.conf\u003c/code\u003e for the connection \u003cstrong\u003eto your own database instance\u003c/strong\u003e\u003c/li\u003e\n\u003c/ul\u003e\n\u003cdiv class=\"snippet-clipboard-content notranslate position-relative overflow-auto\" data-snippet-clipboard-copy-content=\"host = -- your host --\nport = -- your port --\nuser = -- your user --\npassword = -- your password --\ndatabase = -- your database --\"\u003e\u003cpre class=\"notranslate\"\u003e\u003ccode\u003ehost = -- your host --\nport = -- your port --\nuser = -- your user --\npassword = -- your password --\ndatabase = -- your database --\n\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\n\u003cp dir=\"auto\"\u003eApart from that, we provide some files including the statistics of the database / benchmark in the directory. For example, the file \u003ccode\u003econfiguration_loader/database/schema_tpch.json\u003c/code\u003e stores the schema information of the \u003cem\u003eTPC-H\u003c/em\u003e benchmark.\u003c/p\u003e\n\u003cdiv class=\"markdown-heading\" dir=\"auto\"\u003e\u003ch3 tabindex=\"-1\" class=\"heading-element\" dir=\"auto\"\u003e2. Data Preparation\u003c/h3\u003e\u003ca id=\"user-content-2--data-preparation\" class=\"anchor\" aria-label=\"Permalink: 2. Data Preparation\" href=\"#2--data-preparation\"\u003e\u003csvg class=\"octicon octicon-link\" viewBox=\"0 0 16 16\" version=\"1.1\" width=\"16\" height=\"16\" aria-hidden=\"true\"\u003e\u003cpath d=\"m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 1.998 0 0 0 0 2.83Z\"\u003e\u003c/path\u003e\u003c/svg\u003e\u003c/a\u003e\u003c/div\u003e\n\u003cp dir=\"auto\"\u003eThe workload data provided in \u003ccode\u003e/workload_generator\u003c/code\u003e has already been preprocessed, which involves three types of the workloads, i.e., \u003cstrong\u003e(1) template-based\u003c/strong\u003e, \u003cstrong\u003e(2) perturbation-based\u003c/strong\u003e, and \u003cstrong\u003e(3) random\u003c/strong\u003e. These data can be utilized for direct evaluation and you can generate your own workload data organized in the following format.\u003c/p\u003e\n\u003cdiv class=\"highlight highlight-source-sql notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"[\n\t[\n 1,\t\t# query ID\n \u0026quot;SELECT MIN(mc.note) AS production_note, MIN(t.title) AS movie_title ...\u0026quot;,\t# query frequency\n 666\t\t# query frequency\n ],\n ...\n]\"\u003e\u003cpre\u003e[\n\t[\n \u003cspan class=\"pl-c1\"\u003e1\u003c/span\u003e,\t\t\u003cspan class=\"pl-c\"\u003e\u003cspan class=\"pl-c\"\u003e#\u003c/span\u003e query ID\u003c/span\u003e\n \u003cspan class=\"pl-s\"\u003e\u003cspan class=\"pl-pds\"\u003e\"\u003c/span\u003eSELECT MIN(mc.note) AS production_note, MIN(t.title) AS movie_title ...\u003cspan class=\"pl-pds\"\u003e\"\u003c/span\u003e\u003c/span\u003e,\t\u003cspan class=\"pl-c\"\u003e\u003cspan class=\"pl-c\"\u003e#\u003c/span\u003e query frequency\u003c/span\u003e\n \u003cspan class=\"pl-c1\"\u003e666\u003c/span\u003e\t\t\u003cspan class=\"pl-c\"\u003e\u003cspan class=\"pl-c\"\u003e#\u003c/span\u003e query frequency\u003c/span\u003e\n ],\n ...\n]\u003c/pre\u003e\u003c/div\u003e\n\u003cp dir=\"auto\"\u003eFor example, you can generate your own perturbation-based workload, i.e., conduct query changes (e.g., add a new selection predicate) over the given workloads using the file \u003ccode\u003e/workload_generator/gen_workload.py\u003c/code\u003e. It currently supports three perturbation manners with different amplitudes that simulate the typical workload drifts introduced in our paper:\u003c/p\u003e\n\u003cul dir=\"auto\"\u003e\n\u003cli\u003e\u003cstrong\u003eValue Only Perturbation:\u003c/strong\u003e modifications on the predicate values of the query templates with placeholders;\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eColumn Consistent Perturbation:\u003c/strong\u003e modifications on the values and the set same of columns (e.g., change the column order in \u003cem\u003eGROUP BY\u003c/em\u003e clause);\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eShared Table Perturbation:\u003c/strong\u003e modifications on the SQL tokens of the same table (e.g., add a new selection predicate).\u003c/li\u003e\n\u003c/ul\u003e\n\u003cdiv class=\"markdown-heading\" dir=\"auto\"\u003e\u003ch3 tabindex=\"-1\" class=\"heading-element\" dir=\"auto\"\u003e3. Index Advisor Evaluation\u003c/h3\u003e\u003ca id=\"user-content-3--index-advisor-evaluation\" class=\"anchor\" aria-label=\"Permalink: 3. Index Advisor Evaluation\" href=\"#3--index-advisor-evaluation\"\u003e\u003csvg class=\"octicon octicon-link\" viewBox=\"0 0 16 16\" version=\"1.1\" width=\"16\" height=\"16\" aria-hidden=\"true\"\u003e\u003cpath d=\"m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 1.998 0 0 0 0 2.83Z\"\u003e\u003c/path\u003e\u003c/svg\u003e\u003c/a\u003e\u003c/div\u003e\n\u003cp dir=\"auto\"\u003eWith the specified configurations in \u003cem\u003e\u003cstrong\u003eStep 1. Configuration Setup\u003c/strong\u003e\u003c/em\u003e and the prepared data in \u003cem\u003e\u003cstrong\u003eStep 2. Data Preparation\u003c/strong\u003e\u003c/em\u003e, we next proceed to the evaluation of different index advisors.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003eFor example, we can evaluate heuristic-based index advisors with the following script:\u003c/p\u003e\n\u003cmarkdown-accessiblity-table\u003e\u003ctable\u003e\n\u003cthead\u003e\n\u003ctr\u003e\n\u003cth\u003eParameter\u003c/th\u003e\n\u003cth\u003eDescritpiton\u003c/th\u003e\n\u003c/tr\u003e\n\u003c/thead\u003e\n\u003ctbody\u003e\n\u003ctr\u003e\n\u003ctd\u003ecand_gen\u003c/td\u003e\n\u003ctd\u003eThe methods utilized in \u003cem\u003e\u003cstrong\u003eIndex Candidate Generation\u003c/strong\u003 10000 e\u003c/em\u003e building block (\u003ccode\u003e\"permutation\" / \"dqn_rule\" / \"openGauss\"\u003c/code\u003e)\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eest_model\u003c/td\u003e\n\u003ctd\u003eThe methods utilized in \u003cem\u003e\u003cstrong\u003eIndex Benefit Estimation\u003c/strong\u003e\u003c/em\u003e building block (\u003ccode\u003e\"optimizer\" / \"tree\" / \"lib\" / \"queryformer\"\u003c/code\u003e)\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eprocess\u003c/td\u003e\n\u003ctd\u003eThe ddetails of the overall process of \u003cem\u003e\u003cstrong\u003eIndex Selection\u003c/strong\u003e\u003c/em\u003e building block\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eoverhead\u003c/td\u003e\n\u003ctd\u003eThe time overhead spent on each building block\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/tbody\u003e\n\u003c/table\u003e\u003c/markdown-accessiblity-table\u003e\n\u003cdiv class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"python heu_run.py\n\n--res_save /index_advisor_selector/index_selection/heu_res.json\n\n--process --overhead\n\n--sel_params parameters\n--exp_conf_file /configuration_loader/index_advisor/heu_run_conf/{}_config.json\n\n--constraint storage --budget_MB 500\n\n--cand_gen permutation --est_model optimizer\n\n--work_file /workload_generator/template_based/tpch_work_temp_multi.json\n--db_conf_file /configuration_loader/database/db_con.conf\n--schema_file /configuration_loader/database/schema_tpch.json\"\u003e\u003cpre\u003epython heu_run.py\n\n--res_save /index_advisor_selector/index_selection/heu_res.json\n\n--process --overhead\n\n--sel_params parameters\n--exp_conf_file /configuration_loader/index_advisor/heu_run_conf/{}_config.json\n\n--constraint storage --budget_MB 500\n\n--cand_gen permutation --est_model optimizer\n\n--work_file /workload_generator/template_based/tpch_work_temp_multi.json\n--db_conf_file /configuration_loader/database/db_con.conf\n--schema_file /configuration_loader/database/schema_tpch.json\u003c/pre\u003e\u003c/div\u003e\n\u003cp dir=\"auto\"\u003eBesides, we can evaluate learning-based index advisors with the following script:\u003c/p\u003e\n\u003cmarkdown-accessiblity-table\u003e\u003ctable\u003e\n\u003cthead\u003e\n\u003ctr\u003e\n\u003cth\u003eParameter\u003c/th\u003e\n\u003cth\u003eDescritpiton\u003c/th\u003e\n\u003c/tr\u003e\n\u003c/thead\u003e\n\u003ctbody\u003e\n\u003ctr\u003e\n\u003ctd\u003eexp_id\u003c/td\u003e\n\u003ctd\u003eThe experiment ID specified to store the result under \u003ccode\u003e/index_advisor_selector/index_selection/swirl_selection/exp_res\u003c/code\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003ealgo\u003c/td\u003e\n\u003ctd\u003eThe learning-based index advisor to be assessed, i.e., \u003ccode\u003e\"swirl\", \"drlinda\", \"dqn\"\u003c/code\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eworkload_embedder\u003c/td\u003e\n\u003ctd\u003eThe class of the workload representation of the learning-based index advisors\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eobservation_manager\u003c/td\u003e\n\u003ctd\u003eThe class of the state representation of the learning-based index advisors\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eaction_manager\u003c/td\u003e\n\u003ctd\u003eThe class of the action space of the learning-based index advisors\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003ereward_calculator\u003c/td\u003e\n\u003ctd\u003eThe class of the reward function of the learning-based index advisors\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003erl_exp_load\u003c/td\u003e\n\u003ctd\u003eThe configurations of the trained learning-based index advisors\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003erl_model_load\u003c/td\u003e\n\u003ctd\u003eThe agent of the trained learning-based index advisors\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003erl_env_load\u003c/td\u003e\n\u003ctd\u003eThe environment of the trained learning-based index advisors\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/tbody\u003e\n\u003c/table\u003e\u003c/markdown-accessiblity-table\u003e\n\u003cdiv class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"# Training\n\npython swirl_main.py\n\n--algo swirl --exp_id swirl_tpch_v1 \n--timesteps 100000 --seed 666\n\n--constraint storage --max_budgets 500\n\n--exp_conf_file /configuration_loader/index_advisor/rl_run_conf/swirl_config.json\n\n--work_size 18 --work_gen load \n--work_type not_template --temp_num 22\n\n--training_instances 80 --validation_testing_instances 5\n\n--work_file /workload_generator/template_based/tpch_work_temp_multi.json\n--eval_file /workload_generator/template_based/tpch_work_temp_multi.json\n\n--db_conf_file /configuration_loader/database/db_con.conf\n--schema_file /configuration_loader/database/schema_tpch.json\n--colinfo_load /configuration_loader/database/colinfo_tpch.json\n\n-------------------------------------------------------------------------------------------\n\n# Inference\n\npython swirl_run.py\n\n--algo swirl --seed 666\n\n--constraint storage --max_budgets 500\n\n--rl_exp_load /index_advisor_selector/index_selection/swirl_selection/exp_res/swirl_tpch_v1/experiment_object.pickle\n--rl_model_load /index_advisor_selector/index_selection/swirl_selection/exp_res/swirl_tpch_v1/best_mean_reward_model.zip\n--rl_env_load /index_advisor_selector/index_selection/swirl_selection/exp_res/swirl_tpch_v1/vec_normalize.pkl\n\n--work_file /workload_generator/template_based/tpch_work_temp_multi.json\n\n--db_conf_file /configuration_loader/database/db_con.conf\n\"\u003e\u003cpre\u003e\u003cspan class=\"pl-c\"\u003e\u003cspan class=\"pl-c\"\u003e#\u003c/span\u003e Training\u003c/span\u003e\n\npython swirl_main.py\n\n--algo swirl --exp_id swirl_tpch_v1 \n--timesteps 100000 --seed 666\n\n--constraint storage --max_budgets 500\n\n--exp_conf_file /configuration_loader/index_advisor/rl_run_conf/swirl_config.json\n\n--work_size 18 --work_gen load \n--work_type not_template --temp_num 22\n\n--training_instances 80 --validation_testing_instances 5\n\n--work_file /workload_generator/template_based/tpch_work_temp_multi.json\n--eval_file /workload_generator/template_based/tpch_work_temp_multi.json\n\n--db_conf_file /configuration_loader/database/db_con.conf\n--schema_file /configuration_loader/database/schema_tpch.json\n--colinfo_load /configuration_loader/database/colinfo_tpch.json\n\n-------------------------------------------------------------------------------------------\n\n\u003cspan class=\"pl-c\"\u003e\u003cspan class=\"pl-c\"\u003e#\u003c/span\u003e Inference\u003c/span\u003e\n\npython swirl_run.py\n\n--algo swirl --seed 666\n\n--constraint storage --max_budgets 500\n\n--rl_exp_load /index_advisor_selector/index_selection/swirl_selection/exp_res/swirl_tpch_v1/experiment_object.pickle\n--rl_model_load /index_advisor_selector/index_selection/swirl_selection/exp_res/swirl_tpch_v1/best_mean_reward_model.zip\n--rl_env_load /index_advisor_selector/index_selection/swirl_selection/exp_res/swirl_tpch_v1/vec_normalize.pkl\n\n--work_file /workload_generator/template_based/tpch_work_temp_multi.json\n\n--db_conf_file /configuration_loader/database/db_con.conf\n\u003c/pre\u003e\u003c/div\u003e\n\u003cdiv class=\"markdown-heading\" dir=\"auto\"\u003e\u003ch2 tabindex=\"-1\" class=\"heading-element\" dir=\"auto\"\u003eReference\u003c/h2\u003e\u003ca id=\"user-content-reference\" class=\"anchor\" aria-label=\"Permalink: Reference\" href=\"#reference\"\u003e\u003csvg class=\"octicon octicon-link\" viewBox=\"0 0 16 16\" version=\"1.1\" width=\"16\" height=\"16\" aria-hidden=\"true\"\u003e\u003cpath d=\"m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 1.998 0 0 0 0 2.83Z\"\u003e\u003c/path\u003e\u003c/svg\u003e\u003c/a\u003e\u003c/div\u003e\n\u003cp dir=\"auto\"\u003e\u003cstrong\u003eWe sincerely appreciate the authors of the following work for their efforts over the research of index advisors assessed in our work !\u003c/strong\u003e\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[1] Rainer Schlosser, Jan Kossmann, and Martin Boissier. 2019. \u003cem\u003eEfficient Scalable Multi-attribute Index Selection Using Recursive Strategies\u003c/em\u003e. In ICDE. 1238–1249.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[2] Gary Valentin, Michael Zuliani, Daniel C. Zilio, Guy M. Lohman, and Alan Skelley. 2000. \u003cem\u003eDB2 Advisor: An Optimizer Smart Enough to Recommend Its Own Indexes\u003c/em\u003e. In ICDE. 101–110.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[3] Surajit Chaudhuri and Vivek R. Narasayya. 1997. An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server. In VLDB. 146–155.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[4] Kyu-Young Whang. 1987. Index Selection in Relational Databases. Foundations of Data Organization (1987), 487–500.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[5] Nicolas Bruno and Surajit Chaudhuri. 2005. Automatic Physical Database Tuning: A Relaxation-based Approach. In SIGMOD Conference. 227–238.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[6] S. Chaudhuri and V. Narasayya. 2020. Anytime Algorithm of Database Tuning Advisor for Microsoft SQL Server. \u003ca href=\"https://www.microsoft.com/en-us/research/publication/\" rel=\"nofollow\"\u003ehttps://www.microsoft.com/en-us/research/publication/\u003c/a\u003e.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[7] Alberto Caprara, Matteo Fischetti, and Dario Maio. 1995. Exact and Approximate Algorithms for the Index Selection Problem in Physical Database Design. IEEE Trans. Knowl. Data Eng. 7, 6 (1995), 955–967.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[8] Debabrata Dash, Neoklis Polyzotis, and Anastasia Ailamaki. 2011. CoPhy: A Scalable, Portable, and Interactive Index Advisor for Large Workloads. Proc. VLDB Endow. 4, 6 (2011), 362–372.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[9] Jan Kossmann, Alexander Kastius, and Rainer Schlosser. 2022. SWIRL: Selection of Workload-aware Indexes using Reinforcement Learning. In EDBT. 2:155–2:168.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[10] Zahra Sadri, Le Gruenwald, and Eleazar Leal. 2020. DRLindex: deep reinforcement learning index advisor for a cluster database. In IDEAS. 11:1–11:8.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[11] Hai Lan, Zhifeng Bao, and Yuwei Peng. 2020. An Index Advisor Using Deep Reinforcement Learning. In CIKM. 2105–2108.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[12] R. Malinga Perera, Bastian Oetomo, Benjamin I. P. Rubinstein, and Renata Borovica-Gajic. 2021. DBA bandits: Self-driving index tuning under ad-hoc, analytical workloads with safety guarantees. In ICDE. 600–611.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[13] Xuanhe Zhou, Luyang Liu, Wenbo Li, Lianyuan Jin, Shifu Li, Tianqing Wang, and Jianhua Feng. 2022. AutoIndex: An Incremental Index Management System for Dynamic Workloads. In ICDE. 2196–2208.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[14] Wentao Wu, Chi Wang, Tarique Siddiqui, Junxiong Wang, Vivek R. Narasayya, Surajit Chaudhuri, and Philip A. Bernstein. 2022. Budget-aware Index Tuning with Reinforcement Learning. In SIGMOD Conference. 1528–1541.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[15] Bailu Ding, Sudipto Das, Ryan Marcus, Wentao Wu, Surajit Chaudhuri, and Vivek R. Narasayya. 2019. AI Meets AI: Leveraging Query Executions to Improve Index Recommendations. In SIGMOD Conference. 1241–1258.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[16] Tarique Siddiqui, Wentao Wu, Vivek R. Narasayya, and Surajit Chaudhuri. 2022. DISTILL: Low-Overhead Data-Driven Techniques for Filtering and Costing Indexes for Scalable Index Tuning. Proc. VLDB Endow. 15, 10 (2022), 2019–2031.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[17] Jiachen Shi, Gao Cong, and Xiaoli Li. 2022. Learned Index Benefits: Machine Learning Based Index Performance Estimation. Proc. VLDB Endow. 15, 13 (2022), 3950–3962.\u003c/p\u003e\n\u003cp dir=\"auto\"\u003e[18] Yue Zhao, Gao Cong, Jiachen Shi, and Chunyan Miao. 2022. QueryFormer: A Tree Transformer Model for Query Plan Representation. Proc. VLDB Endow. 15, 8 (2022), 1658–1670.\u003c/p\u003e\n\u003c/article\u003e","loaded":true,"timedOut":false,"errorMessage":null,"headerInfo":{"toc":[{"level":1,"text":"Index Advisor [Experiment, Analysis \u0026 Benchmark]","anchor":"index-advisor-experiment-analysis--benchmark","htmlText":"Index Advisor [Experiment, Analysis \u0026amp; Benchmark]"},{"level":2,"text":"Project Structure","anchor":"project-structure","htmlText":"Project Structure"},{"level":2,"text":"Setup","anchor":"setup","htmlText":"Setup"},{"level":2,"text":"Testbed Workflow","anchor":"testbed-workflow","htmlText":"Testbed Workflow"},{"level":3,"text":"1. Configuration Setup","anchor":"1--configuration-setup","htmlText":"1. Configuration Setup"},{"level":3,"text":"2. Data Preparation","anchor":"2--data-preparation","htmlText":"2. Data Preparation"},{"level":3,"text":"3. Index Advisor Evaluation","anchor":"3--index-advisor-evaluation","htmlText":"3. Index Advisor Evaluation"},{"level":2,"text":"Reference","anchor":"reference","htmlText":"Reference"}],"siteNavLoginPath":"/login?return_to=https%3A%2F%2Fgithub.com%2Fstarsflows%2FIndex_EAB"}}],"overviewFilesProcessingTime":0}},"appPayload":{"helpUrl":"https://docs.github.com","findFileWorkerPath":"/assets-cdn/worker/find-file-worker-263cab1760dd.js","findInFileWorkerPath":"/assets-cdn/worker/find-in-file-worker-2e7f7047116e.js","githubDevUrl":null,"enabled_features":{"copilot_workspace":null,"code_nav_ui_events":false,"react_blob_overlay":false,"accessible_code_button":true,"github_models_repo_integration":false}}}}
 
 

Repository files navigation

Index Advisor [Experiment, Analysis & Benchmark]

Note that the link of the new actively maintained repository is:

https://github.com/Beliefuture/Index_EAB/

Please "STAR" the new repository if you find this project useful!

This is the code respository of the testbed proposed in the Index Advisor (EA&B) paper, which conducts a comprehensive assessment of the heuristic-based and the learning-based index advisors.

*Note that we have released a python package about this testbed at the official open-source website: index-eab · PyPI. Therefore, you can use the follwoing command to download it with little effort!

pip install index-eab==0.1.0

image-20240107023318952

Project Structure

Specifically, the testbed is comprised of three modules.

  • (1) Configuration Loader: initializes a series of evaluation settings, including the benchmark, the index advisor, and the database;
  • (2) Workload Generator: supports three methods for generating workloads with diverse features (e.g., query changes due to typical workload drifts) to simulate the requirements posed by various scenarios;
  • (3) Index Advisor Selector: implements existing index advisors, including seven heuristic-based index advisors and ten learning-based index advisors.

The overall code structure of our Index Advisor (EA&B) project, where the critical files are marked with additional comments.

Index_EAB/
├── configuration_loader			 											# Module 1: the evaluation settings                 
│   ├── becnhmark
│   ├── index_advisor
│   │   ├── heu_run_conf
│   │   │   ├──xxxx_config.json  # configurations of heuristic-based index advisors
│   │   │   └── ...
│   │   └── rl_run_conf
│   │   │   ├──xxxx_config.json  # configurations of learning-based index advisors
│   │   │   └── ...
│   └── database
│   │   ├── db_con.conf  # configurations of database connection
│   │   └── ...
├── workload_generator				 											# Module 2: the testing workloads
│   ├── template_based  # workload from template-based generation
│   ├── perturbation_based  # workload from perturbation-based generation
│   │   ├── perturb_utils
│   │   └── ...
│   ├── random  # workload from random generation
│   └── gen_workload.py
├── index_advisor_selector			 											# Module 3: the implemented index advisors
│   ├── index_candidate_generation
│   │   └── distill_model
│   │       ├── distill_utils
│   │       │	├── distill_com.py  # parameters of learned filter model
│   │       │	└── ...
│   │       └── ...
│   ├── index_selection
│   │   ├── heu_selection
│   │   │   ├── heu_utils
│   │   │   │   ├── heu_com.py  # parameters of heuristic-based index ad
8000
visors
│   │   │   │   └── ... 
│   │   │   ├── heu_algos
│   │   │   ├── heu_run.py  # entrance of heuristic-based index advisors
│   │   │   └── ...
│   │   ├── dqn_selection
│   │   │   └── dqn_utils
│   │   │   ├── dqn_run.py  # inference entrance of learning-based index advisors
│   │   │   └── ...
│   │   ├── swirl_selection
│   │   │   ├── swirl_utils
│   │   │   │   ├── swirl_com.py  # parameters of learning-based index advisors
│   │   │   │   └── ... 
│   │   │   ├── gym_db
│   │   │   ├── stable_baselines
│   │   │   ├── swirl_main.py  # training entrance of learning-based index advisors
│   │   │   ├── swirl_run.py  # inference entrance of learning-based index advisors
│   │   │   └── ...
│   │   ├── mab_selection
│   │   │   ├── bandits
│   │   │   ├── simualtion
│   │   │   ├── shared
│   │   │   │   ├── mab_com.py  # parameters of learning-based index advisors
│   │   │   │   └── ...
│   │   │   ├── database
│   │   │   ├── mab_run.py  # inference entrance of learning-based index advisors
│   │   │   └── ... 
│   │   └── mcts_selection
│   │       ├── mcts_utils
│   │       │   ├── mcts_com.py  # parameters of learning-based index advisors
│   │       │   └── ... 
│   │       ├── mcts_run.py  # inference entrance of learning-based index advisors
│   │       └── ... 
└── ├── index_benefit_estimation
    └── ├── benefit_utils
        ├── optmizer_cost
        │   ├── optimizer_utils
        │   │   ├── optimizer_com.py  # parameters of statistic-based method
        │   │   └── ... 
        │   ├── optimizer_train.py  # training entrance of statistic-based method
        │   ├── optimizer_infer.py  # inference entrance of statistic-based method
        │   └── ... 
        ├── tree_model
        │   ├── tree_cost_utils
        │   │   ├── tree_cost_com.py  # parameters of learned estimation model
        │   │   └── ... 
        │   ├── tree_cost_main.py  # training entrance of learned estimation model
        │   ├── tree_cost_infer.py  # inference entrance of learned estimation model
        │   └── ... 
        ├── index_cost_lib
        │   ├── lib_train.py  # training entrance of learned estimation model
        │   ├── lib_infer.py  # inference entrance of learned estimation model
        │   └── ... 
        └── query_former
            ├── former_train.py  # training entrance of learned estimation model
            ├── former_infer.py  # inference entrance of learned estimation model
            └── ... 

Setup

We introduce the indispensable step, i.e., experiment setup for the experiment evaluations, you should check the following things:

  • Create the database instance according to the provided toolkit;
  • Create the HypoPG extension on the database instance for the usage of hypothetical index according to HypoPG/hypopg: Hypothetical Indexes for PostgreSQL (github.com);
  • Create the python virtual environment. Specifically, you can utilize the following script and the corresponding file requirements.txt is provided under the main directory. Please check the packages required are properly installed.
# Create the virtualenv `TRAP`
conda create -n Index_EAB python=3.7		 	

# Activate the virtualenv `TRAP`
conda activate Index_EAB				

# Install requirements with pip
while read requirement; do pip install $requirement; done < requirements.txt	

Testbed Workflow

1. Configuration Setup

Please Specify the configuration about the benchmark, the index advisor, and the database.

  • Benchmark: set the vocabulary (already provided) to generate workloads of Query Perturbation

  • Index Advisor:

    • set the configurations for the heuristic-based index advisors at /configuration_loader/heu_run_conf
    • set the configurations for the learning-based index advisors at /configuration_loader/rl_run_conf
Parameter Description
constraint The constraint of the budget type (storage by default)
budget_MB The constraint of the storage budget (MB) (500 by default, valid when constraint = storage)
max_indexes The constraint of the maximum allowable number (5 by default, valid when constraint = number)
max_index_width The constraint of the index width over the considered index candidates (2 by default)

The parameters above are the basic configurations of index advisors. More illustrations about the fine-grained parameters (e.g., the method utilized in each underlying building block) are presented in the running script in Step 3. Index Advisor Evaluation.

  • Database: set the configurations at configuration_loader/databse/db_con.conf for the connection to your own database instance
host = -- your host --
port = -- your port --
user = -- your user --
password = -- your password --
database = -- your database --

Apart from that, we provide some files including the statistics of the database / benchmark in the directory. For example, the file configuration_loader/database/schema_tpch.json stores the schema information of the TPC-H benchmark.

2. Data Preparation

The workload data provided in /workload_generator has already been preprocessed, which involves three types of the workloads, i.e., (1) template-based, (2) perturbation-based, and (3) random. These data can be utilized for direct evaluation and you can generate your own workload data organized in the following format.

[
	[
        1,		# query ID
        "SELECT MIN(mc.note) AS production_note, MIN(t.title) AS movie_title ...",	# query frequency
        666		# query frequency
    ],
    ...
]

For example, you can generate your own perturbation-based workload, i.e., conduct query changes (e.g., add a new selection predicate) over the given workloads using the file /workload_generator/gen_workload.py. It currently supports three perturbation manners with different amplitudes that simulate the typical workload drifts introduced in our paper:

  • Value Only Perturbation: modifications on the predicate values of the query templates with placeholders;
  • Column Consistent Perturbation: modifications on the values and the set same of columns (e.g., change the column order in GROUP BY clause);
  • Shared Table Perturbation: modifications on the SQL tokens of the same table (e.g., add a new selection predicate).

3. Index Advisor Evaluation

With the specified configurations in Step 1. Configuration Setup and the prepared data in Step 2. Data Preparation, we next proceed to the evaluation of different index advisors.

For example, we can evaluate heuristic-based index advisors with the following script:

Parameter Descritpiton
cand_gen The methods utilized in Index Candidate Generation building block ("permutation" / "dqn_rule" / "openGauss")
est_model The methods utilized in Index Benefit Estimation building block ("optimizer" / "tree" / "lib" / "queryformer")
process The ddetails of the overall process of Index Selection building block
overhead The time overhead spent on each building block
python heu_run.py

--res_save /index_advisor_selector/index_selection/heu_res.json

--process --overhead

--sel_params parameters
--exp_conf_file /configuration_loader/index_advisor/heu_run_conf/{}_config.json

--constraint storage --budget_MB 500

--cand_gen permutation --est_model optimizer

--work_file /workload_generator/template_based/tpch_work_temp_multi.json
--db_conf_file /configuration_loader/database/db_con.conf
--schema_file /configuration_loader/database/schema_tpch.json

Besides, we can evaluate learning-based index advisors with the following script:

Parameter Descritpiton
exp_id The experiment ID specified to store the result under /index_advisor_selector/index_selection/swirl_selection/exp_res
algo The learning-based index advisor to be assessed, i.e., "swirl", "drlinda", "dqn"
workload_embedder The class of the workload representation of the learning-based index advisors
observation_manager The class of the state representation of the learning-based index advisors
action_manager The class of the action space of the learning-based index advisors
reward_calculator The class of the reward function of the learning-based index advisors
rl_exp_load The configurations of the trained learning-based index advisors
rl_model_load The agent of the trained learning-based index advisors
rl_env_load The environment of the trained learning-based index advisors
# Training

python swirl_main.py

--algo swirl --exp_id swirl_tpch_v1 
--timesteps 100000 --seed 666

--constraint storage --max_budgets 500

--exp_conf_file /configuration_loader/index_advisor/rl_run_conf/swirl_config.json

--work_size 18 --work_gen load 
--work_type not_template --temp_num 22

--training_instances 80 --validation_testing_instances 5

--work_file /workload_generator/template_based/tpch_work_temp_multi.json
--eval_file /workload_generator/template_based/tpch_work_temp_multi.json

--db_conf_file /configuration_loader/database/db_con.conf
--schema_file /configuration_loader/database/schema_tpch.json
--colinfo_load /configuration_loader/database/colinfo_tpch.json

-------------------------------------------------------------------------------------------

# Inference

python swirl_run.py

--algo swirl --seed 666

--constraint storage --max_budgets 500

--rl_exp_load /index_advisor_selector/index_selection/swirl_selection/exp_res/swirl_tpch_v1/experiment_object.pickle
--rl_model_load /index_advisor_selector/index_selection/swirl_selection/exp_res/swirl_tpch_v1/best_mean_reward_model.zip
--rl_env_load /index_advisor_selector/index_selection/swirl_selection/exp_res/swirl_tpch_v1/vec_normalize.pkl

--work_file /workload_generator/template_based/tpch_work_temp_multi.json

--db_conf_file /configuration_loader/database/db_con.conf

Reference

We sincerely appreciate the authors of the following work for their efforts over the research of index advisors assessed in our work !

[1] Rainer Schlosser, Jan Kossmann, and Martin Boissier. 2019. Efficient Scalable Multi-attribute Index Selection Using Recursive Strategies. In ICDE. 1238–1249.

[2] Gary Valentin, Michael Zuliani, Daniel C. Zilio, Guy M. Lohman, and Alan Skelley. 2000. DB2 Advisor: An Optimizer Smart Enough to Recommend Its Own Indexes. In ICDE. 101–110.

[3] Surajit Chaudhuri and Vivek R. Narasayya. 1997. An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server. In VLDB. 146–155.

[4] Kyu-Young Whang. 1987. Index Selection in Relational Databases. Foundations of Data Organization (1987), 487–500.

[5] Nicolas Bruno and Surajit Chaudhuri. 2005. Automatic Physical Database Tuning: A Relaxation-based Approach. In SIGMOD Conference. 227–238.

[6] S. Chaudhuri and V. Narasayya. 2020. Anytime Algorithm of Database Tuning Advisor for Microsoft SQL Server. https://www.microsoft.com/en-us/research/publication/.

[7] Alberto Caprara, Matteo Fischetti, and Dario Maio. 1995. Exact and Approximate Algorithms for the Index Selection Problem in Physical Database Design. IEEE Trans. Knowl. Data Eng. 7, 6 (1995), 955–967.

[8] Debabrata Dash, Neoklis Polyzotis, and Anastasia Ailamaki. 2011. CoPhy: A Scalable, Portable, and Interactive Index Advisor for Large Workloads. Proc. VLDB Endow. 4, 6 (2011), 362–372.

[9] Jan Kossmann, Alexander Kastius, and Rainer Schlosser. 2022. SWIRL: Selection of Workload-aware Indexes using Reinforcement Learning. In EDBT. 2:155–2:168.

[10] Zahra Sadri, Le Gruenwald, and Eleazar Leal. 2020. DRLindex: deep reinforcement learning index advisor for a cluster database. In IDEAS. 11:1–11:8.

[11] Hai Lan, Zhifeng Bao, and Yuwei Peng. 2020. An Index Advisor Using Deep Reinforcement Learning. In CIKM. 2105–2108.

[12] R. Malinga Perera, Bastian Oetomo, Benjamin I. P. Rubinstein, and Renata Borovica-Gajic. 2021. DBA bandits: Self-driving index tuning under ad-hoc, analytical workloads with safety guarantees. In ICDE. 600–611.

[13] Xuanhe Zhou, Luyang Liu, Wenbo Li, Lianyuan Jin, Shifu Li, Tianqing Wang, and Jianhua Feng. 2022. AutoIndex: An Incremental Index Management System for Dynamic Workloads. In ICDE. 2196–2208.

[14] Wentao Wu, Chi Wang, Tarique Siddiqui, Junxiong Wang, Vivek R. Narasayya, Surajit Chaudhuri, and Philip A. Bernstein. 2022. Budget-aware Index Tuning with Reinforcement Learning. In SIGMOD Conference. 1528–1541.

[15] Bailu Ding, Sudipto Das, Ryan Marcus, Wentao Wu, Surajit Chaudhuri, and Vivek R. Narasayya. 2019. AI Meets AI: Leveraging Query Executions to Improve Index Recommendations. In SIGMOD Conference. 1241–1258.

[16] Tarique Siddiqui, Wentao Wu, Vivek R. Narasayya, and Surajit Chaudhuri. 2022. DISTILL: Low-Overhead Data-Driven Techniques for Filtering and Costing Indexes for Scalable Index Tuning. Proc. VLDB Endow. 15, 10 (2022), 2019–2031.

[17] Jiachen Shi, Gao Cong, and Xiaoli Li. 2022. Learned Index Benefits: Machine Learning Based Index Performance Estimation. Proc. VLDB Endow. 15, 13 (2022), 3950–3962.

[18] Yue Zhao, Gao Cong, Jiachen Shi, and Chunyan Miao. 2022. QueryFormer: A Tree Transformer Model for Query Plan Representation. Proc. VLDB Endow. 15, 8 (2022), 1658–1670.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 99.9%
  • AMPL 0.1%
0