Introduce new options for estimating table sizes by hanefi · Pull Request #793 · dimitri/pgcopydb · GitHub

Introduce new options for estimating table sizes #793

Merged (3 commits) on May 29, 2024

hanefi (Contributor) commented May 29, 2024

This commit introduces a new option --estimate-table-sizes. When set, pgcopydb uses the relpages value in the pg_class system catalog to estimate the size of each table, multiplying the number of pages by the page size. PostgreSQL uses a default page size of 8KB, but this can be changed at build time.
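
For illustration, the estimate described above can be sketched in C as follows. This is a minimal sketch, not the code from this PR: the helper name estimate_table_bytes and the DEFAULT_BLOCK_SIZE macro are hypothetical, and the sketch assumes the caller has already read relpages from pg_class and obtained the server's block size (for example with SHOW block_size).

```c
#include <stdint.h>

/*
 * PostgreSQL's default block size in bytes. A server built with a
 * different block size reports its actual value via "SHOW block_size".
 */
#define DEFAULT_BLOCK_SIZE 8192

/*
 * Hypothetical helper (not part of pgcopydb): estimate the on-disk size
 * of a table as relpages * block_size. Returns 0 for invalid inputs.
 */
static int64_t
estimate_table_bytes(int64_t relpages, int64_t block_size)
{
	if (relpages < 0 || block_size <= 0)
	{
		return 0;
	}

	return relpages * block_size;
}
```

With the default block size, a table whose relpages is 1000 would be estimated at 8,192,000 bytes. The result is only as fresh as the catalog statistics it is computed from.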

Optionally, the user can set the environment variable PGCOPYDB_ESTIMATE_TABLE_SIZES to a boolean value to enable the option.
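
A rough sketch of how a boolean environment value might be interpreted is shown below. The helper parse_bool_setting and the set of accepted spellings are assumptions made for illustration; the exact values pgcopydb accepts are not stated in this PR.

```c
#include <stdbool.h>
#include <stddef.h>
#include <strings.h> /* strcasecmp (POSIX) */

/*
 * Hypothetical helper (not pgcopydb's own parser): interpret a textual
 * boolean. Returns true when the value was recognized and stores the
 * parsed boolean in *result; returns false otherwise.
 */
static bool
parse_bool_setting(const char *value, bool *result)
{
	/* Accepted spellings are an assumption for this sketch. */
	static const char *truthy[] = { "true", "yes", "on", "1", NULL };
	static const char *falsy[] = { "false", "no", "off", "0", NULL };

	if (value == NULL || result == NULL)
	{
		return false;
	}

	for (int i = 0; truthy[i] != NULL; i++)
	{
		if (strcasecmp(value, truthy[i]) == 0)
		{
			*result = true;
			return true;
		}
	}

	for (int i = 0; falsy[i] != NULL; i++)
	{
		if (strcasecmp(value, falsy[i]) == 0)
		{
			*result = false;
			return true;
		}
	}

	return false;
}
```

A caller would typically pass the result of getenv("PGCOPYDB_ESTIMATE_TABLE_SIZES") from <stdlib.h> and fall back to the command line default when the variable is unset or unrecognized.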

If this option is used, we run vacuumdb --analyze-only on the source to update the relpages values before calculating the estimates.
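
To make the vacuumdb step more concrete, here is a sketch of assembling the argument vector for such a call. The helper build_analyze_argv and the VACUUMDB_ARGV_SIZE macro are hypothetical, and the actual command line pgcopydb builds may carry additional options; only --analyze-only and --dbname, which are standard vacuumdb options, are assumed here.

```c
#include <stddef.h>

/* Four arguments plus the terminating NULL expected by execvp(). */
#define VACUUMDB_ARGV_SIZE 5

/*
 * Hypothetical helper (not part of pgcopydb): fill argv for
 *   vacuumdb --analyze-only --dbname <source_uri>
 * Returns the number of arguments written, or -1 on invalid input.
 * Passing the URI as its own argv entry avoids shell quoting issues.
 */
static int
build_analyze_argv(const char *source_uri,
				   const char *argv[VACUUMDB_ARGV_SIZE])
{
	if (source_uri == NULL || argv == NULL)
	{
		return -1;
	}

	argv[0] = "vacuumdb";
	argv[1] = "--analyze-only";
	argv[2] = "--dbname";
	argv[3] = source_uri;
	argv[4] = NULL;

	return 4;
}
```

The resulting array can be handed to execvp() or a similar process-spawning call. Because the command runs against the source database, it is worth documenting for users.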

In passing, I fixed several things including the following:

  • Fix comment references to pg_autoctl
  • Remove some unused functions
  • Remove reference to non-existing vacuumdb command
  • Remove --cache and --drop-cache references that are no longer used
  • Fix typos in vacuum logs and comments
  • Remove references to pgcopydb_table_size table

Closes #775

hanefi added 2 commits May 29, 2024 18:47
dimitri added the enhancement (New feature or request) label May 29, 2024
dimitri added this to the v0.17 milestone May 29, 2024
dimitri self-requested a review May 29, 2024 15:55
dimitri (Owner) left a comment

Minor change needed, and documentation coverage of the new option.

@@ -729,6 +729,16 @@ PGCOPYDB_SPLIT_TABLES_LARGER_THAN
When ``--split-tables-larger-than`` is omitted from the command line,
then this environment variable is used.

PGCOPYDB_ESTIMATE_TABLE_SIZES
Owner

Actual documentation coverage for the option is missing, see the option list at https://pgcopydb.readthedocs.io/en/latest/ref/pgcopydb_clone.html#options. In particular, we need to document the vacuumdb call that is implemented when using that option, as that's not transparent to the user at all.

Comment on lines 4888 to 4889
int fnbytesestimate = PQfnumber(result, "bytesestimate");
int fnbytesestimatepp = PQfnumber(result, "pg_size_pretty");
Owner

Nitpicking: could we place these in the SQL query column list order?

hanefi requested a review from dimitri May 29, 2024 16:29
dimitri merged commit b0ce438 into dimitri:main May 29, 2024
18 checks passed
hanefi deleted the estimate-table-sizes branch June 6, 2024 13:02