[Enhancement] Add more informations for information_schema.task_runs by LiShuMing · Pull Request #60054 · StarRocks/starrocks · GitHub

[Enhancement] Add more informations for information_schema.task_runs #60054

Open
LiShuMing wants to merge 4 commits into main from enhance_task_run_display

@LiShuMing (Contributor) commented Jun 19, 2025

Why I'm doing:

  1. We cannot determine a task run's pending time, that is, the time between when the task run is created and when it starts processing, from information_schema.task_runs or information_schema.materialized_views.
  2. For MV refresh tasks, one MV refresh may trigger multiple task runs, each of which processes only one to-refresh partition, but we cannot tell which task runs belong to the same job (the MV refresh task).

So I added more information covering these two situations.
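
To illustrate the first point, here is a minimal sketch of how a pending time could be derived once a process-start timestamp is exposed. It assumes that CREATE_TIME marks when the task run was created and that the new PROCESS_TIME column marks when processing began; the class and method names are hypothetical and are not part of this PR.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.Optional;

// Hypothetical helper, not StarRocks code: derives a task run's pending time
// from two timestamps read from information_schema.task_runs.
final class PendingTimeSketch {

    /**
     * Returns the time a task run spent waiting before processing started.
     * Assumption: createTime maps to CREATE_TIME and processTime maps to the
     * new PROCESS_TIME column. An empty result means the run has not started.
     */
    static Optional<Duration> pendingTime(LocalDateTime createTime, LocalDateTime processTime) {
        if (processTime == null) {
            return Optional.empty(); // still pending, no process time recorded yet
        }
        return Optional.of(Duration.between(createTime, processTime));
    }

    public static void main(String[] args) {
        LocalDateTime created = LocalDateTime.of(2025, 6, 19, 4, 44, 0);
        LocalDateTime started = created.plusSeconds(30);
        // Prints 30: the run waited 30 seconds before it started processing.
        System.out.println(pendingTime(created, started).get().getSeconds());
    }
}
```

Returning an empty Optional for a run that has not started avoids reporting a misleading zero or negative pending time.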

What I'm doing:

This pull request introduces enhancements to the schema scanner and materialized views system, focusing on adding new fields, improving data type consistency, and refining the handling of materialized view refresh statuses. The changes span multiple files and primarily affect the backend schema scanner, frontend catalog system, and materialized view status representation.

Backend Schema Scanner Enhancements:

  • Added new columns LAST_REFRESH_PROCESS_TIME and LAST_REFRESH_JOB_ID to the SchemaMaterializedViewsScanner and updated the fill_chunk method to include these fields. [1] [2]
  • Added new columns JOB_ID, JOB_STATE, and PROCESS_TIME to the SchemaTaskRunsScanner, with corresponding updates to the fill_chunk method to handle these fields. [1] [2]

Frontend Catalog System Updates:

  • Updated MaterializedViewsSystemTable to include new columns (LAST_REFRESH_PROCESS_TIME, LAST_REFRESH_JOB_ID, etc.) and changed data types for several existing columns (e.g., MATERIALIZED_VIEW_ID to BIGINT, LAST_REFRESH_DURATION to DOUBLE). [1] [2]
  • Enhanced TaskRunsSystemTable by adding JOB_ID, JOB_STATE, and PROCESS_TIME columns, along with corresponding data type adjustments. [1] [2]
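
As a rough illustration of what the new JOB_ID column makes possible, the sketch below groups task runs that share a job ID, so that the per-partition runs of one MV refresh can be viewed together. The TaskRunRow class, its fields, and the treatment of the job ID as an opaque string are assumptions for illustration only, not the PR's actual types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not StarRocks code: groups task runs by job ID.
final class JobGroupingSketch {

    /** Minimal stand-in for one row of information_schema.task_runs. */
    static final class TaskRunRow {
        final String queryId; // identifies a single task run
        final String jobId;   // assumed to be shared by all runs of one refresh job

        TaskRunRow(String queryId, String jobId) {
            this.queryId = queryId;
            this.jobId = jobId;
        }
    }

    /** Maps each job ID to the query IDs of its task runs, preserving input order. */
    static Map<String, List<String>> groupByJobId(List<TaskRunRow> rows) {
        Map<String, List<String>> byJob = new LinkedHashMap<>();
        for (TaskRunRow row : rows) {
            byJob.computeIfAbsent(row.jobId, k -> new ArrayList<>()).add(row.queryId);
        }
        return byJob;
    }

    public static void main(String[] args) {
        List<TaskRunRow> rows = Arrays.asList(
                new TaskRunRow("q1", "job-a"),
                new TaskRunRow("q2", "job-a"),
                new TaskRunRow("q3", "job-b"));
        // Prints {job-a=[q1, q2], job-b=[q3]}
        System.out.println(groupByJobId(rows));
    }
}
```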

Materialized View Status Improvements:

  • Added fields jobId and mvRefreshProcessTime to ShowMaterializedViewStatus to track job-specific and process-related information. [1] [2] [3]
  • Updated methods in ShowMaterializedViewStatus to populate and display the new fields (jobId, mvRefreshProcessTime) in both thrift and result set representations. [1] [2] [3]

Data Type Consistency:

  • Standardized data types in ShowMaterializedViewsStmt and MaterializedViewsSystemTable, converting several fields (e.g., id, task_id, rows) to BIGINT and others (e.g., last_refresh_duration) to more appropriate types like DOUBLE or DATETIME. [1] [2]

TaskRun Status Refinement:

  • Enhanced TaskRunStatus to better handle refresh states for materialized views, including distinguishing between running and finished states. [1] [2]

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.5
    • 3.4
    • 3.3

@LiShuMing LiShuMing requested review from a team as code owners June 19, 2025 04:44
@LiShuMing LiShuMing changed the title [Enhancement] Add more infomations for information_schema.task_runs [Enhancement] Add more informations for information_schema.task_runs Jun 19, 2025
@wanpengfei-git wanpengfei-git requested a review from a team June 19, 2025 04:45
// process finish time
addField(resultRow, TimeUtils.longToTimeString(refreshJobStatus.getMvRefreshEndTime()));
// process duration
addField(resultRow, formatDuration(refreshJobStatus.getTotalProcessDuration()));
// last refresh job id
addField(resultRow, refreshJobStatus.getJobId());
// last refresh state
addField(resultRow, refreshJobStatus.getRefreshState());
// whether it's force refresh

The most risky bug in this code is:
Assigning mvRefreshStartTime to mvRefreshProcessTime, potentially causing misleading timestamps.

You can modify the code like this:

// start time
long mvRefreshCreateTime = firstTaskRunStatus.getCreateTime();
status.setMvRefreshStartTime(mvRefreshCreateTime);

// process time - assumed it should be a different timestamp, correcting that
long mvRefreshProcessTime = firstTaskRunStatus.getProcessStartTime();
status.setMvRefreshProcessTime(mvRefreshProcessTime);

Ensure that getCreateTime() and getProcessStartTime() return the expected timestamps for their respective usage.

@github-actions github-actions bot added the 3.5 label Jun 19, 2025
@alvin-celerdata (Contributor)

@LiShuMing, this will change the user interface, so please don't cherry-pick it to v3.5.

@LiShuMing LiShuMing force-pushed the enhance_task_run_display branch from a643e7d to 91006b8 Compare June 19, 2025 15:20
@wanpengfei-git wanpengfei-git requested a review from a team June 19, 2025 15:20
Signed-off-by: shuming.li <ming.moriarty@gmail.com>
Signed-off-by: shuming.li <ming.moriarty@gmail.com>
Signed-off-by: shuming.li <ming.moriarty@gmail.com>
@LiShuMing LiShuMing force-pushed the enhance_task_run_display branch from 91006b8 to b4cef8d Compare June 20, 2025 10:17
Signed-off-by: shuming.li <ming.moriarty@gmail.com>
@LiShuMing LiShuMing requested a review from a team as a code owner June 20, 2025 14:01
@LiShuMing LiShuMing force-pushed the enhance_task_run_display branch from 34f1fdb to fdff3f7 Compare June 21, 2025 00:57

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

[FE Incremental Coverage Report]

fail : 121 / 155 (78.06%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| 🔵 com/starrocks/qe/ShowExecutor.java | 1 | 3 | 33.33% | [749, 750] |
| 🔵 com/starrocks/qe/ShowMaterializedViewStatus.java | 75 | 106 | 70.75% | [332, 373, 374, 375, 775, 778, 779, 780, 781, 783, 784, 785, 786, 787, 788, 790, 791, 792, 793, 794, 795, 803, 804, 805, 806, 807, 816, 817, 818, 819, 820] |
| 🔵 com/starrocks/scheduler/persist/TaskRunStatus.java | 15 | 16 | 93.75% | [162] |
| 🔵 com/starrocks/sql/ast/ShowMaterializedViewsStmt.java | 8 | 8 | 100.00% | [] |
| 🔵 com/starrocks/sql/plan/PlanFragmentBuilder.java | 2 | 2 | 100.00% | [] |
| 🔵 com/starrocks/catalog/system/information/MaterializedViewsSystemTable.java | 15 | 15 | 100.00% | [] |
| 🔵 com/starrocks/catalog/system/information/TaskRunsSystemTable.java | 4 | 4 | 100.00% | [] |
| 🔵 com/starrocks/scheduler/TaskRun.java | 1 | 1 | 100.00% | [] |

[BE Incremental Coverage Report]

fail : 1 / 22 (04.55%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| 🔵 be/src/exec/schema_scanner/schema_materialized_views_scanner.cpp | 0 | 2 | 00.00% | [112, 115] |
| 🔵 be/src/exec/schema_scanner/schema_task_runs_scanner.cpp | 1 | 20 | 05.00% | [277, 279, 280, 281, 282, 283, 284, 286, 288, 289, 290, 291, 292, 293, 294, 296, 297, 298, 301] |
