8000 [Draft] Error Policy for import data with stash-and-continue for ingestion errors by makalaaneesh · Pull Request #2639 · yugabyte/yb-voyager · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

[Draft] Error Policy for import data with stash-and-continue for ingestion errors #2639

New issue 8000

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

makalaaneesh
Copy link
Collaborator
@makalaaneesh makalaaneesh commented May 20, 2025

Describe the changes in this pull request

https://docs.google.com/document/d/1KqQbIoQqQmlxIHql5Zd8aOyw8bP4j82T7Inw7tYMomg/edit?tab=t.0#heading=h.4p1zl0z1krwh

Overview of change:

  • Introduce error policies.
  • Introduce import data error handlers (one corresponding to each policy that we support)
  • ImportDataStashAndContinueHandler.HandleBatchIngestionError does the following
    • mark batch as errored out (.E suffix)
    • add error message to the batch file
  • All errored batches will remain under import-data-state folder. A new folder under data/errors/<table-name>/<file-name> will contain symlinks to the errored batch files. This is to give a single view only of the errored files. In future this folder will also contain the processing errors.

Overall Task list:

  • basic framework
    • base interfaces
    • errors during ingestion (when ingesting batch to YB)
    • errors during processing (reading and transforming rows before batch creation )
    • CLI and config flags
  • errors during ingestion
    • basic abort policy
      • basic test for regression
      • tests
    • stash and continue policy
      • mark batch as error
      • resumption
      • proper file structure with symlinks to batch files
      • add error message to batch file
      • move error message to the beginning of the batch file
      • tests
        • basic test on importBatch - error is not thrown.
        • multiple batches - error in single batch (first, middle, last), resumption with batch error,
        • multiple tasks - task completed with errors should not be re-picked
        • error message included in batch.
  • errors during processing
    • basic abort policy
    • stash and continue policy
    • large rows?
    • csv errors? (csvmaxreaderbuffer?)
  • status commands
    • import data status
    • get data-migration -report
  • end-migration
    • back up error files as well
  • integrate with -on-primary-key-conflit
    • change flag option from ERROR -> ERROR_POLICY
    • handle partial errors

Describe if there are any user-facing changes

How was this pull request tested?

Does your PR have changes in callhome/yugabyted payloads? If so, is the payload version incremented?

Does your PR have changes that can cause upgrade issues?

Component Breaking changes?
MetaDB Yes/No
Name registry json Yes/No
Data File Descriptor Json Yes/No
Export Snapshot Status Json Yes/No
Import Data State Yes/No
Export Status Json Yes/No
Data .sql files of tables Yes/No
Export and import data queue Yes/No
Schema Dump Yes/No
AssessmentDB Yes/No
Sizing DB Yes/No
Migration Assessment Report Json Yes/No
Callhome Json Yes/No
YugabyteD Tables Yes/No
TargetDB Metadata Tables Yes/No

@@ -537,6 +540,12 @@ func importData(importFileTasks []*ImportFileTask) {
}
}

exportDirDataDir := filepath.Join(exportDir, "data")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: move this to a common func

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In import data file, data dir is a configurable argument so its not inside export-dir

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, but that's not a concern, right? we always create a data/ folder. in import-data-file, it will be empty, and that's okay?

limitations under the License.
*/

package errorpolicy
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably an overkill of a package. Thinking about this in a follow-up PR.
Perhaps just put it inside constants?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is not a heavy component that has a lot of functionality, as the actual functionality is in Import handlers, so it doesn't feel like it should be a separate package. I think some constants and functions in utils should suffice for this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, I was thinking of two options:

  • put it in constants package. (or create a new enums package)
  • put it in importdata package.

Going with importdata package as it's only going to be used in that package.

Copy link
Contributor
@priyanshi-yb priyanshi-yb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't reviewed it thoroughly, some comments on overall structure and flow.

return &ImportDataAbortHandler{}
}

func (handler *ImportDataAbortHandler) ShouldAbort() bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why will we have a check Abort in all handlers, that's the type of Handler right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed,

  • The alternative would be to type assert to the concrete type before we can check. ShouldAbort() feels slightly cleaner and it also makes sense to be part of the handler interface.
  • Theoretically, you could have more than one handlers wherein we might need to abort (for example, a SendSlackNotificationAndAbort handler ). Therefore, a func ShouldAbort() makes sense.

if err2 != nil {
utils.ErrExit("handling error for batch: %q into %s: %s", batch.FilePath, batch.TableNameTup, err2)
}
if fti.errorHandler.ShouldAbort() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are having this function to check if it's a AbortHandler, should we check this first thing here and not do any handling etc..?

Copy link
Collaborator Author
@makalaaneesh makalaaneesh May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although theoretically we can have handlers where we might need to handle and then abort, I'll simplify this flow for now.
Makes sense to handle only if we are not aborting. I will clean this up

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

< 67ED input type="hidden" name="authenticity_token" value="qwQl5l3CrEA08SLfb4EZ0JKk6ZH3RQ-R-ol-Sbvn5qcYqxRzOT8LEWVO0Sf9-rZONvL5JSH4yOMGnTwkA4SaAg" autocomplete="off" />

done ^

limitations under the License.
*/

package errorpolicy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is not a heavy component that has a lot of functionality, as the actual functionality is in Import handlers, so it doesn't feel like it should be a separate package. I think some constants and functions in utils should suffice for this.

for _, batch := range batches {
if batch.IsDone() {
doneCount++
} else if batch.IsInterrupted() {
interruptedCount++
} else if batch.IsErrored() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if someone fixes the batch and it gets ingested properly, will it then change the status, etc, for that scenario?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the workflow for dealing with stashed errors. That is out of scope for this PR (and overall task)

@@ -537,6 +540,12 @@ func importData(importFileTasks []*ImportFileTask) {
}
}

exportDirDataDir := filepath.Join(exportDir, "data")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In import data file, data dir is a configurable argument so its not inside export-dir

}

func (handler *ImportDataStashAndContinueHandler) getTaskErrorsFolderPath(batch ErroredBatch, importFileTaskId string) string {
return filepath.Join(handler.dataDir, "errors", batch.GetTableName().ForMinOutput(), importFileTaskId)
Copy link
Contributor
@priyanshi-yb priyanshi-yb May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shouldn't use the data dir for this as datadir is user's directory in case of import data file (local option) and is S3/GCS/etc.. buckets in cloud option.So, we should use import data state which is voyager's metadir.

Copy link
Collaborator Author
@makalaaneesh makalaaneesh May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this, but

  • the metainfo directory is an internal directory for voyager to maintain some state and metadata. i don't think it's a good idea to expose that to the user. We should be free to change that as we please. (i know we already do expose that, but only when there is an error).
  • It's cleaner if the directory that we point the user to for the errors, is dedicated and at the top-level (data/errors as opposed to under /metainfo/import_data_state/target_db_importer/ which seems pretty internal.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you see any concerns if we put it under data/errors ? As i mentioned in the other comment, in case of import-data-file, that directory will be empty, but i don't see that necessarily as a problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0