Fix: Remove empty documents in Create Corpus widget (#1104) by leskovecg · Pull Request #1111 · biolab/orange3-text · GitHub

Fix: Remove empty documents in Create Corpus widget (#1104) #1111

Closed
leskovecg wants to merge 3 commits from the fix-empty-documents branch

Conversation

leskovecg
Collaborator

Issue

Closes #1104

Create Corpus previously sent out documents with empty text, which could cause issues downstream. With this fix, documents whose text field is empty are filtered out when the Corpus is created.

Description of changes

  • In the commit method, added a filter to remove documents with empty text before creating the Corpus object.
  • If all documents are empty, the widget outputs None.
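The filtering described above can be illustrated with a minimal standalone sketch. This is not the code from the PR: the function name `drop_empty_documents` and the `(title, text)` tuple shape are hypothetical, and treating whitespace-only text as empty is an assumption, since the description only says "empty text".

```python
from typing import List, Optional, Tuple

# Hypothetical document shape for illustration: (title, text).
Document = Tuple[str, str]


def drop_empty_documents(documents: List[Document]) -> Optional[List[Document]]:
    """Keep only documents with non-empty text; return None if none remain.

    Assumption: whitespace-only text counts as empty.
    """
    kept = [(title, text) for title, text in documents if text.strip()]
    # Mirror the described behaviour: output None when every document is empty.
    return kept or None


if __name__ == "__main__":
    docs = [("Intro", "Some text"), ("Blank", ""), ("Spaces", "   ")]
    print(drop_empty_documents(docs))
    print(drop_empty_documents([("Blank", "")]))
```

In the widget itself, the filtered list would presumably be used to build the Corpus in the commit method, with None sent to the output when nothing is left.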

Includes

  • Code changes
  • Tests
  • Documentation

Tested manually – works as expected

@janezd janezd marked this pull request as draft April 6, 2025 15:05
@leskovecg leskovecg closed this May 15, 2025
@leskovecg leskovecg deleted the fix-empty-documents branch May 15, 2025 16:36
Development

Successfully merging this pull request may close these issues.

Create Corpus: do not send out empty data