handle no datum id by rguo123 · Pull Request #380 · nomic-ai/nomic · GitHub

handle no datum id #380

Draft: rguo123 wants to merge 1 commit into base: main


rguo123 (Collaborator) commented Mar 14, 2025

Important

Adds exception handling and fallback logic for missing 'datum_id' in nomic/data_operations.py and updates version in setup.py.

  • Behavior:
    • Adds exception handling in _load_duplicates(), _load_topics(), tb(), df(), get_datums_in_tag(), and _load_data() in nomic/data_operations.py to handle missing .datum_id.feather files.
    • Introduces fallback logic to use alternative sidecar files when .datum_id.feather is missing.
  • Setup:
    • Updates version in setup.py from 3.4.1 to 3.4.2.

This description was created by Ellipsis for affdc46. It will automatically update as commits are pushed.

rguo123 (Collaborator, Author) commented Mar 14, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

rguo123 marked this pull request as ready for review March 14, 2025 01:41
rguo123 force-pushed the 03-13-handle_no_datum_id branch from b847ff1 to affdc46 March 14, 2025 01:42
tb = feather.read_table(
    self.projection.tile_destination / Path(key).with_suffix(".datum_id.feather"), memory_map=True
)
try:
Contributor

DRY: Repeated try/except blocks for reading datum id feather files. Consider extracting a helper to avoid duplicating fallback logic.
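
One way the suggested helper could look is sketched below. This is an illustration only, not code from the PR: `read_first_available`, its parameters, and the `.id.feather` fallback suffix are hypothetical names, and the reader is passed in so the sketch runs without pyarrow. In the PR's context the reader would presumably be something like `lambda p: feather.read_table(p, memory_map=True)`.

```python
from pathlib import Path
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def read_first_available(
    base: Path,
    key: str,
    suffixes: Iterable[str],
    reader: Callable[[Path], T],
) -> T:
    """Try each sidecar suffix in order and return the first file that loads.

    Hypothetical helper: `base` stands in for the tile destination directory,
    `key` for a tile key, and `reader` for whatever loads one sidecar file.
    """
    errors = []
    for suffix in suffixes:
        path = base / Path(key).with_suffix(suffix)
        try:
            return reader(path)
        except OSError as exc:  # includes FileNotFoundError
            errors.append(f"{path}: {exc}")
    # Every candidate failed: report all attempted paths in one error.
    raise FileNotFoundError("No sidecar could be read; tried: " + "; ".join(errors))
```

Each of the repeated try/except blocks could then collapse to a single call that lists the preferred suffix first and the fallback second, assuming a missing file surfaces as an `OSError` from the reader.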

@@ -675,7 +721,7 @@ def __init__(self, projection: "AtlasProjection", auto_cleanup: Optional[bool] =
     try:
         self.projection._download_sidecar("datum_id")
     except Exception:
-        raise ValueError("Failed to fetch datum ids which is required to load tags.")
+        id_sidecar = self.projection._get_sidecar_from_field(self.id_field)
Contributor

Review fallback behavior in AtlasMapTags.__init__: In the exception block, only id_sidecar is assigned without using it to download the sidecar. Should this mirror the approach used elsewhere?
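
A fallback that actually downloads the alternative sidecar might look like the sketch below. It is an assumption about the intended behavior, not the PR's implementation: `ensure_id_sidecar` is a hypothetical function, and the two callables stand in for `self.projection._download_sidecar` and a lookup such as `self.projection._get_sidecar_from_field(self.id_field)`.

```python
from typing import Callable


def ensure_id_sidecar(
    download: Callable[[str], None],
    resolve_fallback: Callable[[], str],
    primary: str = "datum_id",
) -> str:
    """Download the primary id sidecar, or fall back to an alternative one.

    Returns the name of the sidecar that was downloaded successfully.
    """
    try:
        download(primary)
        return primary
    except Exception:
        # Primary sidecar is unavailable: resolve the fallback name
        # and download it as well, instead of only assigning it.
        fallback = resolve_fallback()
        try:
            download(fallback)
        except Exception as fallback_error:
            raise ValueError(
                f"Failed to fetch ids from '{primary}' or fallback '{fallback}'."
            ) from fallback_error
        return fallback
```

The `ValueError` from the original code is kept for the case where both downloads fail, so callers that already handle it would see the same exception type.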

rguo123 marked this pull request as draft April 24, 2025 20:47