Some data sets, like old Amazon ratings, include fully-duplicate entries. These can be supported as a stepping stone to #770.
The DataSetBuilder.add_interactions method should support de-duplicating interactions when they are added to the dataset.
Proposed interface:
Deprecate the allow_repeats option (to both add_interactions and add_relationships, and the corresponding relationship class method).
Add a new option repeats, which accepts one of four values: allow, forbid, remove, and remove-duplicates.
The options work as follows:
allow allows repeated relationship records, including full duplicates. The schema's repeat field is set to either ALLOWED or PRESENT, as appropriate.
forbid forbids repeated relationship records, raising an error if any are present. The schema's repeat field is set to FORBIDDEN.
remove removes repeated relationship records, that is, records that have the same set of entity IDs. The schema's repeat field is set to FORBIDDEN.
remove-duplicates removes duplicate relationship records, that is, records that have the same entity IDs and the same field values, so the rows in the input table are fully duplicated. The schema's repeat field is set to ALLOWED or PRESENT, as appropriate.
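To make the four modes concrete, here is a minimal sketch of the proposed semantics on a pandas data frame. It is not the library's implementation: the helper name resolve_repeats, its signature, and the plain-string schema values are all hypothetical. It also makes two assumptions the proposal does not state: PRESENT is reported when repeats actually remain in the data (ALLOWED otherwise), and remove keeps the first record of each repeated group.

```python
import pandas as pd

# Hypothetical values for the proposed `repeats` option.
REPEAT_MODES = ("allow", "forbid", "remove", "remove-duplicates")


def resolve_repeats(df, id_cols, repeats="forbid"):
    """Apply one of the proposed repeat modes to a relationship table.

    Hypothetical helper for illustration only. Returns the (possibly
    filtered) frame and the value the schema's repeat field would take.
    """
    if repeats not in REPEAT_MODES:
        raise ValueError(f"unknown repeats mode: {repeats!r}")

    if repeats == "remove-duplicates":
        # Drop rows that match on every column (IDs and fields).
        df = df.drop_duplicates()
    elif repeats == "remove":
        # Drop rows that match on the entity ID columns alone.
        # Assumption: keep the first record of each group.
        df = df.drop_duplicates(subset=id_cols)

    # Check whether any repeated ID combinations remain.
    has_repeats = bool(df.duplicated(subset=id_cols).any())

    if repeats == "forbid":
        if has_repeats:
            raise ValueError("repeated relationship records are forbidden")
        return df, "FORBIDDEN"
    if repeats == "remove":
        return df, "FORBIDDEN"

    # allow / remove-duplicates. Assumption: PRESENT means repeats remain.
    return df, ("PRESENT" if has_repeats else "ALLOWED")


ratings = pd.DataFrame(
    {
        "user_id": [1, 1, 1, 2],
        "item_id": [10, 10, 10, 20],
        "rating": [5.0, 5.0, 3.0, 4.0],
    }
)
deduped, state = resolve_repeats(
    ratings, ["user_id", "item_id"], repeats="remove-duplicates"
)
```

In this example the first two rows are full duplicates and the third repeats the same user and item with a different rating, so remove-duplicates drops only one row and repeats are still present afterwards, while remove would leave a single row for that pair.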