Expose similarity matching item pairs in Python library (aka crosswalk table)

Task: make a function called generate_crosswalk_table(all_questions, similarity, threshold) that takes the output of match_instruments and returns the pairs that match above a threshold.

Description

The web UI allows users to see the matching item pairs above a given threshold.

Can we make the Python library also return the matching pairs above a threshold? This is called the crosswalk table.

A crosswalk table contains the same information that currently comes back in the similarity matrix, just in a different format.

It is a long-format data frame that shows each matching pair of questions above a certain threshold, along with their respective IDs, question texts, and match scores. Here's an example structure:

# Example structure of crosswalk table DataFrame:

# tibble [n × 6]

# $ pair_name      : chr  # Name of the survey pair

# $ question1_no   : chr  # ID of question from first survey

# $ question1_text : chr  # Text of question from first survey

# $ question2_no   : chr  # ID of question from second survey

# $ question2_text : chr  # Text of question from second survey

# $ match_score    : num  # Similarity score between the questions
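A minimal sketch of what the requested function could look like, as a starting point for discussion rather than the final implementation. It assumes all_questions is a list of objects exposing instrument_name, question_no and question_text (the Question class below is a hypothetical stand-in, not the library's own type), that similarity is a square matrix indexed in the same order as all_questions, and that only pairs from different instruments are wanted. The "A vs B" format for pair_name is also an assumption. To stay dependency-free it returns plain dict rows; passing them to pandas.DataFrame would give the long-format data frame described above.

```python
from dataclasses import dataclass


@dataclass
class Question:
    # Hypothetical stand-in for the question objects returned by match_instruments.
    instrument_name: str
    question_no: str
    question_text: str


def generate_crosswalk_table(all_questions, similarity, threshold):
    """Return one dict per cross-instrument question pair scoring above threshold."""
    rows = []
    n = len(all_questions)
    for i in range(n):
        # Walk the upper triangle only, so each pair is reported once.
        for j in range(i + 1, n):
            q1, q2 = all_questions[i], all_questions[j]
            # Assumption: pairs within the same instrument are not of interest.
            if q1.instrument_name == q2.instrument_name:
                continue
            score = float(similarity[i][j])
            # Strictly "above" the threshold, following the wording of the issue.
            if score > threshold:
                rows.append({
                    "pair_name": f"{q1.instrument_name} vs {q2.instrument_name}",
                    "question1_no": q1.question_no,
                    "question1_text": q1.question_text,
                    "question2_no": q2.question_no,
                    "question2_text": q2.question_text,
                    "match_score": score,
                })
    return rows


# Illustrative data only: two made-up surveys with two questions each.
questions = [
    Question("Survey A", "A1", "I feel nervous"),
    Question("Survey A", "A2", "I sleep badly"),
    Question("Survey B", "B1", "I feel anxious"),
    Question("Survey B", "B2", "I have trouble sleeping"),
]
sim = [
    [1.0, 0.2, 0.85, 0.1],
    [0.2, 1.0, 0.15, 0.7],
    [0.85, 0.15, 1.0, 0.3],
    [0.1, 0.7, 0.3, 1.0],
]
table = generate_crosswalk_table(questions, sim, threshold=0.6)
for row in table:
    print(row["pair_name"], row["question1_no"], row["question2_no"], row["match_score"])
```

Indexing with similarity[i][j] works for both nested lists and 2-D NumPy arrays, so the sketch does not depend on the exact matrix type that match_instruments returns.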

See also equivalent issue in R: harmonydata/harmony_r#4
