tabyl sorts the table column by the integer 'name' not the number #438

Closed
daaronr opened this issue Mar 25, 2021 · 1 comment · Fixed by #439

daaronr commented Mar 25, 2021

Bug

tabyl() (with 2 arguments) sorts the columns of the table by the integer's 'name' (its character string), not by its numeric value. I want the opposite, of course.


Brief description of the problem

library(janitor)
library(dplyr)

df1 <- data.frame(var1 = c(1:10),
                  var2 = c(1:10))

df1 %>% tabyl(var1, var2)

Yields:

 var1 1 10 2 3 4 5 6 7 8 9
    1 1  0 0 0 0 0 0 0 0 0
    2 0  0 1 0 0 0 0 0 0 0
    3 0  0 0 1 0 0 0 0 0 0
    4 0  0 0 0 1 0 0 0 0 0
    5 0  0 0 0 0 1 0 0 0 0
    6 0  0 0 0 0 0 1 0 0 0
    7 0  0 0 0 0 0 0 1 0 0
    8 0  0 0 0 0 0 0 0 1 0
    9 0  0 0 0 0 0 0 0 0 1
   10 0  1 0 0 0 0 0 0 0 0

Like a prison inmate, I'm an integer, sort me by a number, not a name please!
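
The column order in the output above (1, 10, 2, ...) is what you get when integers are compared as character strings rather than as numbers. A minimal base R sketch of that difference, not taken from janitor's code:

```r
# Integers compared as numbers vs. as character strings (base R only).
x <- 1:10

numeric_order   <- sort(x)                # sorted by numeric value
character_order <- sort(as.character(x))  # sorted by character 'name'

print(numeric_order)
print(character_order)  # "10" lands right after "1", as in the table above
```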

sfirke (Owner) commented Mar 26, 2021

Agreed this is not the desired behavior, thanks for reporting. And it also extends to the sorting of 3-way tabyls:

library(janitor)
library(dplyr)

data.frame(var1 = 1:10, var2 = 1:10, var3 = 1:10) %>%
  mutate(var2 = ordered(var2, levels = var2)) %>%
  tabyl(var1, var2, var3)

The list of tabyls goes 1, 10, etc.

The fix

We can take advantage of the fact that factors already get sorted correctly, and just make the numerics into factors upstream of that.

For the 2-way issue, I think this could be fixed by adding something like if (is.numeric(tabl[[2]])) { tabl[[2]] <- ordered(tabl[[2]], levels = tabl[[2]]) } at the top of the block here. Then it would be treated as an ordered factor, taking advantage of that existing code from that point on.
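
To illustrate the idea of turning a numeric column into an ordered factor, here is a rough base R sketch. The vector x is a hypothetical stand-in for the tabulated column, and the use of sort(unique(x)) for the levels is an assumption of this sketch rather than the wording above: factor levels must be unique, so repeated values need to be de-duplicated first.

```r
# Hypothetical stand-in for the column being tabulated; values repeat on purpose.
x <- c(10, 2, 1, 2, 10, 3)

# Assumption of this sketch: levels are the sorted unique values,
# because factor levels must not contain duplicates.
x_ordered <- ordered(x, levels = sort(unique(x)))

# Counts keyed by the ordered factor keep numeric order.
counts_ordered <- table(x_ordered)
print(names(counts_ordered))

# Counts keyed by the character version fall back to string order.
counts_chr <- table(as.character(x))
print(names(counts_chr))
```

Once the column is an ordered factor, anything downstream that respects factor levels sees the values in numeric order.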

For the 3-way issue, I think that same block could be added at https://github.com/sfirke/janitor/blob/master/R/tabyl.R#L232, but referring to dat[[3]].
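
The same effect on the order of a list of tables can be sketched with base split(). This is only an illustration of the ordering behavior; the thread does not show how janitor builds the list internally, and df here is a made-up example.

```r
# Made-up data mirroring the 3-way example in this thread.
df <- data.frame(var1 = 1:10, var2 = 1:10, var3 = 1:10)

# Splitting on the character version of var3 orders the pieces as strings.
by_character <- split(df[, c("var1", "var2")], as.character(df$var3))
print(names(by_character))

# Splitting on an ordered factor keeps the pieces in numeric order.
var3_ordered <- ordered(df$var3, levels = sort(unique(df$var3)))
by_ordered <- split(df[, c("var1", "var2")], var3_ordered)
print(names(by_ordered))
```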

I marked this "good first issue" and am hopeful someone newer to R development could take a shot at it, especially with the pointers above. It will also require tests to make sure the fix works; I can assist there if needed.
