Without DataFrames · Issue #1 · technocrat/ACS.jl · GitHub

Without DataFrames #1


Open
jariji opened this issue Apr 1, 2025 · 8 comments

Comments

@jariji
jariji commented Apr 1, 2025

I'd like an option to use this without DataFrames.jl, using other Tables.jl table types like StructArrays.jl or Vector{NamedTuple} or NamedTuple{Vector}. For example, get_acs(StructArray, ...).

Then I wouldn't have to depend on or import DataFrames.jl and could directly use whatever table I want instead of needing to convert to it. DataFrames.jl could be supported in an extension package but wouldn't need to be a direct dependency of ACS.jl.

@technocrat
Owner

I'll take a look.

@jariji
Author
jariji commented Apr 1, 2025

Just for a point of comparison, CSV.jl depends on the generic Tables.jl interface and lets users use whichever table structure they need.
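A minimal sketch of the CSV.jl-style "sink" pattern applied to this package, assuming a hypothetical `fetch_columns` stand-in for the Census API request (the real ACS.jl internals are not shown in this thread). The fetch step returns a plain column table (a NamedTuple of vectors), and the caller passes any callable that turns it into the table they want. It uses only Base, with three rows taken from the output posted in this thread.

```julia
# Hypothetical stand-in for the Census API request. Returns a column table:
# a NamedTuple of equal-length vectors.
fetch_columns() = (
    NAME        = ["Alabama", "Alaska", "Arizona"],
    B01003_001E = ["5028092", "734821", "7172282"],
    B19013_001E = ["59609", "86370", "72581"],
    state       = ["01", "02", "04"],
)

# Convert a column table into a vector of row NamedTuples.
rowtable(cols::NamedTuple) =
    [NamedTuple{keys(cols)}(map(c -> c[i], values(cols))) for i in eachindex(first(cols))]

# `sink` is any callable that accepts a column table, in the spirit of
# CSV.read(source, sink). This signature is an assumption, not ACS.jl's API.
get_acs(sink) = sink(fetch_columns())

cols = get_acs(identity)   # NamedTuple of vectors
rows = get_acs(rowtable)   # Vector of NamedTuples
```

With the relevant packages loaded, `get_acs(DataFrame)` or `get_acs(StructArray)` should work the same way, since both constructors accept a NamedTuple of vectors; that part is untested here.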

@technocrat
Owner

Like this?
`=== DataFrame Output ===
52×4 DataFrame
 Row │ B01003_001E  B19013_001E  NAME                  state
     │ String       String       String                String
─────┼────────────────────────────────────────────────────────
   1 │ 5028092      59609        Alabama               01
   2 │ 734821       86370        Alaska                02
   3 │ 7172282      72581        Arizona               04
   4 │ 3018669      56335        Arkansas              05
   5 │ 39356104     91905        California            06
   6 │ 5770790      87598        Colorado              08
   7 │ 3611317      90213        Connecticut           09
   8 │ 993635       79325        Delaware              10
   9 │ 670587       101722       District of Columbia  11
  10 │ 21634529     67917        Florida               12
  11 │ 10722325     71355        Georgia               13
  12 │ 1450589      94814        Hawaii                15
  13 │ 1854109      70214        Idaho                 16
  14 │ 12757634     78433        Illinois              17
  15 │ 6784403      67173        Indiana               18
  16 │ 3188836      70571        Iowa                  19
  17 │ 2935922      69747        Kansas                20
  18 │ 4502935      60183        Kentucky              21
  19 │ 4640546      57852        Louisiana             22
  20 │ 1366949      68251        Maine                 23
  21 │ 6161707      98461        Maryland              24
  22 │ 6984205      96505        Massachusetts         25
  23 │ 10057921     68505        Michigan              26
  24 │ 5695292      84313        Minnesota             27
  25 │ 2958846      52985        Mississippi           28
  26 │ 6154422      65920        Missouri              29
  27 │ 1091840      66341        Montana               30
  28 │ 1958939      71722        Nebraska              31
  29 │ 3104817      71646        Nevada                32
  30 │ 1379610      90845        New Hampshire         33
  31 │ 9249063      97126        New Jersey            34
  32 │ 2112463      58722        New Mexico            35
  33 │ 19994379     81386        New York              36
  34 │ 10470214     66186        North Carolina        37
  35 │ 776874       73959        North Dakota          38
  36 │ 11774683     66990        Ohio                  39
  37 │ 3970497      61364        Oklahoma              40
  38 │ 4229374      76632        Oregon                41
  39 │ 12989208     73170        Pennsylvania          42
  40 │ 1094250      81370        Rhode Island          44
  41 │ 5142750      63623        South Carolina        45
  42 │ 890342       69457        South Dakota          46
  43 │ 6923772      64035        Tennessee             47
  44 │ 29243342     73035        Texas                 48
  45 │ 3283809      86833        Utah                  49
  46 │ 643816       74014        Vermont               50
  47 │ 8624511      87249        Virginia              51
  48 │ 7688549      90325        Washington            53
  49 │ 1792967      55217        West Virginia         54
  50 │ 5882128      72458        Wisconsin             55
  51 │ 577929       72495        Wyoming               56
  52 │ 3272382      24002        Puerto Rico           72

=== StructArray Output ===
[:B19013_001E => ["59609", "86370", "72581", "56335", "91905", "87598", "90213", "79325", "101722", "67917", "71355", "94814", "70214", "78433", "67173", "70571", "69747", "60183", "57852", "68251", "98461", "96505", "68505", "84313", "52985", "65920", "66341", "71722", "71646", "90845", "97126", "58722", "81386", "66186", "73959", "66990", "61364", "76632", "73170", "81370", "63623", "69457", "64035", "73035", "86833", "74014", "87249", "90325", "55217", "72458", "72495", "24002"], :state => ["01", "02", "04", "05", "06", "08", "09", "10", "11", "12", "13", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41", "42", "44", "45", "46", "47", "48", "49", "50", "51", "53", "54", "55", "56", "72"], :B01003_001E => ["5028092", "734821", "7172282", "3018669", "39356104", "5770790", "3611317", "993635", "670587", "21634529", "10722325", "1450589", "1854109", "12757634", "6784403", "3188836", "2935922", "4502935", "4640546", "1366949", "6161707", "6984205", "10057921", "5695292", "2958846", "6154422", "1091840", "1958939", "3104817", "1379610", "9249063", "2112463", "19994379", "10470214", "776874", "11774683", "3970497", "4229374", "12989208", "1094250", "5142750", "890342", "6923772", "29243342", "3283809", "643816", "8624511", "7688549", "1792967", "5882128", "577929", "3272382"], :NAME => ["Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado", "Connecticut", "Delaware", "District of Columbia", "Florida", "Georgia", "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana", "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota", "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey", "New Mexico", "New York", "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon", "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota", "Tennessee", "Texas", "Utah", "Vermont", 
"Virginia", "Washington", "West Virginia", "Wisconsin", "Wyoming", "Puerto Rico"]]

=== NamedTuples Output ===
NamedTuple{(:NAME, :B01003_001E, :B19013_001E, :state), NTuple{4, String}}[(NAME = "Alabama", B01003_001E = "5028092", B19013_001E = "59609", state = "01"), (NAME = "Alaska", B01003_001E = "734821", B19013_001E = "86370", state = "02"), (NAME = "Arizona", B01003_001E = "7172282", B19013_001E = "72581", state = "04"), (NAME = "Arkansas", B01003_001E = "3018669", B19013_001E = "56335", state = "05"), (NAME = "California", B01003_001E = "39356104", B19013_001E = "91905", state = "06"), (NAME = "Colorado", B01003_001E = "5770790", B19013_001E = "87598", state = "08"), (NAME = "Connecticut", B01003_001E = "3611317", B19013_001E = "90213", state = "09"), (NAME = "Delaware", B01003_001E = "993635", B19013_001E = "79325", state = "10"), (NAME = "District of Columbia", B01003_001E = "670587", B19013_001E = "101722", state = "11"), (NAME = "Florida", B01003_001E = "21634529", B19013_001E = "67917", state = "12"), (NAME = "Georgia", B01003_001E = "10722325", B19013_001E = "71355", state = "13"), (NAME = "Hawaii", B01003_001E = "1450589", B19013_001E = "94814", state = "15"), (NAME = "Idaho", B01003_001E = "1854109", B19013_001E = "70214", state = "16"), (NAME = "Illinois", B01003_001E = "12757634", B19013_001E = "78433", state = "17"), (NAME = "Indiana", B01003_001E = "6784403", B19013_001E = "67173", state = "18"), (NAME = "Iowa", B01003_001E = "3188836", B19013_001E = "70571", state = "19"), (NAME = "Kansas", B01003_001E = "2935922", B19013_001E = "69747", state = "20"), (NAME = "Kentucky", B01003_001E = "4502935", B19013_001E = "60183", state = "21"), (NAME = "Louisiana", B01003_001E = "4640546", B19013_001E = "57852", state = "22"), (NAME = "Maine", B01003_001E = "1366949", B19013_001E = "68251", state = "23"), (NAME = "Maryland", B01003_001E = "6161707", B19013_001E = "98461", state = "24"), (NAME = "Massachusetts", B01003_001E = "6984205", B19013_001E = "96505", state = "25"), (NAME = "Michigan", B01003_001E = "10057921", B19013_001E = "68505", state = "26"), (NAME = 
"Minnesota", B01003_001E = "5695292", B19013_001E = "84313", state = "27"), (NAME = "Mississippi", B01003_001E = "2958846", B19013_001E = "52985", state = "28"), (NAME = "Missouri", B01003_001E = "6154422", B19013_001E = "65920", state = "29"), (NAME = "Montana", B01003_001E = "1091840", B19013_001E = "66341", state = "30"), (NAME = "Nebraska", B01003_001E = "1958939", B19013_001E = "71722", state = "31"), (NAME = "Nevada", B01003_001E = "3104817", B19013_001E = "71646", state = "32"), (NAME = "New Hampshire", B01003_001E = "1379610", B19013_001E = "90845", state = "33"), (NAME = "New Jersey", B01003_001E = "9249063", B19013_001E = "97126", state = "34"), (NAME = "New Mexico", B01003_001E = "2112463", B19013_001E = "58722", state = "35"), (NAME = "New York", B01003_001E = "19994379", B19013_001E = "81386", state = "36"), (NAME = "North Carolina", B01003_001E = "10470214", B19013_001E = "66186", state = "37"), (NAME = "North Dakota", B01003_001E = "776874", B19013_001E = "73959", state = "38"), (NAME = "Ohio", B01003_001E = "11774683", B19013_001E = "66990", state = "39"), (NAME = "Oklahoma", B01003_001E = "3970497", B19013_001E = "61364", state = "40"), (NAME = "Oregon", B01003_001E = "4229374", B19013_001E = "76632", state = "41"), (NAME = "Pennsylvania", B01003_001E = "12989208", B19013_001E = "73170", state = "42"), (NAME = "Rhode Island", B01003_001E = "1094250", B19013_001E = "81370", state = "44"), (NAME = "South Carolina", B01003_001E = "5142750", B19013_001E = "63623", state = "45"), (NAME = "South Dakota", B01003_001E = "890342", B19013_001E = "69457", state = "46"), (NAME = "Tennessee", B01003_001E = "6923772", B19013_001E = "64035", state = "47"), (NAME = "Texas", B01003_001E = "29243342", B19013_001E = "73035", state = "48"), (NAME = "Utah", B01003_001E = "3283809", B19013_001E = "86833", state = "49"), (NAME = "Vermont", B01003_001E = "643816", B19013_001E = "74014", state = "50"), (NAME = "Virginia", B01003_001E = "8624511", B19013_001E = "87249", 
state = "51"), (NAME = "Washington", B01003_001E = "7688549", B19013_001E = "90325", state = "53"), (NAME = "West Virginia", B01003_001E = "1792967", B19013_001E = "55217", state = "54"), (NAME = "Wisconsin", B01003_001E = "5882128", B19013_001E = "72458", state = "55"), (NAME = "Wyoming", B01003_001E = "577929", B19013_001E = "72495", state = "56"), (NAME = "Puerto Rico", B01003_001E = "3272382", B19013_001E = "24002", state = "72")]

=== Columnar Output ===
(B19013_001E = ["59609", "86370", "72581", "56335", "91905", "87598", "90213", "79325", "101722", "67917", "71355", "94814", "70214", "78433", "67173", "70571", "69747", "60183", "57852", "68251", "98461", "96505", "68505", "84313", "52985", "65920", "66341", "71722", "71646", "90845", "97126", "58722", "81386", "66186", "73959", "66990", "61364", "76632", "73170", "81370", "63623", "69457", "64035", "73035", "86833", "74014", "87249", "90325", "55217", "72458", "72495", "24002"], state = ["01", "02", "04", "05", "06", "08", "09", "10", "11", "12", "13", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41", "42", "44", "45", "46", "47", "48", "49", "50", "51", "53", "54", "55", "56", "72"], B01003_001E = ["5028092", "734821", "7172282", "3018669", "39356104", "5770790", "3611317", "993635", "670587", "21634529", "10722325", "1450589", "1854109", "12757634", "6784403", "3188836", "2935922", "4502935", "4640546", "1366949", "6161707", "6984205", "10057921", "5695292", "2958846", "6154422", "1091840", "1958939", "3104817", "1379610", "9249063", "2112463", "19994379", "10470214", "776874", "11774683", "3970497", "4229374", "12989208", "1094250", "5142750", "890342", "6923772", "29243342", "3283809", "643816", "8624511", "7688549", "1792967", "5882128", "577929", "3272382"], NAME = ["Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado", "Connecticut", "Delaware", "District of Columbia", "Florida", "Georgia", "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana", "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota", "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey", "New Mexico", "New York", "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon", "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota", "Tennessee", "Texas", "Utah", "Vermont", "Virginia", 
"Washington", "West Virginia", "Wisconsin", "Wyoming", "Puerto Rico"])
(base) ro@Richards-MacBook-Pro ACS.jl % `

@jariji
Author
jariji commented Apr 1, 2025

Yeah.

@jariji
Author
jariji commented Apr 1, 2025

get_acs(Type, args...) would take a table type (DataFrame, StructArray, etc.) and instantiate that type with the table.

@technocrat
Owner

I'm thinking like this
get_acs5(;
    variables::Vector{String},
    geography::String,
    year::Int = 2022,
    state::Union{String,Nothing} = nothing,
    county::Union{String,Nothing} = nothing,
    output_type::Symbol = :dataframe
)
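For illustration, a rough sketch of how a Symbol-valued `output_type` keyword would likely be handled inside the function. The API call is replaced by a hard-coded stub, the symbols `:columntable` and `:namedtuples` are made-up names, and the default is changed from `:dataframe` so the sketch runs without DataFrames.jl; none of this is taken from the actual ACS.jl source.

```julia
# Hypothetical sketch of the keyword/Symbol design; the fetch is stubbed out.
function get_acs5(; variables::Vector{String},
                    geography::String,
                    year::Int = 2022,
                    state::Union{String,Nothing} = nothing,
                    county::Union{String,Nothing} = nothing,
                    output_type::Symbol = :columntable)
    # Stub standing in for the real request and response parsing.
    cols = (NAME = ["Alabama", "Alaska"], state = ["01", "02"])

    # Every supported output format needs its own branch here.
    if output_type === :columntable
        return cols
    elseif output_type === :namedtuples
        return [(NAME = n, state = s) for (n, s) in zip(cols.NAME, cols.state)]
    else
        throw(ArgumentError("unsupported output_type: $output_type"))
    end
end

rows = get_acs5(variables = ["B01003_001E"], geography = "state",
                output_type = :namedtuples)
```

Any format not listed in the branch raises an error, so supporting a new table type means editing this function.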

@jariji
Author
jariji commented Apr 2, 2025

My leaning would be for a positional argument rather than a keyword, and a type rather than a symbol, so it's an extensible interface rather than an internal pattern match.
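A small sketch of what "extensible" could mean here, under assumptions: `fetch_columns`, `materialize`, and `StateLookup` are hypothetical names invented for this example, and the API call is stubbed. The output type is a positional argument and each type gets its own method, so new table types are added by defining a method rather than by editing `get_acs`.

```julia
# Hypothetical stub for the API call; returns a NamedTuple of vectors.
fetch_columns() = (NAME = ["Alabama", "Alaska"], state = ["01", "02"])

# One method per output type; get_acs itself never branches on the type.
materialize(::Type{<:NamedTuple}, cols) = cols
materialize(::Type{<:Vector}, cols) =
    [NamedTuple{keys(cols)}(map(c -> c[i], values(cols))) for i in eachindex(first(cols))]

get_acs(::Type{T}) where {T} = materialize(T, fetch_columns())

# A downstream user can support their own type without touching get_acs.
struct StateLookup
    names::Dict{String,String}   # FIPS code => state name
end
materialize(::Type{StateLookup}, cols) = StateLookup(Dict(zip(cols.state, cols.NAME)))

lookup = get_acs(StateLookup)
```

A package extension could add DataFrames.jl support the same way, for example with a method along the lines of `materialize(::Type{DataFrame}, cols) = DataFrame(cols)`, which keeps DataFrames.jl out of the direct dependencies.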

@technocrat
Owner
technocrat commented Apr 2, 2025 via email
