This project involves collecting drug test data from various online sources, transforming it, and applying criteria to identify harmful drug adulterants. The resulting alerts are displayed through a dashboard named RAPID (Real-time Alert Platform for Informed Decisions). RAPID aims to enhance public health by providing real-time alerts about potentially dangerous drug batches. This data-driven approach improves healthcare system preparedness, optimizes resource allocation, supports evidence-based policy development, and offers valuable insights into illicit drug trends for robust research. Data collection and transformation were done using Python, and the dashboard was created with Google Looker Studio.
List of prerequisites necessary for the project:
• Python 3.x
• Jupyter Notebook
• Pandas
• NumPy
• Selenium
Ensure you have the required datasets:
• Dashboard.csv
The Monthly_Update.ipynb notebook scrapes and processes drug testing data from the websites https://bccsu-drugsense.onrender.com/ and https://getyourdrugstested.com. The script performs the following steps:
- Import Libraries: Import Selenium for web scraping, Pandas for data manipulation, and other libraries for handling dates and regular expressions.
- Initialize Variables: Set up the current date, the start date for data scraping, the target URL, and the maximum number of pages to scrape.
- Set Up Selenium WebDriver: Configure the Selenium WebDriver and navigate to the target website.
- Data Scraping Loop: Scrape data from the website by iterating through the pages and extracting relevant information from the table rows. Store the scraped data in a list.
- Data Processing: Clean and structure the collected data using various transformation functions to prepare it for analysis.
- Export Data to CSV: Merge the scraped data with existing data, sort it by date, and export the processed data to a CSV file named Monthly_Update.csv.
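The processing and export steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names, the sample rows, and the helper names `clean_rows` and `merge_and_sort` are all assumptions made for the example, and the Selenium scraping loop is replaced by a hard-coded list.

```python
import io
import re

import pandas as pd

# Hypothetical rows, standing in for what the scraping loop collects.
scraped_rows = [
    ["2024-03-02", "Vancouver", "Fentanyl ", "fentanyl,  caffeine"],
    ["2024-01-15", "Victoria", "MDMA", "MDMA"],
]

COLUMNS = ["date", "city", "expected", "found"]  # assumed column names


def clean_rows(rows, start_date):
    """Build a tidy DataFrame and keep rows dated on or after start_date."""
    df = pd.DataFrame(rows, columns=COLUMNS)
    df["date"] = pd.to_datetime(df["date"], errors="coerce")
    for col in ["city", "expected", "found"]:
        # Collapse repeated whitespace and trim both ends.
        df[col] = df[col].map(lambda s: re.sub(r"\s+", " ", s).strip())
    df = df.dropna(subset=["date"])
    return df[df["date"] >= pd.Timestamp(start_date)]


def merge_and_sort(new_df, existing_df):
    """Append new rows to the existing ones and sort the result by date."""
    combined = pd.concat([existing_df, new_df], ignore_index=True)
    return combined.sort_values("date").reset_index(drop=True)


# Hypothetical previously collected data.
existing = pd.DataFrame(
    [[pd.Timestamp("2024-02-10"), "Vancouver", "Cocaine", "cocaine"]],
    columns=COLUMNS,
)

new = clean_rows(scraped_rows, start_date="2024-02-01")
monthly = merge_and_sort(new, existing)

# Written to an in-memory buffer here; the notebook writes Monthly_Update.csv.
buffer = io.StringIO()
monthly.to_csv(buffer, index=False)
```

In the sketch, the row dated before the start date is dropped, and the remaining rows end up in date order regardless of the order in which they were scraped.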
The Criteria.ipynb notebook processes the drug testing data to generate alerts based on specific criteria. The script performs the following steps:
- Import Necessary Libraries: Import libraries such as Pandas and NumPy for data manipulation.
- Read Data: Load the data from the Monthly_Update.csv file into a DataFrame.
- Define Stimulant Conditions and Generate Alerts: Define conditions for stimulants and apply them to the DataFrame to generate alerts.
- Define Depressant Conditions and Generate Alerts: Define conditions for depressants and apply them to the DataFrame to generate alerts.
- Merge with Current Dashboard Data: Merge the processed data with existing data from the Dashboard.csv file to update the dashboard.
- Export Data to CSV: Export the updated data to Dashboard.csv, ensuring it is sorted by date and contains no duplicates.
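One way to express the alert and merge steps above is with boolean conditions and `numpy.select`. The sketch below is only illustrative: the column names, the sample data, and especially the two rules are placeholders chosen for the example, not the criteria the project actually applies.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for Monthly_Update.csv.
monthly = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-03-02", "2024-03-01", "2024-03-03"]),
        "category": ["Depressant", "Stimulant", "Stimulant"],
        "found": ["fentanyl, xylazine", "cocaine, levamisole", "MDMA"],
    }
)

# Placeholder rules only: the adulterant names are illustrative,
# not the project's real criteria.
is_stimulant = monthly["category"].eq("Stimulant")
is_depressant = monthly["category"].eq("Depressant")
stimulant_hit = is_stimulant & monthly["found"].str.contains("levamisole", case=False)
depressant_hit = is_depressant & monthly["found"].str.contains("xylazine", case=False)

# The first matching condition decides the label; unmatched rows get the default.
monthly["alert"] = np.select(
    [stimulant_hit, depressant_hit],
    ["Stimulant alert", "Depressant alert"],
    default="No alert",
)

# Hypothetical existing dashboard data; its one row repeats a row of `monthly`.
dashboard = monthly.iloc[[1]].copy()

# Merge, drop exact duplicate rows, and sort by date before export.
updated = (
    pd.concat([dashboard, monthly], ignore_index=True)
    .drop_duplicates()
    .sort_values("date")
    .reset_index(drop=True)
)
```

Calling `updated.to_csv("Dashboard.csv", index=False)` afterwards would write the result in the form the export step describes.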
- Ensure you have the necessary dataset (Dashboard.csv) in the project directory.
- Open the Jupyter notebook file (Monthly_Update.ipynb) and set the initial dates.
- Run the code to gather and process the data.
- Add any external data to the Monthly_Update.csv file.
- Open the Jupyter notebook file (Criteria.ipynb) and run it to apply the criteria.
- The processed data (Dashboard.csv) will include an alert status based on the criteria.
- Create a dashboard using the dataset (Dashboard.csv).
You can view the interactive dashboard here.
List of key features included in the project:
- Data Collection and Transformation: The project collects drug test data from various online sources and transforms it using Python to ensure it is ready for analysis.
- Criteria-based Identification: Applies criteria to the collected data to identify harmful drug adulterants, enabling accurate and timely alerts.
- Interactive Dashboard: The RAPID dashboard, created using Google Looker Studio, provides an interactive and user-friendly interface to visualize alerts and trends.
- Data-Driven Insights: Offers valuable insights into illicit drug trends, supporting robust research and evidence-based policy development.
Details about the data sources used in the project:
For any questions or concerns, please contact:
Name: Seunghyun Park