Ploomber AI Editor | data-visualizer-5d2a

App Description

Create an application that has a text field to enter a URL (default value: https://raw.githubusercontent.com/mwaskom/seaborn-data/master/penguins.csv). The app should read the CSV file from the URL, cache it, and display a DataFrame. Add controls to do the following:

- Control how many rows are displayed
- A checkbox that determines whether to randomly shuffle the DataFrame

Users select one or two variables, and you must determine the best way to visualize them.
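
The last requirement, picking a chart from the selected variables, could be approached by inspecting column dtypes. The following is a minimal sketch under that assumption; `choose_chart` and the chart names it returns are hypothetical and not part of app.py, which only offers numeric columns and therefore always ends up with a histogram or a scatter plot.

```python
from typing import Optional

import pandas as pd
from pandas.api.types import is_numeric_dtype


def choose_chart(df: pd.DataFrame, x: str, y: Optional[str] = None) -> str:
    """Return the name of a chart type suited to the selected column(s).

    Hypothetical helper: the mapping below is one reasonable convention,
    not a rule taken from the app description.
    """
    x_num = is_numeric_dtype(df[x])
    if y is None:
        # One variable: distribution for numbers, counts for categories
        return "histogram" if x_num else "count plot"
    y_num = is_numeric_dtype(df[y])
    if x_num and y_num:
        return "scatter plot"
    if x_num or y_num:
        # One numeric and one categorical variable
        return "box plot"
    # Two categorical variables
    return "heatmap of counts"


# Small made-up frame that mimics the penguins column types
demo = pd.DataFrame(
    {
        "species": ["Adelie", "Gentoo"],
        "island": ["Biscoe", "Dream"],
        "bill_length_mm": [39.1, 46.5],
        "body_mass_g": [3750, 5000],
    }
)
print(choose_chart(demo, "bill_length_mm", "species"))
```

The returned name could then be mapped to the matching seaborn call (for example `sns.boxplot` for "box plot") inside the visualization step.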

Code Editor for app.py

import streamlit as st
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Cache the data loading
@st.cache_data
def load_data(url):
    # NOTE: Using corsproxy.io because we're in a WASM environment. If running locally,
    # you can remove the corsproxy.io prefix.
    if not url.startswith('https://corsproxy.io/?'):
        url = f'https://corsproxy.io/?{url}'
    return pd.read_csv(url)

# App title
st.title('CSV Data Visualizer')

# URL input
url = st.text_input(
    'Enter CSV URL',
    value='https://raw.githubusercontent.com/mwaskom/seaborn-data/master/penguins.csv'
)

try:
    # Load the data
    df = load_data(url)
    
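    # Shuffle control. The fixed random_state keeps the shuffled order
    # stable across Streamlit reruns (the script reruns on every widget change).
    if st.checkbox('Randomly shuffle data'):
        df = df.sample(frac=1, random_state=42).reset_index(drop=True)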
    
    # Number of rows to display
    n_rows = st.slider('Number of rows to display', 1, len(df), min(10, len(df)))
    
    # Display the dataframe
    st.subheader('Data Preview')
    st.dataframe(df.head(n_rows))
    
    # Variable selection
    st.subheader('Variable Selection')
    # Offer every numeric column, not only float64/int64 ones
    cols = df.select_dtypes(include='number').columns
    var1 = st.selectbox('Select first variable', cols)
    var2 = st.selectbox('Select second variable (optional)', ['None'] + list(cols))
    
    # Visualization
    st.subheader('Visualization')
    
    if var2 == 'None':
        # Single variable: histogram with a kernel density estimate overlay
        fig, ax = plt.subplots()
        sns.histplot(data=df, x=var1, kde=True, ax=ax)
        ax.set_title(f'Distribution of {var1}')
        st.pyplot(fig)
    else:
        # Two variables: scatter plot
        fig, ax = plt.subplots()
        sns.scatterplot(data=df, x=var1, y=var2, ax=ax)
        ax.set_title(f'{var1} vs {var2}')
        st.pyplot(fig)
        
        # Show correlation
        correlation = df[var1].corr(df[var2])
        st.write(f'Correlation between {var1} and {var2}: {correlation:.2f}')

except Exception as e:
    st.error(f'Error: {e}')
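
The shuffle and row-limit steps in the listing above can also be expressed as a plain pandas function, which makes them easy to check without running Streamlit. This is a sketch only: `prepare_preview` is a hypothetical helper that is not part of app.py, and clamping `n_rows` to the frame length is an added assumption.

```python
import pandas as pd


def prepare_preview(df: pd.DataFrame, n_rows: int,
                    shuffle: bool = False, seed: int = 42) -> pd.DataFrame:
    """Return up to n_rows rows of df, optionally shuffled first."""
    if shuffle:
        # Same call as in app.py: a full-size sample is a shuffle
        df = df.sample(frac=1, random_state=seed).reset_index(drop=True)
    # Clamp the requested row count to the valid range (assumption)
    n_rows = max(0, min(n_rows, len(df)))
    return df.head(n_rows)


# Made-up data for illustration
frame = pd.DataFrame({"value": [10, 20, 30, 40, 50]})
print(prepare_preview(frame, 3, shuffle=True))
```

In the app, the checkbox and slider values would be passed in as `shuffle` and `n_rows`.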
