ALL BUSINESS DATA ANALYSIS ENGLISH ARTICLES MICROSOFT EXCEL TIPS PYTHON

Use of chatGPT and YouTube to become a Data Analyst

Use of chatGPT and YouTube to become a Data Analyst – Χρήση του chatGPT και του YouTube για να κάνει κάποιος ανάλυση δεδομένων

Code file: app.py

Folder: C:\PythonPrograms\data_analysis_with_chatGPT

 

1) Merge multiple excel files – Συγχώνευση πολλών αρχείων excel

data1.xlsx

data2.xlsx

data3.xlsx

data4.xlsx

 

2) Clean data – Καθαρισμός δεδομένων

 

3) Create two excel reports (Total Revenue, Expenses and Profit by Category) and (Total Revenue, Expenses and Profit by Country) – Δημιουργία δύο αρχείων excel (category_report.xlsx and country_report.xlsx)

 

4) Add a chart to category_report.xlsx – Προσθήκη γραφήματος στο αρχείο excel category_report.xlsx

Category report

5) Add a chart to country_report.xlsx – Προσθήκη γραφήματος στο αρχείο excel country_report.xlsx

6) Build two interactive plots (category_chart.html and country_chart.html)

Category report htmlCountry report html

7) Create a streamlit dashboard with 2 graphs – Δημιουργία ενός πίνακα προβολής και διαχείρισης streamlit με 2 γραφήματα

Streamlit presentation

με τη βοήθεια της γλώσσας Python – with Python.

 

Changed code by Trifonas Papadopoulos:

#Code starts here—————————

import xlsxwriter

import pandas as pd

import os

import plotly.express as px

import streamlit as st

 

# Define the path to the data folder

data_folder = ‘data’

 

# Get a list of all the Excel files in the data folder

excel_files = [f for f in os.listdir(data_folder) if f.endswith(‘.xlsx’)]

 

# Create an empty dataframe to hold the combined data

combined_data = pd.DataFrame()

 

# Loop through each Excel file and append its data to the combined_data dataframe

for file in excel_files:

# Load the Excel file into a dataframe

df = pd.read_excel(os.path.join(data_folder, file))

 

# Add a column to the dataframe to indicate the source file

df[‘Source’] = file

 

# Append the data to the combined_data dataframe

combined_data = pd.concat([combined_data, df], ignore_index=True)

 

# Drop duplicate rows

combined_data.drop_duplicates(inplace=True)

 

# Handle missing values

combined_data.dropna(inplace=True)

 

# Print the combined data

print(combined_data)

 

# Category Statistics

 

# Calculate total revenue, expenses, and profit for each category

category_data = combined_data.groupby(‘Category’).agg({‘Revenue’: ‘sum’, ‘Expenses’: ‘sum’})

category_data[‘Profit’] = category_data[‘Revenue’] – category_data[‘Expenses’]

 

# Print the category data

print(category_data)

 

# Create an interactive chart using Plotly

fig = px.bar(category_data, x=category_data.index, y=[‘Revenue’, ‘Expenses’, ‘Profit’], barmode=’group’, title=’Total Revenue, Expenses and Profit by Category’)

 

# Save the chart as an HTML file

chart_name = ‘category_chart.html’

fig.write_html(chart_name)

 

print(f’Chart saved to {chart_name}.’)

 

# Create an Excel report

report_name = ‘category_report.xlsx’

workbook = xlsxwriter.Workbook(report_name)

worksheet = workbook.add_worksheet()

 

# Write the category data to the report

worksheet.write(0, 0, ‘Category’)

worksheet.write(0, 1, ‘Revenue’)

worksheet.write(0, 2, ‘Expenses’)

worksheet.write(0, 3, ‘Profit’)

 

row = 1

for index, row_data in category_data.iterrows():

worksheet.write(row, 0, index)

worksheet.write(row, 1, row_data[‘Revenue’])

worksheet.write(row, 2, row_data[‘Expenses’])

worksheet.write(row, 3, row_data[‘Profit’])

row += 1

 

# Create a chart representing the data

chart = workbook.add_chart({‘type’: ‘column’})

 

chart.add_series({

‘name’: ‘Revenue’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 1, row – 1, 1],

})

 

chart.add_series({

‘name’: ‘Expenses’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 2, row – 1, 2],

})

 

chart.add_series({

‘name’: ‘Profit’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 3, row – 1, 3],

})

 

chart.set_title({‘name’: ‘Total Revenue, Expenses and Profit by Category’})

chart.set_x_axis({‘name’: ‘Category’})

chart.set_y_axis({‘name’: ‘Amount’})

chart.set_legend({‘position’: ‘bottom’})

 

worksheet.insert_chart(‘F1′, chart)

 

# Close the workbook

workbook.close()

 

print(f’Report saved to {report_name}.’)

 

# Country Statistics

 

# Calculate total revenue, expenses, and profit for each category

country_data = combined_data.groupby(‘Country’).agg({‘Revenue’: ‘sum’, ‘Expenses’: ‘sum’})

country_data[‘Profit’] = country_data[‘Revenue’] – country_data[‘Expenses’]

 

# Print the category data

print(country_data)

 

# Create an interactive chart using Plotly

fig = px.bar(country_data, x=country_data.index, y=[‘Revenue’, ‘Expenses’, ‘Profit’], barmode=’group’, title=’Total Revenue, Expenses and Profit by Country’)

 

# Save the chart as an HTML file

chart_name = ‘country_chart.html’

fig.write_html(chart_name)

 

print(f’Chart saved to {chart_name}.’)

 

# Create an Excel report

report_name = ‘country_report.xlsx’

workbook = xlsxwriter.Workbook(report_name)

worksheet = workbook.add_worksheet()

 

# Write the category data to the report

worksheet.write(0, 0, ‘Country’)

worksheet.write(0, 1, ‘Revenue’)

worksheet.write(0, 2, ‘Expenses’)

worksheet.write(0, 3, ‘Profit’)

 

row = 1

for index, row_data in country_data.iterrows():

worksheet.write(row, 0, index)

worksheet.write(row, 1, row_data[‘Revenue’])

worksheet.write(row, 2, row_data[‘Expenses’])

worksheet.write(row, 3, row_data[‘Profit’])

row += 1

 

# Create a chart representing the data

chart = workbook.add_chart({‘type’: ‘column’})

 

chart.add_series({

‘name’: ‘Revenue’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 1, row – 1, 1],

})

 

chart.add_series({

‘name’: ‘Expenses’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 2, row – 1, 2],

})

 

chart.add_series({

‘name’: ‘Profit’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 3, row – 1, 3],

})

 

chart.set_title({‘name’: ‘Total Revenue, Expenses and Profit by Country’})

chart.set_x_axis({‘name’: ‘Country’})

chart.set_y_axis({‘name’: ‘Amount’})

chart.set_legend({‘position’: ‘bottom’})

 

worksheet.insert_chart(‘F1′, chart)

 

# Close the workbook

workbook.close()

 

print(f’Report saved to {report_name}.’)

 

# Create a Streamlit dashboard

st.set_page_config(page_title=’Total Revenue, Expenses and Profit’, page_icon=’:chart_with_upwards_trend:’)

st.title(‘Total Revenue, Expenses and Profit’)

 

st.subheader(‘Grouped Data by Category’)

 

# Create an interactive chart using Plotly and display it in the dashboard

fig = px.bar(category_data, x=category_data.index, y=[‘Revenue’, ‘Expenses’, ‘Profit’], barmode=’group’)

st.plotly_chart(fig)

 

# Add a second graph

st.subheader(‘Grouped Data by Country’)

 

# Create an interactive chart using Plotly and display it in the dashboard for second graph

fig2 = px.bar(country_data, x=country_data.index, y=[‘Revenue’, ‘Expenses’, ‘Profit’], barmode=’group’)

st.plotly_chart(fig2)

#Code ends here—————————

 

This code has been changed. But it has been based heavily on code and instructions that you can find in this video on YouTube : I Quit Coding – How I use ChatGPT instead as Data Analyst https://www.youtube.com/watch?v=IAWMMi_VFOI

YouTube Video credit : Coding is Fun https://www.youtube.com/@CodingIsFun

 

Music track: A Positive Direction by Aylex

Source: https://lnkd.in/d4gBFZEU

Royalty Free Music for Video (Safe)

 

#mindstormGR #chatGPT #coding #python #data_analysis / www.mindstorm.gr

Views: 17

Comments are closed.

Pin It