ALL BUSINESS DATA ANALYSIS ENGLISH ARTICLES MICROSOFT EXCEL TIPS PYTHON

Use of chatGPT and YouTube to become a Data Analyst

Use of chatGPT and YouTube to become a Data Analyst – Χρήση του chatGPT και του YouTube για να κάνει κάποιος ανάλυση δεδομένων

Code file: app.py

Folder: C:\PythonPrograms\data_analysis_with_chatGPT

 

1) Merge multiple excel files – Συγχώνευση πολλών αρχείων excel

data1.xlsx

data2.xlsx

data3.xlsx

data4.xlsx

 

2) Clean data – Καθαρισμός δεδομένων

 

3) Create two excel reports (Total Revenue, Expenses and Profit by Category) and (Total Revenue, Expenses and Profit by Country) – Δημιουργία δύο αρχείων excel (category_report.xlsx and country_report.xlsx)

 

4) Add a chart to category_report.xlsx – Προσθήκη γραφήματος στο αρχείο excel category_report.xlsx

Category report

5) Add a chart to country_report.xlsx – Προσθήκη γραφήματος στο αρχείο excel country_report.xlsx

6) Build two interactive plots (category_chart.html and country_chart.html)

Category report htmlCountry report html

7) Create a streamlit dashboard with 2 graphs – Δημιουργία ενός πίνακα προβολής και διαχείρισης streamlit με 2 γραφήματα

Streamlit presentation

με τη βοήθεια της γλώσσας Python – with Python.

 

Changed code by Trifonas Papadopoulos:

#Code starts here—————————

import xlsxwriter

import pandas as pd

import os

import plotly.express as px

import streamlit as st

 

# Define the path to the data folder

data_folder = ‘data’

 

# Get a list of all the Excel files in the data folder

excel_files = [f for f in os.listdir(data_folder) if f.endswith(‘.xlsx’)]

 

# Create an empty dataframe to hold the combined data

combined_data = pd.DataFrame()

 

# Loop through each Excel file and append its data to the combined_data dataframe

for file in excel_files:

# Load the Excel file into a dataframe

df = pd.read_excel(os.path.join(data_folder, file))

 

# Add a column to the dataframe to indicate the source file

df[‘Source’] = file

 

# Append the data to the combined_data dataframe

combined_data = pd.concat([combined_data, df], ignore_index=True)

 

# Drop duplicate rows

combined_data.drop_duplicates(inplace=True)

 

# Handle missing values

combined_data.dropna(inplace=True)

 

# Print the combined data

print(combined_data)

 

# Category Statistics

 

# Calculate total revenue, expenses, and profit for each category

category_data = combined_data.groupby(‘Category’).agg({‘Revenue’: ‘sum’, ‘Expenses’: ‘sum’})

category_data[‘Profit’] = category_data[‘Revenue’] – category_data[‘Expenses’]

 

# Print the category data

print(category_data)

 

# Create an interactive chart using Plotly

fig = px.bar(category_data, x=category_data.index, y=[‘Revenue’, ‘Expenses’, ‘Profit’], barmode=’group’, title=’Total Revenue, Expenses and Profit by Category’)

 

# Save the chart as an HTML file

chart_name = ‘category_chart.html’

fig.write_html(chart_name)

 

print(f’Chart saved to {chart_name}.’)

 

# Create an Excel report

report_name = ‘category_report.xlsx’

workbook = xlsxwriter.Workbook(report_name)

worksheet = workbook.add_worksheet()

 

# Write the category data to the report

worksheet.write(0, 0, ‘Category’)

worksheet.write(0, 1, ‘Revenue’)

worksheet.write(0, 2, ‘Expenses’)

worksheet.write(0, 3, ‘Profit’)

 

row = 1

for index, row_data in category_data.iterrows():

worksheet.write(row, 0, index)

worksheet.write(row, 1, row_data[‘Revenue’])

worksheet.write(row, 2, row_data[‘Expenses’])

worksheet.write(row, 3, row_data[‘Profit’])

row += 1

 

# Create a chart representing the data

chart = workbook.add_chart({‘type’: ‘column’})

 

chart.add_series({

‘name’: ‘Revenue’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 1, row – 1, 1],

})

 

chart.add_series({

‘name’: ‘Expenses’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 2, row – 1, 2],

})

 

chart.add_series({

‘name’: ‘Profit’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 3, row – 1, 3],

})

 

chart.set_title({‘name’: ‘Total Revenue, Expenses and Profit by Category’})

chart.set_x_axis({‘name’: ‘Category’})

chart.set_y_axis({‘name’: ‘Amount’})

chart.set_legend({‘position’: ‘bottom’})

 

worksheet.insert_chart(‘F1′, chart)

 

# Close the workbook

workbook.close()

 

print(f’Report saved to {report_name}.’)

 

# Country Statistics

 

# Calculate total revenue, expenses, and profit for each category

country_data = combined_data.groupby(‘Country’).agg({‘Revenue’: ‘sum’, ‘Expenses’: ‘sum’})

country_data[‘Profit’] = country_data[‘Revenue’] – country_data[‘Expenses’]

 

# Print the category data

print(country_data)

 

# Create an interactive chart using Plotly

fig = px.bar(country_data, x=country_data.index, y=[‘Revenue’, ‘Expenses’, ‘Profit’], barmode=’group’, title=’Total Revenue, Expenses and Profit by Country’)

 

# Save the chart as an HTML file

chart_name = ‘country_chart.html’

fig.write_html(chart_name)

 

print(f’Chart saved to {chart_name}.’)

 

# Create an Excel report

report_name = ‘country_report.xlsx’

workbook = xlsxwriter.Workbook(report_name)

worksheet = workbook.add_worksheet()

 

# Write the category data to the report

worksheet.write(0, 0, ‘Country’)

worksheet.write(0, 1, ‘Revenue’)

worksheet.write(0, 2, ‘Expenses’)

worksheet.write(0, 3, ‘Profit’)

 

row = 1

for index, row_data in country_data.iterrows():

worksheet.write(row, 0, index)

worksheet.write(row, 1, row_data[‘Revenue’])

worksheet.write(row, 2, row_data[‘Expenses’])

worksheet.write(row, 3, row_data[‘Profit’])

row += 1

 

# Create a chart representing the data

chart = workbook.add_chart({‘type’: ‘column’})

 

chart.add_series({

‘name’: ‘Revenue’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 1, row – 1, 1],

})

 

chart.add_series({

‘name’: ‘Expenses’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 2, row – 1, 2],

})

 

chart.add_series({

‘name’: ‘Profit’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 3, row – 1, 3],

})

 

chart.set_title({‘name’: ‘Total Revenue, Expenses and Profit by Country’})

chart.set_x_axis({‘name’: ‘Country’})

chart.set_y_axis({‘name’: ‘Amount’})

chart.set_legend({‘position’: ‘bottom’})

 

worksheet.insert_chart(‘F1′, chart)

 

# Close the workbook

workbook.close()

 

print(f’Report saved to {report_name}.’)

 

# Create a Streamlit dashboard

st.set_page_config(page_title=’Total Revenue, Expenses and Profit’, page_icon=’:chart_with_upwards_trend:’)

st.title(‘Total Revenue, Expenses and Profit’)

 

st.subheader(‘Grouped Data by Category’)

 

# Create an interactive chart using Plotly and display it in the dashboard

fig = px.bar(category_data, x=category_data.index, y=[‘Revenue’, ‘Expenses’, ‘Profit’], barmode=’group’)

st.plotly_chart(fig)

 

# Add a second graph

st.subheader(‘Grouped Data by Country’)

 

# Create an interactive chart using Plotly and display it in the dashboard for second graph

fig2 = px.bar(country_data, x=country_data.index, y=[‘Revenue’, ‘Expenses’, ‘Profit’], barmode=’group’)

st.plotly_chart(fig2)

#Code ends here—————————

 

This code has been changed. But it has been based heavily on code and instructions that you can find in this video on YouTube : I Quit Coding – How I use ChatGPT instead as Data Analyst https://www.youtube.com/watch?v=IAWMMi_VFOI

YouTube Video credit : Coding is Fun https://www.youtube.com/@CodingIsFun

 

Music track: A Positive Direction by Aylex

Source: https://lnkd.in/d4gBFZEU

Royalty Free Music for Video (Safe)

 

#mindstormGR #chatGPT #coding #python #data_analysis / www.mindstorm.gr

Views: 12

Comments are closed.

Pin It

By continuing to use the site, you agree to the use of cookies. / Συνεχίζοντας να χρησιμοποιείτε την ιστοσελίδα, συμφωνείτε με τη χρήση των cookies. more information / περισσότερες πληροφορίες

The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.

Close