Data Portfolio


DATA ANALYTICS

PORTFOLIO

PROJECT 1. FARFETCH SALES IN US

In this project, I analyzed one day of sales data for Farfetch in the USA. I used Python Pandas for data extraction and analysis, and Tableau and Pandas plots to visualize the results.

The project began with adapting the provided CSV data for further analysis. This involved refining the spreadsheet by removing columns with redundant information. Because the analysis was exploratory, I chose to focus on specific fields of interest.
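The preparation step described above could look roughly like the sketch below. The sample rows and the choice of which columns count as redundant are assumptions made for illustration; the portfolio does not say which columns were actually removed.

```python
import io

import pandas as pd

# Hypothetical miniature of the CSV export (the real file has 14 columns).
SAMPLE_CSV = """Competence date,Brand,Category 2,Product page URL,Product image URL,Flag discounted
2023-06-23,BRAND A,DRESSES,https://example.com/p/1,https://example.com/i/1.jpg,0
2023-06-23,BRAND B,DRESSES,https://example.com/p/2,https://example.com/i/2.jpg,1
"""

# Assumption: the URL columns are the ones treated as redundant.
REDUNDANT_COLUMNS = ["Product page URL", "Product image URL"]


def load_sales(source) -> pd.DataFrame:
    """Read the CSV, parse the date column, and drop unneeded columns."""
    df = pd.read_csv(source, parse_dates=["Competence date"])
    # errors="ignore" keeps the call safe if a column is already absent.
    return df.drop(columns=REDUNDANT_COLUMNS, errors="ignore")


FF = load_sales(io.StringIO(SAMPLE_CSV))
print(FF.columns.tolist())
```

With the real file, `load_sales` would receive the CSV path instead of the in-memory sample.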

First, I examined the most popular product categories and identified the 10 to 30 most in-demand brands and subcategories. I then investigated the ratio between discounted and non-discounted products, as well as the categories with the highest proportion of discounted sales.
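A minimal sketch of these two questions in Pandas follows. It assumes that `Flag discounted` holds 1 for discounted rows and 0 otherwise; the brand names and rows are placeholders, not values from the dataset.

```python
import pandas as pd

# Placeholder rows; column names follow the dataset's info() listing.
FF = pd.DataFrame({
    "Brand": ["BRAND A", "BRAND A", "BRAND B", "BRAND C", "BRAND A", "BRAND B"],
    "Category 1": ["CLOTHING", "CLOTHING", "CLOTHING", "CLOTHING", "SHOES", "SHOES"],
    "Flag discounted": [1, 0, 1, 1, 0, 0],
})


def top_n(df: pd.DataFrame, column: str, n: int = 10) -> pd.Series:
    """Return the n most frequent values of a column with their counts."""
    return df[column].value_counts().head(n)


top_brands = top_n(FF, "Brand", n=2)

# Mean of a 0/1 flag equals the share of discounted rows.
discount_share = FF["Flag discounted"].mean()

# Share of discounted rows per category, highest first.
discount_by_category = (
    FF.groupby("Category 1")["Flag discounted"].mean().sort_values(ascending=False)
)

print(top_brands)
print(discount_share)
print(discount_by_category)
```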

It is important to note that I did not compare different dates. It is therefore plausible that the significant differences observed in certain categories are explained by this being the first day of sales for those categories.

Overall, this analysis provides insight into the sales patterns and dynamics of Farfetch in the USA on a single day. These findings can help in making informed business decisions and understanding customer preferences.

Number of unique brands: 3,400

Currency: USD, EUR

Date: June 23, 2023

Destination: US

Total number of items sold: 588,825

Most expensive item: USD 1,407,796.68, pre-owned fine watches from Audemars Piguet

Lowest price: USD 6.43, a pencil sharpener from the makeup category

One-day sales turnover in local currency (USD): 465,992,100.54
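Headline figures of this kind can be derived with a few Pandas calls, as in the sketch below. Which price column the portfolio used is not stated, so the use of `Discounted price in local currency` is an assumption, and the rows are placeholders.

```python
import pandas as pd

# Placeholder rows; the price column choice is an assumption.
PRICE = "Discounted price in local currency"
FF = pd.DataFrame({
    "Brand": ["BRAND A", "BRAND B", "BRAND A", "BRAND C"],
    "Category 2": ["WATCHES", "MAKEUP", "DRESSES", "DRESSES"],
    PRICE: [100.0, 6.5, 250.0, 40.0],
})

summary = {
    "unique_brands": FF["Brand"].nunique(),
    "items": len(FF),
    "turnover": FF[PRICE].sum(),
}

# idxmax / idxmin return the row label of the extreme price.
most_expensive = FF.loc[FF[PRICE].idxmax()]
cheapest = FF.loc[FF[PRICE].idxmin()]

print(summary)
print(most_expensive["Brand"], most_expensive[PRICE])
print(cheapest["Brand"], cheapest[PRICE])
```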

THE COLUMN TYPES AND THE TOTAL NUMBER OF OBJECTS WERE DETERMINED AT THE BEGINNING

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 588825 entries, 0 to 588824
Data columns (total 14 columns):
 #   Column                              Non-Null Count   Dtype
---  ------                              --------------   -----
 0   Competence date                     588825 non-null  datetime64[ns]
 1   Brand                               588825 non-null  object
 2   Category 1                          588825 non-null  object
 3   Category 2                          588825 non-null  object
 4   Category 3                          365920 non-null  object
 5   Product code                        588825 non-null  int64
 6   Product title                       588825 non-null  object
 7   Product page URL                    588825 non-null  object
 8   Product image URL                   588825 non-null  object
 9   Full price in local currency        588825 non-null  float64
 10  Discounted price in local currency  588825 non-null  float64
 11  Full price in EUR                   588825 non-null  float64
 12  Discounted price in EUR             588825 non-null  float64
 13  Flag discounted                     588825 non-null  int64
dtypes: datetime64[ns](1), float64(4), int64(2), object(7)
memory usage: 62.9+ MB

Dresses is one of the most popular categories; its 10 most popular brands were sorted out

DRESSES_DF = FF[FF["Category 2"] == "DRESSES"]
DRESSES_DF["Brand"].value_counts().head(10)

P.A.R.O.S.H.                498
DOLCE & GABBANA             465
SELF-PORTRAIT               346
ZIMMERMANN                  333
ULLA JOHNSON                318
PATRIZIA PEPE               282
MICHELLE MASON              270
PINKO                       256
GANNI                       247
DVF DIANE VON FURSTENBERG   243

Day Dresses analysis, top 10 brands

P.A.R.O.S.H.                293
ULLA JOHNSON                281
ZIMMERMANN                  240
GANNI                       222
DOLCE & GABBANA             210
DVF DIANE VON FURSTENBERG   203
B+AB                        184
PATRIZIA PEPE               176
CAMILLA                     173
RIXO                        151

PERCENTAGE OF FULL PRICE ITEMS AND DISCOUNTED ITEMS

30 MOST POPULAR SHOE BRANDS, TOTAL SALES IN USD
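A ranking of this kind can be sketched as a filter followed by a grouped sum. The category value `"SHOES"`, its placement in `Category 1`, and the price column are assumptions; the brands and amounts are placeholders.

```python
import pandas as pd

# Placeholder rows; "SHOES" in "Category 1" is an assumed category value.
PRICE = "Discounted price in local currency"
FF = pd.DataFrame({
    "Brand": ["BRAND A", "BRAND B", "BRAND A", "BRAND C", "BRAND B", "BRAND C"],
    "Category 1": ["SHOES", "SHOES", "SHOES", "SHOES", "BAGS", "SHOES"],
    PRICE: [120.0, 890.0, 150.0, 950.0, 2100.0, 700.0],
})


def top_brands_by_sales(df: pd.DataFrame, category: str, n: int = 30) -> pd.Series:
    """Total sales per brand within one category, largest first."""
    subset = df[df["Category 1"] == category]
    return subset.groupby("Brand")[PRICE].sum().nlargest(n)


top_shoes = top_brands_by_sales(FF, "SHOES", n=2)
print(top_shoes)
```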


CATEGORIES WHERE ITEMS WERE SOLD MOSTLY WITH DISCOUNTS; DISCOUNTS MIGHT HAVE BEEN THE PRIMARY DRIVER OF SALES IN THESE CATEGORIES

SHOES AND ACTIVEWEAR WEREN'T AFFECTED MUCH BY DISCOUNTS

PROJECT 2. TOP 100 FASTEST GROWING COMPANIES IN EUROPE ANALYSIS

30 European countries

The oldest company was established in Spain in 1898

There are 66 companies with 5 or fewer employees

39 business sectors
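Figures like the ones above could be computed along the following lines. Only the `Country` and `Sector` column names appear in the portfolio; `Name`, `Founding year`, and `Employees` are hypothetical column names, and the rows are placeholders.

```python
import pandas as pd

# Placeholder rows; "Name", "Founding year" and "Employees" are hypothetical columns.
EUC = pd.DataFrame({
    "Name": ["Alpha", "Beta", "Gamma", "Delta"],
    "Country": ["Spain", "Italy", "UK", "Italy"],
    "Sector": ["Technology", "Construction", "Technology", "Energy"],
    "Founding year": [1898, 2012, 2015, 2010],
    "Employees": [120, 4, 5, 30],
})

n_countries = EUC["Country"].nunique()
n_sectors = EUC["Sector"].nunique()

# Row of the company with the earliest founding year.
oldest = EUC.loc[EUC["Founding year"].idxmin()]

# Companies with 5 or fewer employees.
small_companies = EUC[EUC["Employees"] <= 5]

print(n_countries, n_sectors)
print(oldest["Country"], oldest["Founding year"])
print(len(small_companies))
```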

The following table gives a short overview of the data provided.

EUC["Country"].nunique()

Sectoral concentration across countries shows that the countries with the largest income specialize in or cover the majority of sectors. The technology sector is the most popular.

This table shows the popularity and growth of business sectors from 1989 to 2020.

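import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(8, 4))
# hue is set so that the palette is applied to the sector bars
sns.histplot(data=UK, x='Sector', hue='Sector', palette='Pastel1', legend=False)
plt.xticks(rotation=90)
plt.title('Employees Distribution in UK', fontsize=20)
plt.show()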

The same analysis was repeated in Tableau in a more attractive and simpler way.

Based on the data obtained, it is also possible to conduct a comparative analysis of the distribution of employees across sectors.

Changes in the number of companies ranked in 2017 and 2020

Next, I decided to investigate the UK market separately because it is no longer part of the European Union.

The UK can be removed from the list of European countries with the following command:

data_without_uk = EUC[EUC['Country'] != 'UK']
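As a small illustration of the split, the sketch below builds both the UK-only frame and its complement from placeholder rows. It assumes the country is stored as the literal string `'UK'`, as in the command above.

```python
import pandas as pd

# Placeholder rows; the country label 'UK' follows the command in the text.
EUC = pd.DataFrame({
    "Country": ["UK", "Italy", "UK", "Germany", "Spain"],
    "Sector": ["Technology", "Energy", "Retail", "Technology", "Construction"],
})

# Boolean mask: True for UK rows.
is_uk = EUC["Country"] == "UK"

UK = EUC[is_uk].reset_index(drop=True)
data_without_uk = EUC[~is_uk].reset_index(drop=True)

print(len(UK), len(data_without_uk))
```

Building both frames from one mask guarantees that every row ends up in exactly one of them.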

Interesting fact: 66 of the companies have 5 or fewer employees, and 10 of those companies are among the first 100.

Italy has the highest concentration of companies.

Based on these two graphs, it can be concluded that positive changes are observed in all sectors in a more or less systematic manner. However, the construction and financial services sectors have switched positions, with financial services surpassing construction in revenue by 2020. Similarly, the transportation sector, which was previously ranked fifth in popularity, has declined to tenth position.
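A position swap like the one described can be checked numerically by ranking sectors on total revenue for each year. The sketch below uses hypothetical column names `Revenue 2017` and `Revenue 2020` and invented amounts, so it only illustrates the technique.

```python
import pandas as pd

# Invented amounts; "Revenue 2017" and "Revenue 2020" are hypothetical columns.
EUC = pd.DataFrame({
    "Sector": ["Technology", "Construction", "Financial Services",
               "Technology", "Transport", "Financial Services"],
    "Revenue 2017": [50.0, 40.0, 30.0, 20.0, 10.0, 5.0],
    "Revenue 2020": [90.0, 45.0, 40.0, 35.0, 12.0, 15.0],
})

# Total revenue per sector for each year.
by_sector = EUC.groupby("Sector")[["Revenue 2017", "Revenue 2020"]].sum()

# Rank 1 is the sector with the highest revenue in that year.
ranks = by_sector.rank(ascending=False, method="min").astype(int)

# Positive change means the sector moved up between 2017 and 2020.
ranks["Change"] = ranks["Revenue 2017"] - ranks["Revenue 2020"]

print(ranks.sort_values("Revenue 2020"))
```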

Comparison of the revenues between sectors

French Bakery Profit Analysis

The main goal was to extract the data needed to determine the best-selling items in the bakery, as well as the most popular days and times of day among visitors. Such data make it possible to optimize working hours, introduce additional services, and produce certain products only on their best-selling days, thereby reducing costs and emissions.

Python (Pandas)

I prefer to use Python Pandas for extracting the necessary data and for presenting the information graphically through additional libraries such as seaborn and matplotlib. In this case, I created a "sold per minute" column and, to simplify further research, grouped the time into 60-minute intervals, which made it possible to find the most popular visiting hours of the bakery.
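The hourly grouping could be sketched as follows. The column names `datetime`, `article`, and `quantity` are hypothetical, and the rows are placeholders rather than the bakery's data.

```python
import pandas as pd

# Placeholder transactions; column names are hypothetical.
sales = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2021-01-02 08:38", "2021-01-02 08:55",
        "2021-01-02 09:14", "2021-01-03 12:05",
    ]),
    "article": ["BAGUETTE", "CROISSANT", "BAGUETTE", "BAGUETTE"],
    "quantity": [1, 3, 2, 1],
})

# Each timestamp is assigned to its 60-minute interval via the hour of day.
sales["hour"] = sales["datetime"].dt.hour

sold_per_hour = sales.groupby("hour")["quantity"].sum()
busiest_hour = sold_per_hour.idxmax()

# Best-selling items by total quantity (top 20 at most).
top_items = sales.groupby("article")["quantity"].sum().nlargest(20)

print(sold_per_hour)
print(busiest_hour)
print(top_items)
```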

Excel (SQL)

In the Excel table, I replaced the weekday numbers, which run from 0 to 6 by default in Python, with the names of the days of the week. It was also necessary to review and adapt all the data for presentation in the dashboard.
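The same replacement can also be done in Pandas before exporting, as in this sketch. The `date` and `quantity` columns are hypothetical; in Pandas, `dt.dayofweek` numbers the days from Monday (0) to Sunday (6).

```python
import pandas as pd

# Mapping from pandas weekday numbers (Monday=0) to names.
DAY_NAMES = {
    0: "Monday", 1: "Tuesday", 2: "Wednesday", 3: "Thursday",
    4: "Friday", 5: "Saturday", 6: "Sunday",
}

# Placeholder rows; column names are hypothetical.
sales = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-02", "2021-01-03", "2021-01-04"]),
    "quantity": [5, 7, 3],
})

sales["weekday_number"] = sales["date"].dt.dayofweek
sales["weekday"] = sales["weekday_number"].map(DAY_NAMES)

print(sales[["date", "weekday_number", "weekday"]])
```

`sales["date"].dt.day_name()` gives the same names directly; the explicit mapping is shown because it mirrors the manual replacement described above.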

Tableau

In Tableau, I determined the best-selling bakery products (top 20), as well as the most visited days and times of day. In fact, the data obtained repeats what was done in Python Pandas, but it took less time.

PROJECT 3. FRENCH BAKERY

Tableau

Most popular bread items

The most popular weekdays

The busiest and quietest hours

Data extraction Pandas Python

Screenshots


PROJECT 4. CYCLISTIC DATA ANALYSIS - CASE STUDY

TO EXTRACT THE NECESSARY INFORMATION, THE FOLLOWING DATA WAS ANALYZED

 #   Column              Non-Null Count   Dtype
---  ------              --------------   -----
 0   ride_id             769204 non-null  object
 1   rideable_type       769204 non-null  object
 2   started_at          769204 non-null  datetime64[ns]
 3   ended_at            769204 non-null  datetime64[ns]
 4   start_station_name  676260 non-null  object
 5   start_station_id    676260 non-null  object
 6   end_station_name    669052 non-null  object
 7   end_station_id      669052 non-null  object
 8   start_lat           769204 non-null  float64
 9   start_lng           769204 non-null  float64
 10  end_lat             768149 non-null  float64
 11  end_lng             768149 non-null  float64
 12  member_casual       769204 non-null  object
dtypes: datetime64[ns](2), float64(4), object(7)
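The non-null counts in this listing already show how much station and end-coordinate data is missing. The short calculation below derives the gaps from the listed counts; the variable names are illustrative. On the loaded DataFrame itself, `isna().sum()` would give the same counts.

```python
# Total number of rides, taken from the listing (769204 non-null ride ids).
TOTAL_ROWS = 769204

# Non-null counts of the columns that have gaps, as listed.
non_null = {
    "start_station_name": 676260,
    "end_station_name": 669052,
    "end_lat": 768149,
}

# Missing rows per column and their share of all rides in percent.
missing = {col: TOTAL_ROWS - count for col, count in non_null.items()}
missing_share = {col: 100 * n / TOTAL_ROWS for col, n in missing.items()}

for col, n in missing.items():
    print(f"{col}: {n} missing ({missing_share[col]:.2f}%)")
```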
2023 Thank you Regina Zuyeva regina zuyeva@gmail.com +351 969135262
