This shows that Starbucks generates $18.1 in sales for every $1 of inventory it holds, a slight but not significant increase over the prior financial year. I wanted to see the influence of these offers on purchases. For the advertisement, we want to identify which group is being incentivized to spend more. Once everything is inside a single dataframe (i.e. transcript), we can split it into 3 types: BOGO, discount and info. We can know how confident we are about a specific prediction. Starbucks attributes 40% of its total sales to the Rewards Program and has seen same-store sales rise by 7%. Today, with stores around the globe, the Company is the premier roaster and retailer of specialty coffee in the world. Starbucks, one of the world's most popular coffee chains, frequently provides offers to its customers through its rewards app to drive more sales. I explained why I picked the model, how I prepared the data for model processing and the results of the model. The downside is that the accuracy on a larger dataset may be higher than on smaller ones. I found a data set on Starbucks coffee, and got really excited. Let's first take a look at the data. Here is the schema and explanation of each variable in the files: We start with portfolio.json and observe what it looks like. So, in conclusion, to answer "What is the spending pattern based on offer type and demographics?" Not all users receive the same offer, and that is the challenge to solve with this dataset. The reasons I used downsampling instead of other methods such as upsampling or SMOTE were: 1) we have sufficient data even after downsampling, and 2) to my understanding, the imbalance was not due to a biased data collection process but to having fewer available samples. What are the main drivers of an effective offer? Starbucks Rewards loyalty program 90-day active members in the U.S. increased to 24.8 million, up 28% year-over-year. Full-year fiscal 2021 highlights: global comparable store sales increased 20%, primarily driven by a 10% increase in average ticket and a 9% increase in comparable transactions. It would be very helpful to increase my model accuracy to above 85%. There are two ways to approach this. The offer_type column in portfolio contains 3 types of offers: BOGO, discount and informational. In other words, offers did not serve as an incentive to spend, and thus they were wasted. But we notice from our discussion above that both discount and BOGO have almost the same number of offers.
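The downsampling mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code used in the original analysis: the function name `downsample`, the `effective` label and the toy rows are all hypothetical, and plain Python lists stand in for the dataframe.

```python
import random
from collections import defaultdict


def downsample(rows, label_key, seed=0):
    """Randomly drop rows from larger classes so every class matches the smallest one."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    # Size of the smallest class; every class is cut down to this size.
    target = min(len(group) for group in by_label.values())

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 8 ineffective offers (label 0) and 3 effective ones (label 1).
rows = [{"id": i, "effective": 0} for i in range(8)]
rows += [{"id": 100 + i, "effective": 1} for i in range(3)]

balanced = downsample(rows, "effective")
counts = {label: sum(1 for r in balanced if r["effective"] == label) for label in (0, 1)}
print(counts)  # {0: 3, 1: 3}
```

Downsampling throws data away, so it only makes sense when, as argued above, enough samples remain afterwards.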
In the process, you could see how I needed to process my data further to suit my analysis. 2021 Starbucks Corporation. Q4 Consolidated Net Revenues Up 31% to a Record $8.1 Billion. BOGO and discount offers, however, were distributed evenly. We see that there are 306534 person and offer_id records. This is the sort of information we were looking for. Female participation dropped in 2018 more sharply than men's. There are only 4 demographic attributes that we can work with: age, income, gender and membership start date. Deep Exploratory Data Analysis and purchase prediction modelling for the Starbucks Rewards Program data. value (category/numeric): when event = transaction, value is numeric; otherwise it is categorical, with offer ids as categories. After I played around with the data a bit, I also decided to focus only on the BOGO and discount offers for this analysis, for 2 main reasons. This is the primary distinction represented by PC0. I also highlighted the most difficult part of handling the data and how I approached the problem. We start off with a simple PCA analysis of the dataset on ['age', 'income', 'M', 'F', 'O', 'became_member_year']. The goal of this project is to combine transaction, demographic, and offer data to determine which demographic groups respond best to which offer type. In this capstone project, I was free to analyze the data in my way. In addition, it would be helpful if I could build a machine learning model to predict when this is likely to happen. For BOGO and discount we have a reasonable accuracy. time (int): time in hours since the start of the test.
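Because the `value` field changes shape with the event type, it is usually easier to flatten it before any analysis. The sketch below shows one possible way to do that with plain dictionaries. The helper name `flatten_value` and the sample records are hypothetical, and the alternative key spelling `offer_id` is an assumption about the raw data rather than something stated here.

```python
def flatten_value(record):
    """Return a copy of a transcript record with the `value` dict split into flat fields."""
    value = record.get("value") or {}
    flat = {key: val for key, val in record.items() if key != "value"}

    # Assumption: the offer id may be stored under "offer id" or "offer_id".
    flat["offer_id"] = value.get("offer id", value.get("offer_id"))
    flat["amount"] = value.get("amount")  # only present for transactions
    flat["reward"] = value.get("reward")  # only present for completed offers
    return flat


# Hypothetical sample records in the shape described by the schema.
events = [
    {"person": "p1", "event": "offer received", "time": 0, "value": {"offer id": "abc"}},
    {"person": "p1", "event": "transaction", "time": 6, "value": {"amount": 12.5}},
]

flat_events = [flatten_value(e) for e in events]
for row in flat_events:
    print(row)
```

After this step every record has the same columns, with `None` where a field does not apply to that event type.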
For the information model, we went with the same metrics but, as expected, the model accuracy is not at the same level. Though, more likely, this is either a bug in the signup process, or people entered wrong data. Of course, when a dataset is highly imbalanced, the accuracy score will not be a good indicator of the actual performance; a precision score, F1 score or a confusion matrix will be better. This is a decrease of 16.3 percent, or about 10 million units, compared to the same quarter in 2015. The best model achieved 71% cross-validation accuracy and a 75% precision score. However, there's no significant difference between the 2 offers just by eyeballing them. I tried different types of RF classification. First Starbucks outside North America opens: 1996 (Tokyo). Starbucks purchases Tazo Tea: 1999. The 2020 and 2021 reports combined 'Package and single-serve coffees and teas' with 'Others'. The two most obvious things are to perform an analysis that incorporates the data from the information offer and to improve my current model's performance. eliminate offers that last for 10 days, put max. Expanding a bit more on this. I decided to investigate this. A proportion of the profile dataset has missing values, and they will be addressed later in this article.
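To make the point about imbalanced data concrete, here is a small sketch that computes accuracy, precision, recall and F1 from the confusion-matrix counts. The function name `classification_metrics` and the toy labels are illustrative assumptions; the original analysis presumably relied on a library for these scores.

```python
def classification_metrics(y_true, y_pred):
    """Compute confusion-matrix counts and the usual scores for binary labels (0/1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical imbalanced case: 90 negatives, 10 positives, model always predicts 0.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100

metrics = classification_metrics(y_true, y_pred)
print(metrics["accuracy"], metrics["precision"], metrics["recall"], metrics["f1"])
# 0.9 0.0 0.0 0.0
```

A model that never predicts the minority class still scores 90% accuracy here, which is why precision, F1 or the confusion matrix is the better guide.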
Type-2: these consumers viewed the offer but did not complete it. The Starbucks Company started as a small retail company supplying coffee to its consumers in Seattle, Washington, in 1971. Starbucks Offer Dataset Udacity Capstone | by Linda Chen | Towards Data Science. Informational: This type of offer has no discount or minimum amount to spend. On average, women spend around $6 more per purchase at Starbucks. In particular, higher-than-average age, and lower-than-average income. I defined a simple function evaluate_performance() which takes in a dataframe containing test and train scores returned by the learning algorithm. To be explicit, the key success metric is whether I have a clear answer to all the questions that I listed above. As you can see, the design of the offer did make a difference. So, in this blog, I will try to explain what I did. U.S. same-store sales increased by 22% in the quarter, and rose 11% on a two-year basis. In addition, that column was a dictionary object. Here is the information about the offers, sorted by how many times they were used without being noticed. This dataset release re-geocodes all of the addresses for the us_starbucks dataset. Let's get started! Here is an article I wrote to catch you up. This shows that there are more men than women in the customer base. Starbucks locations scraped from the Starbucks website by Chris Meller. Through our unwavering commitment to excellence and our guiding principles, we bring the unique Starbucks Experience to life for every customer through every cup. profile.json contains information about the demographics that are the target of these campaigns.
A link to part 2 of this blog can be found here. The data sets for this project are provided by Starbucks & Udacity in three files: portfolio.json containing offer ids and meta data about each offer (duration, type, etc.). Submission for the Udacity Capstone challenge. The question of how to save money is not about not spending, but about not spending money on ineffective things. Sales in new growth platforms Tails.com, Lily's Kitchen and Terra Canis combined increased by close to 40%. I divided the population in the datasets into 4 distinct categories (types) and evaluated them against each other. no_info_data contains the BOGO and discount offers, and info_data contains informational offers only. Now, from the above table, if we look at the completed/viewed and viewed/received columns in 'no_info_data' and at the viewed/received column in 'info_data', we can estimate the threshold value to use. no_info_data: completed/viewed has a mean of 0.74 and 1.5 is the 90th . A sneak peek of the final data after being cleaned and analyzed: the data contains information about 8 offers sent to 14,825 customers who made 26,226 transactions while completing at least one offer. Revenue of $8.7 billion and adjusted . Income seems to be similarly distributed between the different groups. I wonder if this skews results towards a certain demographic. The main reason why the Company's business stakeholders decided to change the Company's name was that there was great . Most of the offers, as we see, were delivered via email and the mobile app.
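The completed/viewed and viewed/received ratios discussed above could be derived per customer roughly as follows. This is a sketch under assumptions: the function name `offer_ratios` and the sample events are hypothetical, and the exact definition used for the original table is not given in the text.

```python
from collections import Counter, defaultdict


def offer_ratios(events):
    """Per person, compute viewed/received and completed/viewed from event counts."""
    per_person = defaultdict(Counter)
    for event in events:
        per_person[event["person"]][event["event"]] += 1

    ratios = {}
    for person, counts in per_person.items():
        received = counts["offer received"]
        viewed = counts["offer viewed"]
        completed = counts["offer completed"]
        ratios[person] = {
            # None marks an undefined ratio (nothing received or nothing viewed).
            "viewed/received": viewed / received if received else None,
            "completed/viewed": completed / viewed if viewed else None,
        }
    return ratios


# Hypothetical customer: 4 offers received, 2 viewed, 3 completed.
events = (
    [{"person": "p1", "event": "offer received"}] * 4
    + [{"person": "p1", "event": "offer viewed"}] * 2
    + [{"person": "p1", "event": "offer completed"}] * 3
)

ratios = offer_ratios(events)
print(ratios["p1"])  # {'viewed/received': 0.5, 'completed/viewed': 1.5}
```

A completed/viewed ratio above 1 means the customer completed offers without viewing them, which is the kind of case a threshold on these columns is meant to flag.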
For example, the blue sector, which is the offer whose id ends with 1d7, is significantly larger (~17%) than the normal distribution. From the explanation provided by Starbucks, we can segment the population into 4 types of people. We will focus on each of the groups individually. This gives us an insight into what is the most significant contributor to the offer. One was because I believed BOGO and discount offers had a different business logic from the informational offer/advertisement. The following figure summarizes the different events in the event column. However, for information-type offers, we need to take into account the offer validity. That's why we have the same number of null values in the gender and income columns, and the corresponding age column has 118 as the age. One important step before modeling was to get the label right. If you're not familiar with the concept.
age: (numeric) missing value encoded as 118
reward: (numeric) money awarded for the amount spent
channels: (list) web, email, mobile, social
difficulty: (numeric) money required to be spent to receive a reward
duration: (numeric) time for the offer to be open, in days
offer_type: (string) BOGO, discount, informational
event: (string) offer received, offer viewed, transaction, offer completed
value: (dictionary) different values depending on event type
offer id: (string/hash) not associated with any transaction
amount: (numeric) money spent in transaction
reward: (numeric) money gained from offer completed
time: (numeric) hours after the start of the test
Originally published on Towards AI, the World's Leading AI and Technology News and Media Company. Similarly, we merge the portfolio dataset as well.
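Since `duration` is given in days while `time` is given in hours, checking the validity of an informational offer needs a unit conversion. The sketch below shows one way to express that check. The function name `influenced` and the rule itself (a transaction counts only if it happens after the offer was viewed and before it expires) are assumptions for illustration, not the rule confirmed by the original analysis.

```python
HOURS_PER_DAY = 24


def influenced(received_time, viewed_time, duration_days, transaction_time):
    """True if a transaction happens after the view and before the offer expires.

    All times are in hours since the start of the test; duration is in days.
    """
    expiry = received_time + duration_days * HOURS_PER_DAY
    return received_time <= viewed_time <= transaction_time <= expiry


# Hypothetical offer received at hour 0 with a 3-day duration (expires at hour 72).
print(influenced(0, 6, 3, 30))   # True: viewed at 6, purchase at 30
print(influenced(0, 6, 3, 80))   # False: purchase after expiry
print(influenced(0, 40, 3, 30))  # False: purchase before the offer was viewed
```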
Unbeknown to many, Starbucks has invested significantly in big data and analytics capabilities in order to determine the potential success of its stores and products, and grow sales. For future studies, there is still a lot that can be done. If an offer is really hard, level 20, a customer is much less likely to work towards it. We see that PC0 is significant. Starbucks Sales Analysis Part 1 was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story. It also appears that there are not just one or two significant factors. In our Data Analysis, we answered the three questions that we set out to explore with the Starbucks Transactions dataset. The model has a lot of potential to be improved further by tuning more parameters or trying out tree models such as XGBoost. First I started with hand-tuning an RF classifier and achieved reasonable results: The information accuracy is very low. We can see the expected trend in age and income vs expenditure. Get an idea of the demographics, income etc. Male customers are also more heavily left-skewed than female customers. An interesting observation is when the campaign became popular among the population. A list of Starbucks locations, scraped from the web in 2017. chrismeller.github.com-starbucks-2.1.1. The offer whose id ends with 2a4 was also 45% larger than the normal distribution. To redeem the offers one has to spend 0, 5, 7, 10, or 20 dollars. From the transaction data, let's try to find out how gender, age, and income relate to the average transaction amount.
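Relating a demographic attribute to the average transaction amount is a join followed by a group-by. A minimal sketch with plain dictionaries is shown below. The function name `mean_amount_by`, the field names and the toy records are assumptions for illustration, and customers with a missing attribute are simply skipped.

```python
from collections import defaultdict


def mean_amount_by(profiles, transactions, key):
    """Average transaction amount per value of a profile attribute such as gender."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tx in transactions:
        profile = profiles.get(tx["person"])
        if profile is None or profile.get(key) is None:
            continue  # skip customers with no profile or a missing attribute
        group = profile[key]
        totals[group] += tx["amount"]
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Hypothetical profiles and transactions.
profiles = {
    "p1": {"gender": "F"},
    "p2": {"gender": "M"},
    "p3": {"gender": None},  # missing demographic data
}
transactions = [
    {"person": "p1", "amount": 20.0},
    {"person": "p1", "amount": 10.0},
    {"person": "p2", "amount": 9.0},
    {"person": "p2", "amount": 6.0},
    {"person": "p2", "amount": 3.0},
    {"person": "p3", "amount": 5.0},
]

averages = mean_amount_by(profiles, transactions, "gender")
print(averages)  # {'F': 15.0, 'M': 6.0}
```

The same function works for any categorical attribute; age and income would first need to be bucketed into ranges.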
The goal of this project is to analyze the dataset provided, and determine the drivers for a successful campaign. Our dataset is slightly imbalanced with. Snapshot of original profile dataset. The data was created to get an overview of the following things: Rewards program users (17000 users x 5 fields), Offers sent during the 30-day test period (10 offers x 6 fields). Data visualization: Visualization of the data is an important part of the whole data analysis process, and here, along with seaborn, we will also be discussing the Plotly library. The assumption being that this may slightly improve the models. Information: For the information type we get a significant drift from what we had with BOGO and discount type offers. Also, since the campaign is set up so that there is no correlation between sending out offers to individuals and the type of offers they receive, we benefit from this separation, and hopefully the ML models do too. Therefore, I want to treat the list of items as 1 thing. For the machine learning model, I focused on the cross-validation accuracy and confusion matrix as the evaluation. Therefore, I did not analyze the information offer type. Sales insights: The Walmart dataset is real-world data, and from it one can learn about sales forecasting and analysis. While men tend to have more purchases, women tend to make more expensive purchases.
As a part of Udacity's Data Science nano-degree program, I was fortunate enough to have a look at Starbucks sales data. However, age got a higher rank than I had thought. After balancing the dataset, the cross-validation accuracy of the best model increased to 74%, and the precision score stayed at 75%.