Business ideas in Augusta, Georgia using Foursquare API & IBM Watson Python analysis.
An application of open-source neighborhood, census, and crime data, analyzed with machine learning in Python and the Foursquare API. The analysis highlights neighborhoods in Augusta, Georgia where business opportunities remain untapped.
Introduction
Augusta National Golf Club hosts the Masters Tournament every year in early April. Those who qualify to play know that winning it is a prestigious honor coveted by golf enthusiasts.
The ever-growing popularity of this event presents unique opportunities for new businesses: fans visiting from all around the world, city subsidies that promote the event, and commercial sponsors competing to have their products and voices heard.
In this neighborhood analysis, we look at the neighborhoods surrounding the golf course. We first determine the safest parts of the city (low crime), then apply Foursquare data (business opportunities) to build a picture of where, and what kind of, businesses can best be opened in a given area.
Georgia Economic Statistics
Georgia’s population density has been flat over the past decade, despite annual GDP growth rates close to double the national rates of 2.4% and 2.9% in the last two years. Per capita personal income in Georgia has risen but remains below the national average, and in particular below last year’s national figure of US$53,820.
The combination of rising GDP and lower personal incomes creates an opening for outside investors to seek out opportunities in a state that has growth potential and lower wages.
Resources
- IBM Python Notebook – Run code and analysis.
- AWS CloudFormation – Web design and presentation.
- GitHub – Python code version control.
Method
Georgia, USA
Part 1
- Dataset containing neighborhoods surrounding Augusta National Golf Club. Through analysis, we will be able to point out which of these neighborhoods present the best business opportunities.
Locations of Interest – Dataset
[table id=1 /]
Locations of Interest – Map
- Click on the white-circled blue pins on the map below for an interactive display, and right-click the map to refresh.
Neighborhoods Surrounding Golf Course
- We acquired neighborhood, census, and crime data from the ArcGIS Neighborhood Dataset, the ArcGIS Census Dataset, and the ArcGIS Crime Dataset.
Crime Data – Sample Dataset
[table id=2 /]
- The crime data did not have a neighborhood column (as shown above). We therefore created pseudo-coordinates (seen below) for the known neighborhoods and added a Neighborhood column to the crime data using our “nearest neighborhood” algorithm.
Nearest Neighborhood – Dataset
[table id=3 /]
- Linking the neighborhoods to their corresponding crime counts allowed us to analyze how close to the golf course the crimes occurred.
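The “nearest neighborhood” step can be sketched roughly as follows. This is a minimal illustration, assuming each neighborhood is represented by a single pseudo-coordinate and that "nearest" means the smallest great-circle (haversine) distance; the notebook may measure distance differently. The neighborhood names and coordinates here are hypothetical placeholders, not values from the project's dataset.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def nearest_neighborhood(lat, lon, centroids):
    """Return the name of the pseudo-coordinate closest to (lat, lon)."""
    return min(
        centroids,
        key=lambda name: haversine_miles(lat, lon, *centroids[name]),
    )


# Hypothetical pseudo-coordinates (assumption, not project data).
centroids = {
    "Neighborhood A": (33.50, -82.02),
    "Neighborhood B": (33.45, -81.98),
}

# Hypothetical crime records as (latitude, longitude) pairs.
crimes = [(33.49, -82.01), (33.46, -81.99)]

# One neighborhood label per crime record, i.e. the new Neighborhood column.
labels = [nearest_neighborhood(lat, lon, centroids) for lat, lon in crimes]
print(labels)
```

The same `haversine_miles` helper could also be pointed at the golf course's coordinates to measure how far each crime occurred from it.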
Part 2
- With data showing which neighborhoods were the safest to do business in, we used the Foursquare API to determine which businesses were most frequent in those neighborhoods.
- Because of API call limits, the dataset first had to be narrowed, focusing only on the following crimes: Robbery, Theft, Vehicle Theft, Burglary, Criminal Damage, Carjacking, Fraud, and Shoplifting.
- This list of crimes, used to filter the neighborhoods, was chosen based on which crimes were most likely to affect a new business.
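The narrowing step could look something like the sketch below. It assumes each crime record carries a `category` and a `neighborhood` field; those field names, the sample records, and the extra "Assault" category are hypothetical and only serve to show the filter.

```python
from collections import Counter

# Crime categories named in the text as most likely to affect a new business.
BUSINESS_CRIMES = {
    "Robbery", "Theft", "Vehicle Theft", "Burglary",
    "Criminal Damage", "Carjacking", "Fraud", "Shoplifting",
}


def filter_business_crimes(records, categories=BUSINESS_CRIMES):
    """Keep only the records whose category is in the chosen set."""
    return [r for r in records if r["category"] in categories]


def count_by_neighborhood(records):
    """Count how many records fall in each neighborhood."""
    return Counter(r["neighborhood"] for r in records)


# Hypothetical records (assumption, not project data).
records = [
    {"neighborhood": "Neighborhood A", "category": "Burglary"},
    {"neighborhood": "Neighborhood A", "category": "Assault"},
    {"neighborhood": "Neighborhood B", "category": "Fraud"},
    {"neighborhood": "Neighborhood A", "category": "Theft"},
]

filtered = filter_business_crimes(records)
counts = count_by_neighborhood(filtered)
print(counts.most_common())
```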
Part 3
- We created a blog using AWS CloudFormation. The blog does not have its own registered domain and is currently listed as http://wp.projecthamburg.com/wordpress/ . It runs on an EC2 instance using an Amazon Linux 2 AMI.
- Maps and spreadsheets were created using IBM Notebooks, and the WordPress TablePress plugin was used to present the tables on this page.
- Code is available on GitHub at https://github.com/bobbymcclane/Capstone-Project-Notebook-2019/blob/master/Capstone%20Project-Neighbourhood%20Analysis.ipynb
Results
- Sorting the crime count dataset below shows that Albion Acres recorded the highest number of crimes in a year, with 8,062, and Bath-Edie recorded the fewest, with 23.
Neighborhood Crime Count – Dataset
[table id=4 /]
Crime Count – Map
Augusta Crime Count
- The majority of crimes occurred between 5 and 10 miles from the golf course.
- We obtained the crime count for each crime category and determined which crimes were most important to the analysis.
Crime Category Count – Dataset
[table id=5 /]
- The tournament happens once a year in the first week of April.
- April falls near the median of the monthly crime counts for the year.
Crimes by Month
[table id=6 /]
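One way to check where a month sits relative to the rest of the year is to count crimes per month and compare against the median monthly count. The sketch below assumes each crime record has a date; the dates are synthetic (month m simply gets m records, all in a hypothetical year) and do not reflect the project's data.

```python
from collections import Counter
from datetime import date
from statistics import median


def monthly_counts(dates):
    """Return a dict mapping month number (1-12) to the number of records."""
    counts = Counter(d.month for d in dates)
    return {month: counts.get(month, 0) for month in range(1, 13)}


def deviation_from_median(counts, month):
    """How far a month's count sits above (+) or below (-) the median month."""
    return counts[month] - median(counts.values())


# Synthetic dates (assumption): month m has exactly m records.
dates = [date(2018, m, 1) for m in range(1, 13) for _ in range(m)]

counts = monthly_counts(dates)
april_gap = deviation_from_median(counts, 4)
print(counts[4], april_gap)
```

A gap close to zero would support the statement that April is a typical month for crime.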
- New crime count results obtained from the neighborhoods that reported the following crimes: Robbery, Theft, Vehicle Theft, Burglary, Criminal Damage, Carjacking, Fraud, and Shoplifting.
New crime count – Dataset
[table id=7 /]
- Foursquare API results showing popular businesses in particular neighborhoods.
Neighborhoods and corresponding Venues – Sample Dataset
[table id=9 /]
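The post does not say which Foursquare endpoint was called. As a hedged sketch, the snippet below builds a request URL for the legacy v2 `venues/explore` endpoint, which accepted a latitude/longitude, radius, and result limit along with client credentials; Foursquare has since introduced newer API versions, so this endpoint may no longer be available. The coordinates, credentials, version date, radius, and limit are placeholder assumptions, and no network request is made.

```python
from urllib.parse import urlencode

# Legacy Foursquare v2 endpoint (assumed; not confirmed by the post).
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lon, client_id, client_secret,
                      version="20190101", radius=500, limit=100):
    """Build a venues/explore URL for one neighborhood coordinate."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,          # API version date, YYYYMMDD
        "ll": f"{lat},{lon}",  # latitude,longitude
        "radius": radius,      # search radius in meters
        "limit": limit,        # maximum number of venues returned
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


# Placeholder coordinate and credentials (assumptions).
url = build_explore_url(33.50, -82.02, "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
print(url)
```

Issuing one such request per neighborhood is what makes the API call limit a practical constraint, hence the narrowed list of neighborhoods.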
- Foursquare API results showing top 10 most common businesses in particular neighborhoods.
Neighborhoods and corresponding top 10 most common businesses – Dataset
[table id=10 /]
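Ranking the most common businesses per neighborhood amounts to counting venue categories within each neighborhood and keeping the top entries. A minimal sketch, assuming the venue results have been flattened into (neighborhood, category) pairs; the sample pairs are hypothetical.

```python
from collections import Counter, defaultdict


def top_categories(venues, n=10):
    """Map each neighborhood to its n most frequent venue categories."""
    by_neighborhood = defaultdict(Counter)
    for neighborhood, category in venues:
        by_neighborhood[neighborhood][category] += 1
    return {
        neighborhood: [cat for cat, _ in counts.most_common(n)]
        for neighborhood, counts in by_neighborhood.items()
    }


# Hypothetical (neighborhood, venue category) pairs (assumption).
venues = [
    ("Neighborhood A", "Coffee Shop"),
    ("Neighborhood A", "Pizza Place"),
    ("Neighborhood A", "Coffee Shop"),
    ("Neighborhood B", "Golf Course"),
]

top = top_categories(venues, n=2)
print(top)
```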
- Results of K-Means clustering, which grouped the locations into 5 clusters.
Neighborhoods and corresponding cluster labels – Dataset
[table id=11 /]
Neighborhoods and corresponding cluster labels – Map
Neighborhoods and Cluster Labels
Discussion
In part 1 of this project, we gathered data showing which neighborhoods surrounding the golf course had the most and the least crime. With that in mind, we can compare this information against the Foursquare API analysis.
In part 2, we categorized neighborhoods by listing the top 10 businesses in each area using Foursquare API calls. By applying the K-Means clustering machine learning algorithm, we grouped neighborhoods into clusters based on the differences in their top 10 businesses, as shown in the neighborhoods and corresponding cluster labels map above. This creates a unique opportunity for investors and business owners interested in Augusta: by selecting a neighborhood, one can see which business sectors are lacking or oversaturated.
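To make the clustering step concrete, here is a from-scratch sketch of K-Means (Lloyd's algorithm). It is not the notebook's implementation; a library such as scikit-learn would be the more usual choice. It assumes each neighborhood is described by a vector of venue-category frequencies. The feature values are hypothetical, and k is set to 2 only because the sample is tiny, whereas the analysis above used 5 clusters.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def kmeans(points, k, iterations=100, seed=0):
    """Cluster points into k groups; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = mean_point(members)
    return labels, centers


# Hypothetical frequencies: [share of restaurants, share of retail].
features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]

labels, centers = kmeans(features, k=2)
print(labels)
```

Neighborhoods that end up with the same label have similar business mixes, so a category that is common across a cluster but absent from one of its neighborhoods is a candidate gap.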
Conclusion
The US economy remains the most dominant economy in the world, which makes the US a relatively safe place to invest. The city of Augusta takes pride in hosting the national golf tournament every April, and investors and business owners can expect a business-friendly state and city willing to accommodate them. If you are a business looking to expand or an investor looking to start one, it is worth using this analysis as a starting point to see whether Augusta is the right place for you.
The template and Python code for this capstone project can also be used for analysis in finance, DNA, stock market, and healthcare-related projects, to name a few. We welcome any questions and feedback below.