{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 1: PROBLEM STATEMENT AND PROJECT OVERVIEW\n", "\n", "* For investors to properly manage their portfolios, they need to visualize datasets, find useful patterns, and gain valuable insight such as stocks daily returns and risks.\n", "\n", "* In this project, we will use the power of Python to perform stock data analysis and stock return calculation." ] }, { "cell_type": "markdown", "metadata": { "id": "zAlupUSit5fK" }, "source": [ "# 2: IMPORT DATASETS AND LIBRARIES" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "_CUspuoRt25m" }, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "import plotly.express as px\n", "import plotly.figure_factory as ff\n", "import plotly.graph_objects as go\n", "import os" ] }, { "cell_type": "markdown", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 415 }, "id": "G7-KNz6Dt9JS", "outputId": "b030a7c4-02b6-4aa7-f2d7-8144bb0a200a" }, "source": [ "### Read the stock data csv file, here's the list of the stocks considered:\n", "\n", "* AAPL = Apple \n", "* BA = Boeing \n", "* T = AT&T\n", "* MGM = MGM Resorts International (Hotel Industry)\n", "* AMZN = Amazon\n", "* IBM = IBM\n", "* TSLA = Tesla Motors\n", "* GOOG = Google \n", "* sp500 = US Stock Market Index(S&P 500 is a stock market index that measures the stock performance of 500 large companies listed on U.S. stock exchange)\n", "> Check the list of S&P 500 companies here: https://en.wikipedia.org/wiki/List_of_S%26P_500_companies\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#change directory\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Read data\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 415 }, "id": "QTg_POeCuUeE", "outputId": "46801634-de59-42c7-ae96-6fee2e9a5ad5" }, "outputs": [], "source": [ "# Sort the stock data by date\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 35 }, "id": "77rPtCs85NgU", "outputId": "c6055315-03dc-4720-e73a-554e0d1cc780" }, "outputs": [], "source": [ "# Print out the number of stocks\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 191 }, "id": "8wBR2CHA5O87", "outputId": "b873cb3b-fde8-4ab9-a438-de7213aeb81a" }, "outputs": [], "source": [ "# Print the name of stocks\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# list of stock names and count of stocks\n" ] }, { "cell_type": "markdown", "metadata": { "id": "ubohn_-54V0Y" }, "source": [ "**MINI CHALLENGE #1:** \n", "- **What is the average return of the S&P500?**\n", "- **Which stock or index has the minimum dispersion from the mean in dollar value?**\n", "- **What is the maximum price for AMZN stock over the specified time period?** " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "id": "qmba17fwuk83" }, "source": [ "# 3: PERFORM EXPLORATORY DATA ANALYSIS" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 208 }, "id": "igiwssGruHHH", "outputId": "412220ae-9e7a-43cf-b3ed-895caffe4946" }, "outputs": [], "source": [ "# Check if data contains any null values\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 312 }, "id": "0NCbEVYJuqND", "outputId": "7e02b336-f36a-435d-a3a2-aa5be89a0791" }, "outputs": [], "source": [ "# Getting dataframe info\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "_sYkHjZJuqTg" }, "outputs": [], "source": [ "# Define a function to plot the entire dataframe\n", "# The function takes in a dataframe df as an input argument and does not return anything back!\n", "# The function performs data visualization\n", "# Pandas works great with matplotlib, you can simply plot data directly from a Pandas DataFrame using plot() method\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 458 }, "id": "HBLjDdvS81xD", "outputId": "326770f4-d843-495c-d27a-1bc72a0375ff" }, "outputs": [], "source": [ "# Plot the data\n" ] }, { "cell_type": "markdown", "metadata": { "id": "yt7FZfV19_M5" }, "source": [ "**MINI CHALLENGE #2:**\n", "- **Print out normalized (scaled) stock prices** \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "id": "8L9gXztQ92TC" }, "source": [ "# 4: PERFORM INTERACTIVE DATA VISUALIZATION" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "yUcMVHQQuqQZ" }, "outputs": [], "source": [ "# Function to perform an interactive data plotting using plotly express\n", "# Plotly.express module which is imported as px includes functions that can plot interactive plots easily and effectively. \n", "# Every Plotly Express function uses graph objects internally and returns a plotly.graph_objects.Figure instance. \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 542 }, "id": "x3tsVq0Juz_b", "outputId": "35b4fcf8-d897-4e84-9f10-e7e1e25598c9" }, "outputs": [], "source": [ "# Plot interactive chart\n" ] }, { "cell_type": "markdown", "metadata": { "id": "eEst3QL0AGNc" }, "source": [ "**MINI CHALLENGE #3:**\n", "- **Plot normalized stock data in an interactive way**\n", "- **It seems that most stocks experienced massive drops in 2020, let's assume that you own 100 shares of the S&P500 and you bought them on Feb 19th, 2020. How much did you lose (in $) by March 23rd, 2020?**\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "id": "sy2lCzGi0Q9K" }, "source": [ "# 5: CALCULATE INDIVIDUAL STOCKS DAILY RETURNS" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 225 }, "id": "eAqHOkFREV-0", "outputId": "733bef8c-a598-4570-d03a-dcf34ab1a60b" }, "outputs": [], "source": [ "# Let's calculate daily return for a single security\n", "# Let's take the S&P500 as an example first\n" ] }, { "cell_type": "markdown", "metadata": { "id": "ZmweQxu1F9b9" }, "source": [ "**MINI CHALLENGE #4:**\n", "- **Calculate the daily return for Amazon stock.**\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "id": "E_Ldvu42F0yI" }, "source": [ "# 6: CALCULATE MULTIPLE STOCKS DAILY RETURNS" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "BAYcdk5_J1X6" }, "outputs": [], "source": [ "# Let's define a function to calculate stocks daily returns (for all stocks) \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "gFxXpwVfWPaC" }, "outputs": [], "source": [ "# Get the daily returns \n" ] }, { "cell_type": "markdown", "metadata": { "id": "k-OTQRikGnXK" }, "source": [ "**MINI CHALLENGE #5:**\n", "- **Plot the returns vs. time using both static and interactive plots.**\n", "- **What is the maximum daily return in % values?**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "id": "O0fakkYYINwa" }, "source": [ "# 7: CALCULATE THE CORRELATIONS BETWEEN DAILY RETURNS " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "yO74euRqT5l7" }, "outputs": [], "source": [ "# Daily Return Correlation\n" ] }, { "cell_type": "markdown", "metadata": { "id": "5BODhJ55Jxde" }, "source": [ "**MINI CHALLENGE #6:**\n", "- **What are the top 2 stocks that are positively correlated with the S&P500?**\n", "- **What is the correlation between Amazon and Boeing? Comment on your answer.**\n", "- **What is the correlation between MGM and Boeing? Comment on your answer.**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "id": "6o_TJSknK0nT" }, "source": [ "# 8: PLOT THE HISTOGRAM FOR DAILY RETURNS" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 607 }, "id": "2XX_aVAXPOod", "outputId": "8270ef3d-6dc1-4285-c179-60111316d89d" }, "outputs": [], "source": [ "# Histogram of daily returns\n", "# Stock returns are normally distributed with zero mean \n", "# Notice how Tesla Standard deviation is high indicating a more volatile stock\n" ] }, { "cell_type": "markdown", "metadata": { "id": "BdLC2x_aLGOU" }, "source": [ "**MINI CHALLENGE #7:**\n", "- **Based on the histogram, which of the following stocks are more risky? T or TSLA?**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 329 }, "id": "C_HRCdtiQNR6", "outputId": "b310da07-421b-46b5-c3f0-ddae4b38e1c3" }, "outputs": [], "source": [ "# Group all data returns together in a list\n", "# Make a copy of the daily returns dataframe\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plotly's Python API contains a super pwerful module known as figure factory module \n", "# Figure factory module includes wrapper functions that create unique chart types such as interactive subplots \n", "# Check this out for more information: https://plotly.com/python/figure-factory-subplots/\n", "# Notice how Boeing, Tesla and MGM gains are losses are sometimes extreme! This will indicate a more risky investment and can be quantified later using Sharpe Ratio\n" ] }, { "cell_type": "markdown", "metadata": { "id": "KPq0NpLQ8HxL" }, "source": [ "### EXCELLENT JOB!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Disclaimer:**\n", "\n", "This is by no means is investment advice and we are simply trying to do some data analysis and that's all." ] } ], "metadata": { "colab": { "collapsed_sections": [], "name": "Finance 101 - Stocks Data Analysis and Visualization - Skeleton.ipynb", "provenance": [] }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" } }, "nbformat": 4, "nbformat_minor": 1 }