mpg/eda.ipynb

{
"cells": [
{
"cell_type": "markdown",
"id": "807a5642-e420-4208-9ccb-bcc442617ad0",
"metadata": {},
"source": [
"[Cleaning](clean.ipynb)"
]
},
{
"cell_type": "markdown",
"id": "04ed2aa6-7b64-4c2d-b007-594e31ecdae8",
"metadata": {},
"source": [
"# EDA"
]
},
{
"cell_type": "markdown",
"id": "682a6d42-8ce0-42fb-9892-b6a46beb0b9b",
"metadata": {},
"source": [
"Import and define some functions"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "5ffa8b01-0b17-4ad8-8e85-f2656da50c9e",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:44.692975Z",
"iopub.status.busy": "2022-08-01T14:48:44.692461Z",
"iopub.status.idle": "2022-08-01T14:48:46.367116Z",
"shell.execute_reply": "2022-08-01T14:48:46.366301Z",
"shell.execute_reply.started": "2022-08-01T14:48:44.692899Z"
},
"tags": []
},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"from os.path import exists\n",
"import matplotlib.pyplot as plt\n",
"from IPython.display import display, Markdown\n",
"\n",
"sns.set_theme(style='darkgrid')\n",
"\n",
"df = pd.read_csv('data/clean.csv')\n",
"y = df.mpg\n",
"\n",
"def show_plots(filenames):\n",
"    # Render the saved plots two per row as inline Markdown images\n",
"    for j in range(0, len(filenames), 2):\n",
"        if (len(filenames) - j) > 1:\n",
"            display(Markdown(f'![]({filenames[j]})![]({filenames[j+1]})'))\n",
"        else:\n",
"            display(Markdown(f'![]({filenames[j]})'))\n",
"\n",
"def make_plots(df, y):\n",
"    # Save one regression jointplot per column (skipping files that already exist), then show them\n",
"    filenames = []\n",
"\n",
"    for col in df.columns:\n",
"        filename = 'img/%s_joint.png' % col\n",
"        filenames.append(filename)\n",
"        if not exists(filename):\n",
"            sns.jointplot(x=df[col], y=y, kind='reg',\n",
"                          joint_kws={'scatter_kws': dict(alpha=0.3)})\n",
"            plt.suptitle(f'{col} vs mpg')\n",
"            plt.subplots_adjust(top=.93)\n",
"            plt.savefig(filename, facecolor='white', transparent=False)\n",
"            plt.close()\n",
"\n",
"    show_plots(filenames)"
]
},
{
"cell_type": "markdown",
"id": "416a9d7e-e2ad-41f0-a674-d13c01f41896",
"metadata": {},
"source": [
"## A bit on engines:\n",
"\n",
"* At its most basic, an engine is an air pump\n",
"* Horsepower = (Torque * RPM) / 5252, with torque in lb-ft\n",
"* The torque peak is roughly where an engine moves air most efficiently; this is applied science in action (fluid dynamics, resonance)\n",
"* Operating above or below the torque peak reduces efficiency, and efficiency == fuel economy\n",
"* Torque peaks normally occur below 5252 rpm and horsepower peaks above it, so long as the engine can actually rev that high. On a dyno sheet (torque and horsepower vs rpm, plotted on the same scale) you'll see the torque and horsepower lines cross at 5252 rpm\n",
"* As an engine spins faster, power output increases until combustion becomes so inefficient, and torque drops off so much, that spinning faster produces no more power (if the engine holds together that long)\n",
"\n",
"Basically, an engine that makes lots of power at high rpm but relatively little low-end torque (such as a Mazda rotary) is going to have poor fuel economy because it spends most of its time outside its efficient range. In contrast, diesel engines typically run at lower rpm and make plenty of torque down low. So not only do they start off making more torque, they are also less likely to stray far from the torque peak. This is also why horsepower numbers on a diesel look low: they can't rev as high. There's more to it, but this should be enough to provide context"
]
},
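{
"cell_type": "markdown",
"id": "hp-torque-5252-sketch",
"metadata": {},
"source": [
"A quick sketch of where the 5252 comes from, assuming torque $T$ in lb-ft, engine speed $N$ in rpm, and mechanical horsepower defined as 550 ft-lbf/s (the symbols are introduced here for illustration only):\n",
"\n",
"$$P_{hp} = \\frac{T \\cdot \\omega}{550}, \\qquad \\omega = \\frac{2\\pi N}{60}$$\n",
"\n",
"$$P_{hp} = \\frac{2\\pi}{60 \\cdot 550}\\, T N = \\frac{T N}{33000 / 2\\pi} \\approx \\frac{T N}{5252}$$\n",
"\n",
"Setting $P_{hp} = T$ gives $N \\approx 5252$, which is why the two curves cross at that rpm when both are drawn on the same numeric scale."
]
},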
{
"cell_type": "markdown",
"id": "7af7dcdd-9618-4e81-88c8-d2c2cde0fdc2",
"metadata": {},
"source": [
"So I'm only interested in a few things:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "3e633f5f-8a7f-4776-a855-f22fcb87e88d",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:46.370085Z",
"iopub.status.busy": "2022-08-01T14:48:46.369414Z",
"iopub.status.idle": "2022-08-01T14:48:48.721115Z",
"shell.execute_reply": "2022-08-01T14:48:48.720341Z",
"shell.execute_reply.started": "2022-08-01T14:48:46.370055Z"
},
"tags": []
},
"outputs": [
{
"data": {
"text/markdown": [
"![](img/cylinders_joint.png)![](img/displacement_joint.png)"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
"![](img/horsepower_joint.png)![](img/weight_joint.png)"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"make_plots(df[['cylinders','displacement',\n",
" 'horsepower','weight',]],y)"
]
},
{
"cell_type": "markdown",
"id": "b0f65dd4-16b6-4222-8758-71e2ecac473e",
"metadata": {},
"source": [
"As the number of cylinders, displacement, horsepower, or weight increases, MPG goes down. There are some outliers; we'll get to those in a minute"
]
},
{
"cell_type": "markdown",
"id": "61b1b79e-46c2-4e7b-b565-84d1e2045777",
"metadata": {},
"source": [
"There are some other things I'd like to see:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "7342da99-d04a-4f4f-ad3c-06840144ec48",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:48.722950Z",
"iopub.status.busy": "2022-08-01T14:48:48.722336Z",
"iopub.status.idle": "2022-08-01T14:48:48.731736Z",
"shell.execute_reply": "2022-08-01T14:48:48.730917Z",
"shell.execute_reply.started": "2022-08-01T14:48:48.722919Z"
},
"tags": []
},
"outputs": [],
"source": [
"new_features = pd.DataFrame()\n",
"new_features['efficiency'] = df.horsepower / df.displacement\n",
"new_features['load'] = df.displacement / df.weight\n",
"new_features['bore_size'] = df.displacement / df.cylinders\n",
"new_features['grunt'] = (df.horsepower / new_features.bore_size) * new_features.efficiency"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "9fa0bf3e-d45b-4698-afac-e549db0de148",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:48.733546Z",
"iopub.status.busy": "2022-08-01T14:48:48.732929Z",
"iopub.status.idle": "2022-08-01T14:48:51.073977Z",
"shell.execute_reply": "2022-08-01T14:48:51.073212Z",
"shell.execute_reply.started": "2022-08-01T14:48:48.733511Z"
},
"tags": []
},
"outputs": [
{
"data": {
"text/markdown": [
"![](img/efficiency_joint.png)![](img/load_joint.png)"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
"![](img/bore_size_joint.png)![](img/grunt_joint.png)"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"make_plots(new_features,y)"
]
},
{
"cell_type": "markdown",
"id": "5cbe16d7-24ef-4ceb-acd1-0dcecfdc96c2",
"metadata": {},
"source": [
"* Efficiency (HP per cubic inch) is a rough measure of engine tech/efficiency; as it increases, so does MPG\n",
"* Load (displacement relative to vehicle weight) is a metric of how hard the engine has to work compared to its size. Engines that work hard use more fuel, and a small engine working really hard can use more fuel than a big engine that's not doing much\n",
"* Bore_size (displacement per cylinder) is an attempt to approximate cylinder bore diameter, which gives some insight into the torque curve\n",
"* Grunt is an attempt to describe the power curve of an engine, or more specifically the presence/absence of low-rpm torque output"
]
},
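{
"cell_type": "markdown",
"id": "grunt-expansion-sketch",
"metadata": {},
"source": [
"To make `grunt` a bit more concrete, here is the definition from the code above expanded in terms of the raw columns. This is only a restatement of the arithmetic in this notebook, not an established engineering metric:\n",
"\n",
"$$\\text{grunt} = \\frac{\\text{hp}}{\\text{displacement} / \\text{cylinders}} \\cdot \\frac{\\text{hp}}{\\text{displacement}} = \\frac{\\text{hp}^2 \\cdot \\text{cylinders}}{\\text{displacement}^2}$$\n",
"\n",
"For example, the Mazda RX2 row in this dataset (97 hp, 3 cylinders, 70 cu in) gives $97^2 \\cdot 3 / 70^2 \\approx 5.76$."
]
},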
{
"cell_type": "markdown",
"id": "dd05abcd-9ac9-4821-b575-ffbf8544db3c",
"metadata": {},
"source": [
"Merge the new features with the original data"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "89cea145-4b6e-457b-9970-578144c1c364",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:51.075722Z",
"iopub.status.busy": "2022-08-01T14:48:51.075128Z",
"iopub.status.idle": "2022-08-01T14:48:51.082116Z",
"shell.execute_reply": "2022-08-01T14:48:51.081046Z",
"shell.execute_reply.started": "2022-08-01T14:48:51.075693Z"
},
"tags": []
},
"outputs": [],
"source": [
"merged = df.join(new_features)\n",
"del new_features\n",
"del df"
]
},
{
"cell_type": "markdown",
"id": "d39b59e4-e596-4fc9-b886-1e6d314f597e",
"metadata": {},
"source": [
"# What's all that on the edges?\n",
"<hr>"
]
},
{
"cell_type": "markdown",
"id": "fe7ee071-8aa4-4a8d-9e8e-480f3b9da9da",
"metadata": {},
"source": [
"## Rotaries"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "dbbfdab6-1cca-4329-a2ae-9258678ab0b1",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:51.084010Z",
"iopub.status.busy": "2022-08-01T14:48:51.083431Z",
"iopub.status.idle": "2022-08-01T14:48:51.116081Z",
"shell.execute_reply": "2022-08-01T14:48:51.115133Z",
"shell.execute_reply.started": "2022-08-01T14:48:51.083974Z"
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>mpg</th>\n",
" <th>cylinders</th>\n",
" <th>displacement</th>\n",
" <th>horsepower</th>\n",
" <th>weight</th>\n",
" <th>acceleration</th>\n",
" <th>model_year</th>\n",
" <th>origin</th>\n",
" <th>car_name</th>\n",
" <th>efficiency</th>\n",
" <th>load</th>\n",
" <th>bore_size</th>\n",
" <th>grunt</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>71</th>\n",
" <td>19.0</td>\n",
" <td>3</td>\n",
" <td>70.0</td>\n",
" <td>97.0</td>\n",
" <td>2330.0</td>\n",
" <td>13.5</td>\n",
" <td>72</td>\n",
" <td>3</td>\n",
" <td>mazda rx2 coupe</td>\n",
" <td>1.385714</td>\n",
" <td>0.030043</td>\n",
" <td>23.333333</td>\n",
" <td>5.760612</td>\n",
" </tr>\n",
" <tr>\n",
" <th>111</th>\n",
" <td>18.0</td>\n",
" <td>3</td>\n",
" <td>70.0</td>\n",
" <td>90.0</td>\n",
" <td>2124.0</td>\n",
" <td>13.5</td>\n",
" <td>73</td>\n",
" <td>3</td>\n",
" <td>maxda rx3</td>\n",
" <td>1.285714</td>\n",
" <td>0.032957</td>\n",
" <td>23.333333</td>\n",
" <td>4.959184</td>\n",
" </tr>\n",
" <tr>\n",
" <th>243</th>\n",
" <td>21.5</td>\n",
" <td>3</td>\n",
" <td>80.0</td>\n",
" <td>110.0</td>\n",
" <td>2720.0</td>\n",
" <td>13.5</td>\n",
" <td>77</td>\n",
" <td>3</td>\n",
" <td>mazda rx-4</td>\n",
" <td>1.375000</td>\n",
" <td>0.029412</td>\n",
" <td>26.666667</td>\n",
" <td>5.671875</td>\n",
" </tr>\n",
" <tr>\n",
" <th>334</th>\n",
" <td>23.7</td>\n",
" <td>3</td>\n",
" <td>70.0</td>\n",
" <td>100.0</td>\n",
" <td>2420.0</td>\n",
" <td>12.5</td>\n",
" <td>80</td>\n",
" <td>3</td>\n",
" <td>mazda rx-7 gs</td>\n",
" <td>1.428571</td>\n",
" <td>0.028926</td>\n",
" <td>23.333333</td>\n",
" <td>6.122449</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" mpg cylinders displacement horsepower weight acceleration \\\n",
"71 19.0 3 70.0 97.0 2330.0 13.5 \n",
"111 18.0 3 70.0 90.0 2124.0 13.5 \n",
"243 21.5 3 80.0 110.0 2720.0 13.5 \n",
"334 23.7 3 70.0 100.0 2420.0 12.5 \n",
"\n",
" model_year origin car_name efficiency load bore_size \\\n",
"71 72 3 mazda rx2 coupe 1.385714 0.030043 23.333333 \n",
"111 73 3 maxda rx3 1.285714 0.032957 23.333333 \n",
"243 77 3 mazda rx-4 1.375000 0.029412 26.666667 \n",
"334 80 3 mazda rx-7 gs 1.428571 0.028926 23.333333 \n",
"\n",
" grunt \n",
"71 5.760612 \n",
"111 4.959184 \n",
"243 5.671875 \n",
"334 6.122449 "
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"wankels = merged[merged.efficiency>1]\n",
"wankels"
]
},
{
"cell_type": "markdown",
"id": "d1f1bf61-6c9b-498e-a5de-6fbe6bb719d3",
"metadata": {},
"source": [
"These are the Mazda rotaries, otherwise known as [Wankel Engines](https://en.wikipedia.org/wiki/Wankel_engine)\n",
"\n",
"They make a lot of power for their size because they can rev to 7000 rpm or so, and that's where they make peak power"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "5eda8f40-bff6-4715-ba54-b083c74b039d",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:51.121197Z",
"iopub.status.busy": "2022-08-01T14:48:51.120558Z",
"iopub.status.idle": "2022-08-01T14:48:51.303457Z",
"shell.execute_reply": "2022-08-01T14:48:51.302625Z",
"shell.execute_reply.started": "2022-08-01T14:48:51.121166Z"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "(base64 PNG data omitted; bar chart titled 'Mazda Rotary Efficiency (red line is average)')",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"wankels.efficiency.plot(kind='bar')\n",
"plt.xticks(np.arange(len(wankels)), wankels.car_name)\n",
"plt.axhline(merged.efficiency.mean(), color='red')\n",
"plt.title('Mazda Rotary Efficiency (red line is average)');"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "1476617f-8097-4294-8c42-fb86ff96c1d0",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:51.304784Z",
"iopub.status.busy": "2022-08-01T14:48:51.304504Z",
"iopub.status.idle": "2022-08-01T14:48:51.460139Z",
"shell.execute_reply": "2022-08-01T14:48:51.459313Z",
"shell.execute_reply.started": "2022-08-01T14:48:51.304758Z"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "(base64 PNG data omitted; bar chart titled 'Mazda Rotary MPG (red line is average)')",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"wankels.mpg.plot(kind='bar')\n",
"plt.xticks(np.arange(len(wankels)), wankels.car_name)\n",
"plt.axhline(merged.mpg.mean(), color='red')\n",
"plt.title('Mazda Rotary MPG (red line is average)');"
]
},
{
"cell_type": "markdown",
"id": "4f793604-5c51-44b7-9cb6-301151304400",
"metadata": {},
"source": [
"## Diesels"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "c7aece22-1e78-4a48-b969-f1207ba09aad",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:51.461824Z",
"iopub.status.busy": "2022-08-01T14:48:51.461525Z",
"iopub.status.idle": "2022-08-01T14:48:51.483317Z",
"shell.execute_reply": "2022-08-01T14:48:51.482548Z",
"shell.execute_reply.started": "2022-08-01T14:48:51.461798Z"
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>mpg</th>\n",
" <th>cylinders</th>\n",
" <th>displacement</th>\n",
" <th>horsepower</th>\n",
" <th>weight</th>\n",
" <th>acceleration</th>\n",
" <th>model_year</th>\n",
" <th>origin</th>\n",
" <th>car_name</th>\n",
" <th>efficiency</th>\n",
" <th>load</th>\n",
" <th>bore_size</th>\n",
" <th>grunt</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>244</th>\n",
" <td>43.1</td>\n",
" <td>4</td>\n",
" <td>90.0</td>\n",
" <td>48.0</td>\n",
" <td>1985.0</td>\n",
" <td>21.5</td>\n",
" <td>78</td>\n",
" <td>2</td>\n",
" <td>volkswagen rabbit custom diesel</td>\n",
" <td>0.533333</td>\n",
" <td>0.045340</td>\n",
" <td>22.500000</td>\n",
" <td>1.137778</td>\n",
" </tr>\n",
" <tr>\n",
" <th>325</th>\n",
" <td>44.3</td>\n",
" <td>4</td>\n",
" <td>90.0</td>\n",
" <td>48.0</td>\n",
" <td>2085.0</td>\n",
" <td>21.7</td>\n",
" <td>80</td>\n",
" <td>2</td>\n",
" <td>vw rabbit c (diesel)</td>\n",
" <td>0.533333</td>\n",
" <td>0.043165</td>\n",
" <td>22.500000</td>\n",
" <td>1.137778</td>\n",
" </tr>\n",
" <tr>\n",
" <th>326</th>\n",
" <td>43.4</td>\n",
" <td>4</td>\n",
" <td>90.0</td>\n",
" <td>48.0</td>\n",
" <td>2335.0</td>\n",
" <td>23.7</td>\n",
" <td>80</td>\n",
" <td>2</td>\n",
" <td>vw dasher (diesel)</td>\n",
" <td>0.533333</td>\n",
" <td>0.038544</td>\n",
" <td>22.500000</td>\n",
" <td>1.137778</td>\n",
" </tr>\n",
" <tr>\n",
" <th>327</th>\n",
" <td>36.4</td>\n",
" <td>5</td>\n",
" <td>121.0</td>\n",
" <td>67.0</td>\n",
" <td>2950.0</td>\n",
" <td>19.9</td>\n",
" <td>80</td>\n",
" <td>2</td>\n",
" <td>audi 5000s (diesel)</td>\n",
" <td>0.553719</td>\n",
" <td>0.041017</td>\n",
" <td>24.200000</td>\n",
" <td>1.533024</td>\n",
" </tr>\n",
" <tr>\n",
" <th>358</th>\n",
" <td>28.1</td>\n",
" <td>4</td>\n",
" <td>141.0</td>\n",
" <td>80.0</td>\n",
" <td>3230.0</td>\n",
" <td>20.4</td>\n",
" <td>81</td>\n",
" <td>2</td>\n",
" <td>peugeot 505s turbo diesel</td>\n",
" <td>0.567376</td>\n",
" <td>0.043653</td>\n",
" <td>35.250000</td>\n",
" <td>1.287662</td>\n",
" </tr>\n",
" <tr>\n",
" <th>359</th>\n",
" <td>30.7</td>\n",
" <td>6</td>\n",
" <td>145.0</td>\n",
" <td>76.0</td>\n",
" <td>3160.0</td>\n",
" <td>19.6</td>\n",
" <td>81</td>\n",
" <td>2</td>\n",
" <td>volvo diesel</td>\n",
" <td>0.524138</td>\n",
" <td>0.045886</td>\n",
" <td>24.166667</td>\n",
" <td>1.648323</td>\n",
" </tr>\n",
" <tr>\n",
" <th>386</th>\n",
" <td>38.0</td>\n",
" <td>6</td>\n",
" <td>262.0</td>\n",
" <td>85.0</td>\n",
" <td>3015.0</td>\n",
" <td>17.0</td>\n",
" <td>82</td>\n",
" <td>1</td>\n",
" <td>oldsmobile cutlass ciera (diesel)</td>\n",
" <td>0.324427</td>\n",
" <td>0.086899</td>\n",
" <td>43.666667</td>\n",
" <td>0.631519</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" mpg cylinders displacement horsepower weight acceleration \\\n",
"244 43.1 4 90.0 48.0 1985.0 21.5 \n",
"325 44.3 4 90.0 48.0 2085.0 21.7 \n",
"326 43.4 4 90.0 48.0 2335.0 23.7 \n",
"327 36.4 5 121.0 67.0 2950.0 19.9 \n",
"358 28.1 4 141.0 80.0 3230.0 20.4 \n",
"359 30.7 6 145.0 76.0 3160.0 19.6 \n",
"386 38.0 6 262.0 85.0 3015.0 17.0 \n",
"\n",
" model_year origin car_name efficiency \\\n",
"244 78 2 volkswagen rabbit custom diesel 0.533333 \n",
"325 80 2 vw rabbit c (diesel) 0.533333 \n",
"326 80 2 vw dasher (diesel) 0.533333 \n",
"327 80 2 audi 5000s (diesel) 0.553719 \n",
"358 81 2 peugeot 505s turbo diesel 0.567376 \n",
"359 81 2 volvo diesel 0.524138 \n",
"386 82 1 oldsmobile cutlass ciera (diesel) 0.324427 \n",
"\n",
" load bore_size grunt \n",
"244 0.045340 22.500000 1.137778 \n",
"325 0.043165 22.500000 1.137778 \n",
"326 0.038544 22.500000 1.137778 \n",
"327 0.041017 24.200000 1.533024 \n",
"358 0.043653 35.250000 1.287662 \n",
"359 0.045886 24.166667 1.648323 \n",
"386 0.086899 43.666667 0.631519 "
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"diesels = merged[merged.car_name.str.contains('diesel')]\n",
"diesels"
]
},
{
"cell_type": "markdown",
"id": "79979a1e-de58-4610-8878-6f374f500d1c",
"metadata": {},
"source": [
"All of the diesels get higher than average MPG"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "92f0bf1a-af7b-4a26-b422-8ca01fdfde1b",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:51.484935Z",
"iopub.status.busy": "2022-08-01T14:48:51.484399Z",
"iopub.status.idle": "2022-08-01T14:48:51.694203Z",
"shell.execute_reply": "2022-08-01T14:48:51.693458Z",
"shell.execute_reply.started": "2022-08-01T14:48:51.484907Z"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "(base64 PNG data omitted; horizontal bar chart titled 'Diesel MPG (red line is average)')",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"diesels.mpg.plot(kind='barh')\n",
"plt.yticks(np.arange(len(diesels)),diesels.car_name)\n",
"plt.axvline(merged.mpg.mean(),color='red')\n",
"plt.title('Diesel MPG (red line is average)');"
]
},
{
"cell_type": "markdown",
"id": "df9d8d17-46ec-4ce8-950e-aa1a24d98d7f",
"metadata": {},
"source": [
"# Interesting"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "0fb1ed64-bba6-463c-9a0f-84af360515b5",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:51.695599Z",
"iopub.status.busy": "2022-08-01T14:48:51.695257Z",
"iopub.status.idle": "2022-08-01T14:48:51.711949Z",
"shell.execute_reply": "2022-08-01T14:48:51.711236Z",
"shell.execute_reply.started": "2022-08-01T14:48:51.695573Z"
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>mpg</th>\n",
" <th>cylinders</th>\n",
" <th>displacement</th>\n",
" <th>horsepower</th>\n",
" <th>weight</th>\n",
" <th>acceleration</th>\n",
" <th>model_year</th>\n",
" <th>origin</th>\n",
" <th>car_name</th>\n",
" <th>efficiency</th>\n",
" <th>load</th>\n",
" <th>bore_size</th>\n",
" <th>grunt</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>386</th>\n",
" <td>38.0</td>\n",
" <td>6</td>\n",
" <td>262.0</td>\n",
" <td>85.0</td>\n",
" <td>3015.0</td>\n",
" <td>17.0</td>\n",
" <td>82</td>\n",
" <td>1</td>\n",
" <td>oldsmobile cutlass ciera (diesel)</td>\n",
" <td>0.324427</td>\n",
" <td>0.086899</td>\n",
" <td>43.666667</td>\n",
" <td>0.631519</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" mpg cylinders displacement horsepower weight acceleration \\\n",
"386 38.0 6 262.0 85.0 3015.0 17.0 \n",
"\n",
" model_year origin car_name efficiency \\\n",
"386 82 1 oldsmobile cutlass ciera (diesel) 0.324427 \n",
"\n",
" load bore_size grunt \n",
"386 0.086899 43.666667 0.631519 "
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"merged[(merged.mpg > 35) & (merged.displacement > 250)]"
]
},
{
"cell_type": "markdown",
"id": "1e1ec508-df30-42c6-a63f-aea36c12d2e8",
"metadata": {},
"source": [
"This is an interesting engine. In fact, [these cars are rumored to be the reason diesel cars are so unpopular in North America](https://www.autotrader.com/car-news/when-diesel-was-dreadful-oldsmobile-diesels-259997). [Here is a more technical write-up](https://www.dieselworldmag.com/diesel-engines/oldsmobile-350-v8)."
]
},
{
"cell_type": "markdown",
"id": "b9858dee-1de0-46ab-b46d-baa4cafc0efc",
"metadata": {},
"source": [
"<hr>"
]
},
{
"cell_type": "markdown",
"id": "d8625227-6fca-4e92-ba0c-271bbea53c23",
"metadata": {},
"source": [
"Big lazy engines in big heavy cars don't have to have poor MPG!"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "c0c4f183-ef44-42ee-b64c-a75c63450d7b",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:51.713333Z",
"iopub.status.busy": "2022-08-01T14:48:51.712976Z",
"iopub.status.idle": "2022-08-01T14:48:51.733745Z",
"shell.execute_reply": "2022-08-01T14:48:51.733016Z",
"shell.execute_reply.started": "2022-08-01T14:48:51.713306Z"
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>mpg</th>\n",
" <th>cylinders</th>\n",
" <th>displacement</th>\n",
" <th>horsepower</th>\n",
" <th>weight</th>\n",
" <th>acceleration</th>\n",
" <th>model_year</th>\n",
" <th>origin</th>\n",
" <th>car_name</th>\n",
" <th>efficiency</th>\n",
" <th>load</th>\n",
" <th>bore_size</th>\n",
" <th>grunt</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>298</th>\n",
" <td>23.0</td>\n",
" <td>8</td>\n",
" <td>350.0</td>\n",
" <td>125.0</td>\n",
" <td>3900.0</td>\n",
" <td>17.4</td>\n",
" <td>79</td>\n",
" <td>1</td>\n",
" <td>cadillac eldorado</td>\n",
" <td>0.357143</td>\n",
" <td>0.089744</td>\n",
" <td>43.75</td>\n",
" <td>1.020408</td>\n",
" </tr>\n",
" <tr>\n",
" <th>363</th>\n",
" <td>26.6</td>\n",
" <td>8</td>\n",
" <td>350.0</td>\n",
" <td>105.0</td>\n",
" <td>3725.0</td>\n",
" <td>19.0</td>\n",
" <td>81</td>\n",
" <td>1</td>\n",
" <td>oldsmobile cutlass ls</td>\n",
" <td>0.300000</td>\n",
" <td>0.093960</td>\n",
" <td>43.75</td>\n",
" <td>0.720000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" mpg cylinders displacement horsepower weight acceleration \\\n",
"298 23.0 8 350.0 125.0 3900.0 17.4 \n",
"363 26.6 8 350.0 105.0 3725.0 19.0 \n",
"\n",
" model_year origin car_name efficiency load \\\n",
"298 79 1 cadillac eldorado 0.357143 0.089744 \n",
"363 81 1 oldsmobile cutlass ls 0.300000 0.093960 \n",
"\n",
" bore_size grunt \n",
"298 43.75 1.020408 \n",
"363 43.75 0.720000 "
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"merged[(merged.mpg > 20) & (merged.displacement > 340)]"
]
},
{
"cell_type": "markdown",
"id": "2ccec1cb-db88-430c-a118-351da41a23c1",
"metadata": {},
"source": [
"But some still do."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "8f51f87e-fb76-4c8a-b4bc-05f147fc8efa",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:51.735142Z",
"iopub.status.busy": "2022-08-01T14:48:51.734785Z",
"iopub.status.idle": "2022-08-01T14:48:51.750660Z",
"shell.execute_reply": "2022-08-01T14:48:51.749934Z",
"shell.execute_reply.started": "2022-08-01T14:48:51.735114Z"
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>mpg</th>\n",
" <th>cylinders</th>\n",
" <th>displacement</th>\n",
" <th>horsepower</th>\n",
" <th>weight</th>\n",
" <th>acceleration</th>\n",
" <th>model_year</th>\n",
" <th>origin</th>\n",
" <th>car_name</th>\n",
" <th>efficiency</th>\n",
" <th>load</th>\n",
" <th>bore_size</th>\n",
" <th>grunt</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>14.0</td>\n",
" <td>8</td>\n",
" <td>455.0</td>\n",
" <td>225.0</td>\n",
" <td>3086.0</td>\n",
" <td>10.0</td>\n",
" <td>70</td>\n",
" <td>1</td>\n",
" <td>buick estate wagon (sw)</td>\n",
" <td>0.494505</td>\n",
" <td>0.14744</td>\n",
" <td>56.875</td>\n",
" <td>1.956285</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" mpg cylinders displacement horsepower weight acceleration \\\n",
"13 14.0 8 455.0 225.0 3086.0 10.0 \n",
"\n",
" model_year origin car_name efficiency load \\\n",
"13 70 1 buick estate wagon (sw) 0.494505 0.14744 \n",
"\n",
" bore_size grunt \n",
"13 56.875 1.956285 "
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"merged[merged.load>0.14]"
]
},
{
"cell_type": "markdown",
"id": "4415be3a-f8fb-47f1-b39d-2c60a3495a1d",
"metadata": {},
"source": [
"Big car, big engine, terrible MPG. That recorded weight (3086) is way off."
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "7d556866-da6d-48dd-b37a-e59c3155085d",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:51.752120Z",
"iopub.status.busy": "2022-08-01T14:48:51.751768Z",
"iopub.status.idle": "2022-08-01T14:48:51.757117Z",
"shell.execute_reply": "2022-08-01T14:48:51.756287Z",
"shell.execute_reply.started": "2022-08-01T14:48:51.752093Z"
},
"tags": []
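},
"tags": []
},
"outputs": [],
"source": [
"# Row 13's recorded weight is implausible; overwrite it with a more plausible value\n",
"merged.at[13,'weight'] = 5000\n",
"# load is derived from weight, so recompute it after the fix\n",
"merged['load'] = merged.displacement / merged.weight"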
]
},
{
"cell_type": "markdown",
"id": "146d6761-455a-407f-b627-24c13586a88f",
"metadata": {},
"source": [
"## What vehicles have the highest MPG?"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "558d450a-2649-4005-bbe8-5f8cc509f965",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:51.758801Z",
"iopub.status.busy": "2022-08-01T14:48:51.758192Z",
"iopub.status.idle": "2022-08-01T14:48:52.022488Z",
"shell.execute_reply": "2022-08-01T14:48:52.021738Z",
"shell.execute_reply.started": "2022-08-01T14:48:51.758773Z"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAiYAAAFBCAYAAABD12Q5AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAAsTAAALEwEAmpwYAABOe0lEQVR4nO3deXxU1f3/8dckhH1NJCJr2PwoRAEVFVEEUVGrpXax1A1rW610s2r99otatRW6qP36s9r2S+vXrXWJ2lZRq1YgCgoKyCKKH5RFEEEhLEkgAULy++PexCFMFmAmM0nez8eDR+aec++5n3syOp+cc+6dSEVFBSIiIiKpIC3ZAYiIiIhUUmIiIiIiKUOJiYiIiKQMJSYiIiKSMpSYiIiISMpQYiIiIiIpQ4mJiEiCmNkbZjYsTm3lmFmFmbWooX6NmZ0Zvp5sZn+Nx3mrneM0M/N4t9sYmNnhZrbczFolO5amLuYbXEQkFjMrjtpsC+wC9obbV7v73+NwjouAa4GhwNvuPrpa/VDgAeBoYDnwHXdfXENbDwETgfHu/lxU+T3AT4Bvu/tDZnZF2GYJUA6sAm529+fD/TsAtwNfBboCBcDbwO/c/e0azn0BUOTuiw6oA+LA3acmqN3ZgCWi7VTn7p+Z2SzgKuAPyY6nKdOIiYjUm7u3r/wHrAUuiCo75KQktAW4B/hN9Qozawk8C/wN6AI8DDwbltdkBUFyUtlGC+AbwMpq+80Nr6szQZKSZ2aZ4V/IM4FjgPOBjgRJ0RPAebWc9/vAozVV1jTyIQfPzNITfIq/A1cn+BzNnv7DEJFDFn54/xa4KCzKA/7L3XeZ2WiCROKPwHVAMXBTTYmMu78atvndGNWjCf6/dY+7VwD3mtkNwBnASzWENx241My6uPtW4BxgKdChhvOXm9n/AfcC/QhGbnoCo919R7jbDuDp8N9+wkTpDKI+xMzsNiAXKAW+DFxnZk8BvydIcMqBB4Fb3X1v+CH7W+AKoBC4u4bri3X+24AB7n6pmeUAq8N2fkUw0vU/7j4l3DcNuBH4HkFSNgP4vrtvidHuaOBv7t4z3P4v4McEydqnwCR3nxHjuC8BdwD9ge3AA+5+W1j3EvC8u98Xtf8S4HZ3/4eZHUUwQnE8sAm4xd3zwv0eIhjl6gOcDowP34sxzxUec3nYD+0JEuDvAN9191fr0RdvAf3MrI+7fxy79+VQacREROLhJuBkgg/xIcCJwM1R9d2Aw4AeBKMX08zsYKYEBgNLw6Sk0tKwvCalwHPAhHD7cuCRmnYORzK+S5BAfQicCbwclZTUx0Cg3N0/qVY+niCZ6Uzw1/fDQBkwABgGnB2eG4IPx/PD8hOArx/A+WM5lWAaZizwCzM7Oiz/MfAVgg/27sBW4P66Ggt/fz8Ehrt7B2AcsKaG3XcQ9Htn4EvANWb2lbDuMeBbUe0OIkg0XjCzdsB/wn2yw/3+aGbRv++LgSkEieac2s4Vtv1H4BLgCKATwXuyUq194e5lwEcE73FJECUmIhIPlwC/dPfP3X0TwXqMy6rtc4u773L314AX+GJ05UC0J/grONp2ahj9iPIIcLmZdSL40PlXjH1ONrNtwEaCD8AL3X07QUK1sXInMxtqZtvMrLCWhaCdgaIY5XPd/V/uXk4wynAucK2773D3z4H/4YsE6iKCkaF14V/sv67jGutyu7uXuPsSYAlffLheTTCC9Ym77wJuA75ej6mmvUArYJCZZbj7GnevPj0GgLvnu/u77l7u7kuBxwl+DwD/BIaaWZ9w+xLgH2Es5wNr3P1Bdy9z93eAZ9g3SXvW3d8I2y6t41xfB6a7+xx33w38AohOcuvTF0UEv19JEE3liEg8dAeih7Y/Dssqba024lC9vr6KCT7Qo3UkdhJQxd3nmFlXglGc5929JMaAzTx3PzXG4QUEf11XtrUY6BzeAVPTnS9biZ0srYt63QfIADZExZIWtU/3avsf6tTBxqjXOw
mSvMo4/mlm5VH1e4HDgfU1NebuH5nZtQQf3oPN7GXgOnf/tPq+ZnYSwZqhXKAlQULzVNhOkZm9QJCQ/Tb8eVVUbCeFCWOlFuy7die6j2o9F9X61N13mllB1OH16YsOQHQ8EmdKTEQkHj4l+J/6e+F277CsUhczaxeVnPQGlh3Eed4DrjezSNR0zrHUY+qBYJ3LL4AxB3jOGcDt1eKvy4dAxMx6uHv0h3v0X+frCO5qOiycIqhuA9Ararv3gQR9ANYBV7r7Gwd6oLs/BjxmZh2B/yVILKqPlEEwFXMfcK67l4Z3RR0WVf84cKuZvQ60AWZFxfaau59VSxgV1bZrO9cGou4qMrM2QFbUsbX2RThyMoBgxEkSRFM5IhIPjwM3m1lXMzuMIAH4W7V9bjezlmZ2GsEQ/VPVG4Hgzgoza03wh1OambU2s4ywOp/gL9gfm1krM/thWD6zHjHeC5wFvH4gF0YwDbSB4C/p3Kj4TqjpAHffA7zKF1MIsfbZALwC3G1mHc0szcz6m1nlMXkE19nTzLoAPz/AuOvrz8CUyqmU8Hc4vq6DLHBGuNi0lGAR6t4adu8AbAkThRMJ1oVEe5Egsf0l8GQ41QXwPHCkmV1mZhnhv+FR62MO9FxPAxeY2SnhAuXbgUhUfV19cSLB1JIWviaQEhMRiYc7gAUEC1HfBd4JyyptJJje+JRg0ef33f2DGtq6jOBD7k/AaeHrvwCE6wK+QrC4cRtwJfCVsLxW7r7F3WdUWzhbJ3cvJRhleZ9gbUwh4MBwal8n87/EHj2IdjnBdMP7BP3zNF9MG/0FeJngr/N3gH8cSNwH4P8RLA5+xcyKgHnASfU4rhXBlMlmgt9vNjC5hn0nAb8M2/8FQdJVJVzP8Q+ChcaPRZUXESwInkDw3tlIMCpT20POajyXu78H/IjgVu8NBFOAnxOMXEHdfXEJQfIiCRSpqDig/0ZFRA5I9VtMmxMzmwP8KBkPWZO6mVl7ggR3oLuvrmPfbOA1YFiYrEqCaI2JiEiC1LCYVpLIgifyziCYwrmLYIRvTV3HhXdN1TaFJHGiqRwREWlOxhNMC31K8LyZCQc6vSeJpakcERERSRkaMRGJrQWQg6Y7RUTqKy7/39T/dEVi60Pw6OnTgOqPFRcRkf31BGYTPOsl5lOA60OJiUhslbdszk5qFCIijc8RKDERibsNAFu37qC8XOuwALKy2lNQUJzsMFJGIvujw1XfBqBo2oMJaT9R9B7ZV3Prj7S0CF26tIPw/58HS4mJSGx7AcrLK5SYRFFf7Cth/bFxY2LbT6DGGHMiNdP+qOkJwPWixa8iIiKSMpSYiIiISMpQYiIiIiIpQ4mJiIiIpAwlJiIiIpIylJiIiIhIylBiIiIiIilDiYmIiIikDCUmIiIikjL05FeRWmRltU92CCmla9cOyQ4hpSSsPzLSE9t+AjXGmBOpsfVH6a4yigpLqrbLysrIy3uUbdu2kps7hFGjxrJrVymzZ89izZpVlJeX07VrNhde+M2Y7d1+++29gPuAM4Ey4Plbb731ktpiUGIiUovv3PEKn28tqXtHkTiaunIzAJOvfzbJkUhzM/3u8RRFbS9YMI/i4qJ99pk58xXWrFnJscceR5cumWzc+GnMtm6//fYI8E9gEPA7gu/QObquGJSYiIiIyH42b97EkiULOfHEU5g7N/ii9e3bt7F69UcMHHgUJ598KpFIhEGDjqmpiTHA8cAU4DfArltvvbXOLw/SGhMRERHZR0VFBfn5r5CbO5Ts7G5V5Vu3bgFg06bP+Mtf/sBf/vIH5s59vaZmBoU/vwbsBApvv/32H9d1biUmIiIiso/ly5dRVFSI2SB27CgGYPfu3ezZsweAPXv2cNZZX6Jbt+4sWrSAdes+jtVMq/DnHuBCYDVwz+23335kbedWYiIiIiL7KC4uoqSkhLy8R3n11X8DsGLFct57bwkARxzRg/79BzJggAFQWLgdCBbLbtq0KSNsZk3484Vbb7
31WeAFIAL0re3cWmMiIiIi+xgwwMjKOgyALVsKmD9/Lr1753DiiSOZOfNl1q9fy/vvL+WDD5YRiUTo1q07AFOmTCE
"text/plain": [
"<Figure size 432x360 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"top_mpg = merged.sort_values('mpg').tail(10)\n",
"\n",
"fig, ax = plt.subplots(figsize=(6, 5))\n",
"ax.barh(top_mpg.car_name, top_mpg.mpg)\n",
"# Label each bar with its MPG value\n",
"for bar in ax.patches:\n",
"    ax.text(bar.get_width() + 0.2, bar.get_y() + 0.5,\n",
"            str(round(bar.get_width(), 2)),\n",
"            fontsize=10, fontweight='bold',\n",
"            color='grey')\n",
"ax.set_title('Top 10 MPG (red line is average)')\n",
"ax.axvline(merged.mpg.mean(), color='red')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "260484cb-5145-4c0f-8952-8f6ba652c8a5",
"metadata": {},
"source": [
"In more detail:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "38935e91-3877-47a3-96d6-cd54e2704bdb",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:52.023871Z",
"iopub.status.busy": "2022-08-01T14:48:52.023532Z",
"iopub.status.idle": "2022-08-01T14:48:52.046271Z",
"shell.execute_reply": "2022-08-01T14:48:52.045474Z",
"shell.execute_reply.started": "2022-08-01T14:48:52.023845Z"
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>mpg</th>\n",
" <th>cylinders</th>\n",
" <th>displacement</th>\n",
" <th>horsepower</th>\n",
" <th>weight</th>\n",
" <th>acceleration</th>\n",
" <th>model_year</th>\n",
" <th>origin</th>\n",
" <th>car_name</th>\n",
" <th>efficiency</th>\n",
" <th>load</th>\n",
" <th>bore_size</th>\n",
" <th>grunt</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>322</th>\n",
" <td>46.6</td>\n",
" <td>4</td>\n",
" <td>86.0</td>\n",
" <td>65.0</td>\n",
" <td>2110.0</td>\n",
" <td>17.9</td>\n",
" <td>80</td>\n",
" <td>3</td>\n",
" <td>mazda glc</td>\n",
" <td>0.755814</td>\n",
" <td>0.040758</td>\n",
" <td>21.50</td>\n",
" <td>2.285019</td>\n",
" </tr>\n",
" <tr>\n",
" <th>329</th>\n",
" <td>44.6</td>\n",
" <td>4</td>\n",
" <td>91.0</td>\n",
" <td>67.0</td>\n",
" <td>1850.0</td>\n",
" <td>13.8</td>\n",
" <td>80</td>\n",
" <td>3</td>\n",
" <td>honda civic 1500 gl</td>\n",
" <td>0.736264</td>\n",
" <td>0.049189</td>\n",
" <td>22.75</td>\n",
" <td>2.168337</td>\n",
" </tr>\n",
" <tr>\n",
" <th>325</th>\n",
" <td>44.3</td>\n",
" <td>4</td>\n",
" <td>90.0</td>\n",
" <td>48.0</td>\n",
" <td>2085.0</td>\n",
" <td>21.7</td>\n",
" <td>80</td>\n",
" <td>2</td>\n",
" <td>vw rabbit c (diesel)</td>\n",
" <td>0.533333</td>\n",
" <td>0.043165</td>\n",
" <td>22.50</td>\n",
" <td>1.137778</td>\n",
" </tr>\n",
" <tr>\n",
" <th>393</th>\n",
" <td>44.0</td>\n",
" <td>4</td>\n",
" <td>97.0</td>\n",
" <td>52.0</td>\n",
" <td>2130.0</td>\n",
" <td>24.6</td>\n",
" <td>82</td>\n",
" <td>2</td>\n",
" <td>vw pickup</td>\n",
" <td>0.536082</td>\n",
" <td>0.045540</td>\n",
" <td>24.25</td>\n",
" <td>1.149538</td>\n",
" </tr>\n",
" <tr>\n",
" <th>326</th>\n",
" <td>43.4</td>\n",
" <td>4</td>\n",
" <td>90.0</td>\n",
" <td>48.0</td>\n",
" <td>2335.0</td>\n",
" <td>23.7</td>\n",
" <td>80</td>\n",
" <td>2</td>\n",
" <td>vw dasher (diesel)</td>\n",
" <td>0.533333</td>\n",
" <td>0.038544</td>\n",
" <td>22.50</td>\n",
" <td>1.137778</td>\n",
" </tr>\n",
" <tr>\n",
" <th>244</th>\n",
" <td>43.1</td>\n",
" <td>4</td>\n",
" <td>90.0</td>\n",
" <td>48.0</td>\n",
" <td>1985.0</td>\n",
" <td>21.5</td>\n",
" <td>78</td>\n",
" <td>2</td>\n",
" <td>volkswagen rabbit custom diesel</td>\n",
" <td>0.533333</td>\n",
" <td>0.045340</td>\n",
" <td>22.50</td>\n",
" <td>1.137778</td>\n",
" </tr>\n",
" <tr>\n",
" <th>309</th>\n",
" <td>41.5</td>\n",
" <td>4</td>\n",
" <td>98.0</td>\n",
" <td>76.0</td>\n",
" <td>2144.0</td>\n",
" <td>14.7</td>\n",
" <td>80</td>\n",
" <td>2</td>\n",
" <td>vw rabbit</td>\n",
" <td>0.775510</td>\n",
" <td>0.045709</td>\n",
" <td>24.50</td>\n",
" <td>2.405664</td>\n",
" </tr>\n",
" <tr>\n",
" <th>330</th>\n",
" <td>40.9</td>\n",
" <td>4</td>\n",
" <td>85.0</td>\n",
" <td>53.5</td>\n",
" <td>1835.0</td>\n",
" <td>17.3</td>\n",
" <td>80</td>\n",
" <td>2</td>\n",
" <td>renault lecar deluxe</td>\n",
" <td>0.629412</td>\n",
" <td>0.046322</td>\n",
" <td>21.25</td>\n",
" <td>1.584637</td>\n",
" </tr>\n",
" <tr>\n",
" <th>324</th>\n",
" <td>40.8</td>\n",
" <td>4</td>\n",
" <td>85.0</td>\n",
" <td>65.0</td>\n",
" <td>2110.0</td>\n",
" <td>19.2</td>\n",
" <td>80</td>\n",
" <td>3</td>\n",
" <td>datsun 210</td>\n",
" <td>0.764706</td>\n",
" <td>0.040284</td>\n",
" <td>21.25</td>\n",
" <td>2.339100</td>\n",
" </tr>\n",
" <tr>\n",
" <th>247</th>\n",
" <td>39.4</td>\n",
" <td>4</td>\n",
" <td>85.0</td>\n",
" <td>70.0</td>\n",
" <td>2070.0</td>\n",
" <td>18.6</td>\n",
" <td>78</td>\n",
" <td>3</td>\n",
" <td>datsun b210 gx</td>\n",
" <td>0.823529</td>\n",
" <td>0.041063</td>\n",
" <td>21.25</td>\n",
" <td>2.712803</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" mpg cylinders displacement horsepower weight acceleration \\\n",
"322 46.6 4 86.0 65.0 2110.0 17.9 \n",
"329 44.6 4 91.0 67.0 1850.0 13.8 \n",
"325 44.3 4 90.0 48.0 2085.0 21.7 \n",
"393 44.0 4 97.0 52.0 2130.0 24.6 \n",
"326 43.4 4 90.0 48.0 2335.0 23.7 \n",
"244 43.1 4 90.0 48.0 1985.0 21.5 \n",
"309 41.5 4 98.0 76.0 2144.0 14.7 \n",
"330 40.9 4 85.0 53.5 1835.0 17.3 \n",
"324 40.8 4 85.0 65.0 2110.0 19.2 \n",
"247 39.4 4 85.0 70.0 2070.0 18.6 \n",
"\n",
" model_year origin car_name efficiency \\\n",
"322 80 3 mazda glc 0.755814 \n",
"329 80 3 honda civic 1500 gl 0.736264 \n",
"325 80 2 vw rabbit c (diesel) 0.533333 \n",
"393 82 2 vw pickup 0.536082 \n",
"326 80 2 vw dasher (diesel) 0.533333 \n",
"244 78 2 volkswagen rabbit custom diesel 0.533333 \n",
"309 80 2 vw rabbit 0.775510 \n",
"330 80 2 renault lecar deluxe 0.629412 \n",
"324 80 3 datsun 210 0.764706 \n",
"247 78 3 datsun b210 gx 0.823529 \n",
"\n",
" load bore_size grunt \n",
"322 0.040758 21.50 2.285019 \n",
"329 0.049189 22.75 2.168337 \n",
"325 0.043165 22.50 1.137778 \n",
"393 0.045540 24.25 1.149538 \n",
"326 0.038544 22.50 1.137778 \n",
"244 0.045340 22.50 1.137778 \n",
"309 0.045709 24.50 2.405664 \n",
"330 0.046322 21.25 1.584637 \n",
"324 0.040284 21.25 2.339100 \n",
"247 0.041063 21.25 2.712803 "
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"merged.sort_values('mpg',ascending=False).head(10)"
]
},
{
"cell_type": "markdown",
"id": "15d5a2c5-cb01-4a54-8ce4-375018ebc79a",
"metadata": {},
"source": [
"## What vehicles have the lowest MPG?"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "65588fe3-762f-42b0-9427-feb64275b792",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:52.047810Z",
"iopub.status.busy": "2022-08-01T14:48:52.047441Z",
"iopub.status.idle": "2022-08-01T14:48:52.290693Z",
"shell.execute_reply": "2022-08-01T14:48:52.289858Z",
"shell.execute_reply.started": "2022-08-01T14:48:52.047782Z"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAfMAAAFBCAYAAABjDUY1AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAAsTAAALEwEAmpwYAABHNUlEQVR4nO3de3gV1b3/8XcS0SqXgKAxagVK8IsVAbGituIlFWlRC7Q93hWttba0ttbaU3tqPZ7Wtp5jsT3W+rNY75darUcU74hya4CigkCRL6QQlZtAhIQkEE3C749ZgZ3N3kkI2dnZyef1PHmyZ9aamTUr+8l31mVmsnbu3ImIiIhkrux0F0BERET2jYK5iIhIhlMwFxERyXAK5iIiIhlOwVxERCTDKZiLiIhkOAVzEenwzOwQM3Mz+1Qr7e8KM5uTJK2fme00s/3C8ktmNqE1jht3nP8wsz+39n4zgZl9xcyeSHc52pP90l0AkY7MzEqAPKAW+AQoAr7t7h80Y9srgG+6+6kx6x4E1rj7Takob4Iy/BIYBxwD3Orut8SlXwz8BugDTAO+4e4fJdlXCXA4cLi7b45ZvwgYCvR395JwjhcDH4eft4Br3X15yD8Q+CXwReAA4EPgZeC/3X1NklO5EXjA3XfsVQW0Anf/cor2++tU7DcTuPtzZvZrMxvi7ovTXZ72QC1zkdQ7z927AflEgecPaS7P3igG/h14IT7BzI4F/gRcRnTBUgXc3cT+VgMXxezjOODABPn+J9TZkcBG4MGQvwCYD6wDjnf3HsAXgH8BpybYD2Z2ADABeDRJepaZ6X9hK6rvlUixvwDfaoPjZAS1zEXaiLvvMLO/Ab+vX2dmuUTB/ctEwfBe4NeAAfcAXcysAqghCqqXADvN7DrgDXc/z8yOAf4fMAxYC/zU3Z8L+38w7Lc/MBJ4B/gaUUt1AtHFxUXuvjBJmR8K+7kkQfIlwFR3nxXy/Bx418y6u/u2JNXwCHA5uy9oJgAPA7cmOX6VmT0O/DWsugX4u7tfH5NnIzF1msBJwNbYVruZzQD+DpwBDAeOCwHoD8AJwCbg5+7+ZMjfG3gg5F8OvNLI8RoIx3rU3f9c39sCzAOuArYCE939pZA3F7gDGAPUhWP+p7vXJtjvLUCBu18ahg/+TPQ9ygFWAue6+4cJtrsRuBo4FPgA+Jm7PxMuej4ETnX3pSHvIcD7QF9332hm5xL9rfoBy4h6mRaHvCVE38NLokXrCtyQ6Fghfw7wP0TfgW3AJKL67+LuNc2oixlEF2jfa+pv0BnoalSkjZjZQcAFRP/I6/0ByAU+A5xOFOiudPd3gW8Dc929m7v3dPfJwGOEVmsI5F2AqcCrRP8wrwUeMzOLOcb5wE1EXeHVwFzg7bD8N6J/mC1xLNHFAQDu/i+ibvGjG9lmHtDDzI4J/8wvIEmLGcDMuhEFh/qLjbOAp/eynMcBnmD9ZUQtu+5EwXsa8DhRPV4E3B16HwD+COwg6l35RvhpqZNCefoQBbP7zCwrpD1EdOFWABwPnE0U/Jsygeh79GmgN9F3Z3uSvP8iurDLBf4LeNTM8t29Gvg/YnpOiL47M0MgHw7cD1wTjvEn4LlwEVDvIuAcoKe71yQ7Vsh7NdHFxzCiC6pxceVsqi7eBfqZWY/GKqazUMtcJPWmmFkN0I2oy3g07GqZXEDUXbwN2GZmk4iCzH3N3PfJYb+3uXsd8LqZPU/0T/WWkOcZd38rHPMZopbgw2H5r7S8ZdMNKItbV0YUHBtT3zqfSdTKXZsgzw1m9j2iAPoP4Iqwvg+woT5TyHMr0f+yv7j71Qn21ZOo5RfvQXf/Z9jPl4ASd38gpL1tZk8DXzez5US9Gce5eyWw1MweAk5r4jyTec/d7w3HfYhoaCLPzHYSBbee7r4dqDSz3xFdcPypiX1+QhRgC0JL+a1kGd39qZjFv5rZT4ERwLNEFzOTgZ+F9Itjjn018Cd3nx+WHzKz/yD6Ds
4M6+6MnQ/SxLHOB/63vsfEzG4jmgeBmeU1oy7q/6Y9gfLkVdM5KJiLpN44d38tBO+xwEwz+yywE9gfeC8m73vAEXux78OBD0IgT7aP2K7W7QmWu+3F8WJVAPGtoh4kDpyxHgFmEXX9P5wkz2+TTPIrJWodA+DudwF3mdmtROPriWwh8QVG7CTEvsBJZrY1Zt1+oayHhM+x+WP/Zntr18VIGEaA6G9wMNAFWB/TsZIdd9xkHiFqlT9hZj2Jejt+5u6fxGc0s8uB64m6yuuP3Sd8fh040MxOCuUcBjwT0voCE8zs2pjd7U/0HazXoKxNHOvwuPzxf4+m6qL+b7o1/hw7IwVzkTYSxvr+z8z+RDRZ6xmiFlVfovFHgKPY3VJN9ErD+HXrgE+bWXZMQD8KWNGaZU/in0Sz0AEws88QzS5v9Nju/p6ZrSYaC71qL485Hfgq0fhpcy0GfphgfWxdfkDUnTwqPlO4CKshCpbLw+qj9uL4zfUB0TBIn9BF3WwhaP8X8F9m1g94kagrv0EPj5n1JZqX8UWiIZzacDdBVthPnZk9SdSz8yHwfMz8hw+AX7n7rxopyq46bepYwHoaXoB9OuZzc+riGKLelE7fKgcFc5E2E8ZFvwL0At4N/9yeBH4VWjAHE7Vifhs2+RA40sz2d/ePY9Z9Jma384FK4N9DF/0XgPOAE1upzF2IJlRlA/uFiVafhAuTx4C5ZjaSaAz+F8D/NTL5LdZVQC93r9zLmc+3AP8wszuASe6+1sz6EP1jT3bcfwA9zewId0/UpQ/wPHCbmV0G1N+/PAyocPd3zez/gFvM7BtErcwJQMlelLtJ7r7ezF4FJoXJhBVEvRdHuvvMxrY1szOBzUQXheVEF4l7TJoDuhIF3E1huyuBwXF5HgemEPWC/Cxm/b3AM2b2GlGdHkQ0IXBWkr95U8d6EviBmb1A9B3+SX1CM+vidOClhBXSCWkCnEjqTQ0z0suBXwET6sdqiSasVQKrgDlE/0jvD2mvE7V+N5hZ/X3Z9wGfNbOtZjYlBPmvEI0vbiYaf728/p7sVnAvUVf8RUT/2LcTjekTzuHbREF9I1G358Tm7NTd/+Xub+5tYdx9BdEY7ZHAO2a2jWhW+jrg50m2+Zjo1rZLG9nvNqIJVheGfW0A/puopwGieQXdwvoH2buegb1xOVHX9TKi4YG/ETOs0IjDQt5yoolhM0kwsdDdlxHNGp9LdGF4HFH9xeapv0A8nJhgGf5eVwN3hbIVs3suwx6acax7iSZuLiaa4PgiUQ9I/UVIU3VxEU3PJeg0snbuTNSTJyLScYRbrGYTTTZMNstb0sjMvgzc4+59m5H3POAydz8/9SXLDArmIiLS5szsQOBMotZ5HtEth/Pc/bp0litTqZtdRETSIYto0t4Wom72d4Gb01qiDKaWuYiISIZTy1za2n5Es4F1J4WISPM1+r9T/1ClrfUlmgU7Ekj2hisREWnoSKJJnAVEj8ltQMFc2lr9rSWz01oKEZHMlI+CubQD6wG2bKmkrk7zNQB69+5GaWlFuovRbqg+dlNdNJTJ9dH9W1cCsG1yyx5RkJ2dRa9eXSH8D42nYC5trRagrm6ngnkM1UVDqo/dVBcNZWx9bIgeyd8K5U/0ZD9NgBMREcl0CuYiIiIZTsFcREQkwymYi4iIZDgFcxERkQynYC4iIpLhFMxFREQynIK5iIhIhlMwFxERyXB6ApykRe/e3dJdhHblkEO6p7sIrW5HdQ3byrfvWn733aW89dZ8KisrOPzwIznzzLPp1m3P816+fDkvvfQylZUV5OXlU1g4mh49ctuy6CIZR8Fc0uKqW19l45btTWeUjDV10li2hc8bN27gjTdeJT//CIYMOZ6iotnMmjWdMWPGNdimqqqSv/3tb/Tq1ZshQ4Yzf/4cpk9/mfHjL2jz8otkEnWzi0jKrVsXve322GOHMGTIcA455FBKSlaxY0fDC7qVK5dTW1vL8OEjGDLkePr3L2D9+r
WUlW1NQ6lFMoeCuYik3IEHHgTA+vVr2bLlI8rKtgBQXl7eIF95eRkAXbtGwzD13fD160UkMXWzi0jKFRQczT//uXj
"text/plain": [
"<Figure size 432x360 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"low_mpg = merged.sort_values('mpg', ascending=False).tail(10)\n",
"\n",
"fig, ax = plt.subplots(figsize=(6, 5))\n",
"ax.barh(low_mpg.car_name, low_mpg.mpg)\n",
"# Label each bar with its MPG value\n",
"for bar in ax.patches:\n",
"    ax.text(bar.get_width() + 0.2, bar.get_y() + 0.5,\n",
"            str(round(bar.get_width(), 2)),\n",
"            fontsize=10, fontweight='bold',\n",
"            color='grey')\n",
"ax.set_title('Bottom 10 MPG (red line is average)')\n",
"ax.axvline(merged.mpg.mean(), color='red')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "3d1c5be8-b63f-496e-a2cd-475a47e7a542",
"metadata": {},
"source": [
"In more detail:"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "1497a48c-42a3-447e-b1fb-e3a5b78902da",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:52.292148Z",
"iopub.status.busy": "2022-08-01T14:48:52.291798Z",
"iopub.status.idle": "2022-08-01T14:48:52.315086Z",
"shell.execute_reply": "2022-08-01T14:48:52.314262Z",
"shell.execute_reply.started": "2022-08-01T14:48:52.292121Z"
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>mpg</th>\n",
" <th>cylinders</th>\n",
" <th>displacement</th>\n",
" <th>horsepower</th>\n",
" <th>weight</th>\n",
" <th>acceleration</th>\n",
" <th>model_year</th>\n",
" <th>origin</th>\n",
" <th>car_name</th>\n",
" <th>efficiency</th>\n",
" <th>load</th>\n",
" <th>bore_size</th>\n",
" <th>grunt</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>28</th>\n",
" <td>9.0</td>\n",
" <td>8</td>\n",
" <td>304.0</td>\n",
" <td>193.0</td>\n",
" <td>4732.0</td>\n",
" <td>18.5</td>\n",
" <td>70</td>\n",
" <td>1</td>\n",
" <td>hi 1200d</td>\n",
" <td>0.634868</td>\n",
" <td>0.064243</td>\n",
" <td>38.000</td>\n",
" <td>3.224463</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26</th>\n",
" <td>10.0</td>\n",
" <td>8</td>\n",
" <td>307.0</td>\n",
" <td>200.0</td>\n",
" <td>4376.0</td>\n",
" <td>15.0</td>\n",
" <td>70</td>\n",
" <td>1</td>\n",
" <td>chevy c20</td>\n",
" <td>0.651466</td>\n",
" <td>0.070155</td>\n",
" <td>38.375</td>\n",
" <td>3.395261</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25</th>\n",
" <td>10.0</td>\n",
" <td>8</td>\n",
" <td>360.0</td>\n",
" <td>215.0</td>\n",
" <td>4615.0</td>\n",
" <td>14.0</td>\n",
" <td>70</td>\n",
" <td>1</td>\n",
" <td>ford f250</td>\n",
" <td>0.597222</td>\n",
" <td>0.078007</td>\n",
" <td>45.000</td>\n",
" <td>2.853395</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27</th>\n",
" <td>11.0</td>\n",
" <td>8</td>\n",
" <td>318.0</td>\n",
" <td>210.0</td>\n",
" <td>4382.0</td>\n",
" <td>13.5</td>\n",
" <td>70</td>\n",
" <td>1</td>\n",
" <td>dodge d200</td>\n",
" <td>0.660377</td>\n",
" <td>0.072570</td>\n",
" <td>39.750</td>\n",
" <td>3.488786</td>\n",
" </tr>\n",
" <tr>\n",
" <th>103</th>\n",
" <td>11.0</td>\n",
" <td>8</td>\n",
" <td>400.0</td>\n",
" <td>150.0</td>\n",
" <td>4997.0</td>\n",
" <td>14.0</td>\n",
" <td>73</td>\n",
" <td>1</td>\n",
" <td>chevrolet impala</td>\n",
" <td>0.375000</td>\n",
" <td>0.080048</td>\n",
" <td>50.000</td>\n",
" <td>1.125000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>67</th>\n",
" <td>11.0</td>\n",
" <td>8</td>\n",
" <td>429.0</td>\n",
" <td>208.0</td>\n",
" <td>4633.0</td>\n",
" <td>11.0</td>\n",
" <td>72</td>\n",
" <td>1</td>\n",
" <td>mercury marquis</td>\n",
" <td>0.484848</td>\n",
" <td>0.092597</td>\n",
" <td>53.625</td>\n",
" <td>1.880624</td>\n",
" </tr>\n",
" <tr>\n",
" <th>124</th>\n",
" <td>11.0</td>\n",
" <td>8</td>\n",
" <td>350.0</td>\n",
" <td>180.0</td>\n",
" <td>3664.0</td>\n",
" <td>11.0</td>\n",
" <td>73</td>\n",
" <td>1</td>\n",
" <td>oldsmobile omega</td>\n",
" <td>0.514286</td>\n",
" <td>0.095524</td>\n",
" <td>43.750</td>\n",
" <td>2.115918</td>\n",
" </tr>\n",
" <tr>\n",
" <th>42</th>\n",
" <td>12.0</td>\n",
" <td>8</td>\n",
" <td>383.0</td>\n",
" <td>180.0</td>\n",
" <td>4955.0</td>\n",
" <td>11.5</td>\n",
" <td>71</td>\n",
" <td>1</td>\n",
" <td>dodge monaco (sw)</td>\n",
" <td>0.469974</td>\n",
" <td>0.077296</td>\n",
" <td>47.875</td>\n",
" <td>1.767004</td>\n",
" </tr>\n",
" <tr>\n",
" <th>95</th>\n",
" <td>12.0</td>\n",
" <td>8</td>\n",
" <td>455.0</td>\n",
" <td>225.0</td>\n",
" <td>4951.0</td>\n",
" <td>11.0</td>\n",
" <td>73</td>\n",
" <td>1</td>\n",
" <td>buick electra 225 custom</td>\n",
" <td>0.494505</td>\n",
" <td>0.091901</td>\n",
" <td>56.875</td>\n",
" <td>1.956285</td>\n",
" </tr>\n",
" <tr>\n",
" <th>90</th>\n",
" <td>12.0</td>\n",
" <td>8</td>\n",
" <td>429.0</td>\n",
" <td>198.0</td>\n",
" <td>4952.0</td>\n",
" <td>11.5</td>\n",
" <td>73</td>\n",
" <td>1</td>\n",
" <td>mercury marquis brougham</td>\n",
" <td>0.461538</td>\n",
" <td>0.086632</td>\n",
" <td>53.625</td>\n",
" <td>1.704142</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" mpg cylinders displacement horsepower weight acceleration \\\n",
"28 9.0 8 304.0 193.0 4732.0 18.5 \n",
"26 10.0 8 307.0 200.0 4376.0 15.0 \n",
"25 10.0 8 360.0 215.0 4615.0 14.0 \n",
"27 11.0 8 318.0 210.0 4382.0 13.5 \n",
"103 11.0 8 400.0 150.0 4997.0 14.0 \n",
"67 11.0 8 429.0 208.0 4633.0 11.0 \n",
"124 11.0 8 350.0 180.0 3664.0 11.0 \n",
"42 12.0 8 383.0 180.0 4955.0 11.5 \n",
"95 12.0 8 455.0 225.0 4951.0 11.0 \n",
"90 12.0 8 429.0 198.0 4952.0 11.5 \n",
"\n",
" model_year origin car_name efficiency load \\\n",
"28 70 1 hi 1200d 0.634868 0.064243 \n",
"26 70 1 chevy c20 0.651466 0.070155 \n",
"25 70 1 ford f250 0.597222 0.078007 \n",
"27 70 1 dodge d200 0.660377 0.072570 \n",
"103 73 1 chevrolet impala 0.375000 0.080048 \n",
"67 72 1 mercury marquis 0.484848 0.092597 \n",
"124 73 1 oldsmobile omega 0.514286 0.095524 \n",
"42 71 1 dodge monaco (sw) 0.469974 0.077296 \n",
"95 73 1 buick electra 225 custom 0.494505 0.091901 \n",
"90 73 1 mercury marquis brougham 0.461538 0.086632 \n",
"\n",
" bore_size grunt \n",
"28 38.000 3.224463 \n",
"26 38.375 3.395261 \n",
"25 45.000 2.853395 \n",
"27 39.750 3.488786 \n",
"103 50.000 1.125000 \n",
"67 53.625 1.880624 \n",
"124 43.750 2.115918 \n",
"42 47.875 1.767004 \n",
"95 56.875 1.956285 \n",
"90 53.625 1.704142 "
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"merged.sort_values('mpg').head(10)"
]
},
{
"cell_type": "markdown",
"id": "15d0d27b-5f92-4648-ad5c-35cc811430b3",
"metadata": {},
"source": [
"## Some stats"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "8710cba8-6b7e-4219-98b9-b7d5a1b4f4b9",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:52.316598Z",
"iopub.status.busy": "2022-08-01T14:48:52.316192Z",
"iopub.status.idle": "2022-08-01T14:48:52.324768Z",
"shell.execute_reply": "2022-08-01T14:48:52.324017Z",
"shell.execute_reply.started": "2022-08-01T14:48:52.316570Z"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Mean MPG: 23.51\n",
"Mean Weight: 2975.41\n",
"Mean Horsepower: 104.12\n",
"efficiency mean: 0.61\n",
"load mean: 0.06\n",
"bore_size mean: 33.36\n",
"grunt mean: 1.92\n"
]
}
],
"source": [
"print(f'''Mean MPG: {y.mean():.2f}\n",
"Mean Weight: {merged.weight.mean():.2f}\n",
"Mean Horsepower: {merged.horsepower.mean():.2f}''')\n",
"\n",
"# Engineered features, named explicitly rather than sliced by column position\n",
"for col in ['efficiency', 'load', 'bore_size', 'grunt']:\n",
"    print(f'{col} mean: {merged[col].mean():.2f}')"
]
},
{
"cell_type": "markdown",
"id": "0213061d-29c8-4f47-9128-705253bc6320",
"metadata": {},
"source": [
"Check the correlation of each feature with mpg"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "7205bdab-a7df-41b4-9ec0-c1c9e2fe1c03",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:52.326254Z",
"iopub.status.busy": "2022-08-01T14:48:52.325834Z",
"iopub.status.idle": "2022-08-01T14:48:52.339352Z",
"shell.execute_reply": "2022-08-01T14:48:52.338521Z",
"shell.execute_reply.started": "2022-08-01T14:48:52.326227Z"
},
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"weight -0.832707\n",
"displacement -0.804456\n",
"horsepower -0.777897\n",
"cylinders -0.776090\n",
"bore_size -0.773403\n",
"load -0.724271\n",
"grunt 0.180568\n",
"acceleration 0.420414\n",
"efficiency 0.509309\n",
"origin 0.563833\n",
"model_year 0.580091\n",
"mpg 1.000000\n",
"dtype: float64"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Keep only numeric columns (drops car_name) so corrwith also works on newer pandas\n",
"merged.select_dtypes('number').corrwith(y).sort_values()"
]
},
{
"cell_type": "markdown",
"id": "d8889b56-a87c-4901-b654-aaf5a4b9fb14",
"metadata": {},
"source": [
"<hr>\n",
"By correlation alone, the features to use are weight, displacement, horsepower, and cylinders: the four with the strongest correlation to mpg.\n",
"\n",
"I agree that these are the most important features, but there's more to it than these numbers alone, just as a stew is more than the sum of its ingredients."
]
},
{
"cell_type": "markdown",
"id": "27e89d6b-7603-403c-8235-e9bad49040b3",
"metadata": {},
"source": [
"I'll test both feature sets: the engineered features and the raw ones."
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "52d0ffbf-55aa-49b9-b99f-8160bf09cc79",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:52.340843Z",
"iopub.status.busy": "2022-08-01T14:48:52.340494Z",
"iopub.status.idle": "2022-08-01T14:48:52.347303Z",
"shell.execute_reply": "2022-08-01T14:48:52.346318Z",
"shell.execute_reply.started": "2022-08-01T14:48:52.340816Z"
},
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"Index(['mpg', 'cylinders', 'displacement', 'horsepower', 'weight',\n",
" 'acceleration', 'model_year', 'origin', 'car_name', 'efficiency',\n",
" 'load', 'bore_size', 'grunt'],\n",
" dtype='object')"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"merged.columns"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "6a4a9e48-57a1-48b6-b289-58bc43584112",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T14:48:52.352022Z",
"iopub.status.busy": "2022-08-01T14:48:52.351247Z",
"iopub.status.idle": "2022-08-01T14:48:52.367555Z",
"shell.execute_reply": "2022-08-01T14:48:52.366775Z",
"shell.execute_reply.started": "2022-08-01T14:48:52.351982Z"
},
"tags": []
},
"outputs": [],
"source": [
"# Target\n",
"y.to_csv('data/y.csv', index=False)\n",
"\n",
"# Engineered feature set\n",
"merged[[\n",
"    'horsepower',\n",
"    'bore_size',\n",
"    'grunt',\n",
"    'load',\n",
"]].to_csv('data/X_engineered.csv', index=False)\n",
"\n",
"# Raw feature set: the four features most strongly correlated with mpg\n",
"merged[[\n",
"    'horsepower',\n",
"    'weight',\n",
"    'displacement',\n",
"    'cylinders',\n",
"]].to_csv('data/X_straight.csv', index=False)"
]
},
{
"cell_type": "markdown",
"id": "4802d1fd-079c-4053-88f2-b5dca7cf8dae",
"metadata": {},
"source": [
"[Modeling](model.ipynb)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}