2022-07-21 16:31:53 -04:00
{
"cells": [
{
"cell_type": "markdown",
"id": "9151a000-1923-408b-bd86-16008dc95f97",
"metadata": {},
"source": [
"[readme](readme.md)"
]
},
{
"cell_type": "markdown",
"id": "cecbac86-abb3-4f6b-a101-2d9324d96274",
"metadata": {},
"source": [
2022-08-01 09:32:07 -04:00
"# Cleaning\n",
"\n",
"Let's get this to something we can work with"
2022-07-21 16:31:53 -04:00
]
},
{
"cell_type": "markdown",
"id": "b67cb510-2df0-4ce4-a033-473710fdc749",
"metadata": {},
"source": [
"Load file and set column names"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "3c4bfade-d06d-4887-9eb4-ec7f5bc61625",
"metadata": {
"execution": {
2022-08-01 09:32:07 -04:00
"iopub.execute_input": "2022-08-01T00:18:59.316785Z",
"iopub.status.busy": "2022-08-01T00:18:59.315438Z",
"iopub.status.idle": "2022-08-01T00:19:00.307894Z",
"shell.execute_reply": "2022-08-01T00:19:00.307130Z",
"shell.execute_reply.started": "2022-08-01T00:18:59.316695Z"
2022-07-21 16:31:53 -04:00
},
"tags": []
},
"outputs": [],
"source": [
"import pandas as pd\n",
2022-08-01 09:32:07 -04:00
"import seaborn as sns\n",
"import matplotlib.pyplot as plt\n",
2022-07-21 16:31:53 -04:00
"\n",
"df = pd.read_csv('data/auto-mpg.data',header=None,delim_whitespace=True)\n",
"df.columns = ['mpg','cylinders','displacement','horsepower','weight',\n",
" 'acceleration','model_year','origin','car_name']"
]
},
{
"cell_type": "markdown",
"id": "fdcec7e3-c65e-4d66-9a10-b500fb940234",
"metadata": {},
"source": [
"Attribute Information:\n",
"\n",
" 1. mpg: continuous\n",
" 2. cylinders: multi-valued discrete\n",
" 3. displacement: continuous\n",
" 4. horsepower: continuous\n",
" 5. weight: continuous\n",
" 6. acceleration: continuous\n",
" 7. model year: multi-valued discrete\n",
" 8. origin: multi-valued discrete\n",
" 9. car name: string (unique for each instance)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "62bbb6bd-b5b3-4d54-a132-23cd367c4570",
"metadata": {
"execution": {
2022-08-01 09:32:07 -04:00
"iopub.execute_input": "2022-08-01T00:19:00.311568Z",
"iopub.status.busy": "2022-08-01T00:19:00.310921Z",
"iopub.status.idle": "2022-08-01T00:19:00.322308Z",
"shell.execute_reply": "2022-08-01T00:19:00.321851Z",
"shell.execute_reply.started": "2022-08-01T00:19:00.311524Z"
2022-07-21 16:31:53 -04:00
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 398 entries, 0 to 397\n",
"Data columns (total 9 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 mpg 398 non-null float64\n",
" 1 cylinders 398 non-null int64 \n",
" 2 displacement 398 non-null float64\n",
" 3 horsepower 398 non-null object \n",
" 4 weight 398 non-null float64\n",
" 5 acceleration 398 non-null float64\n",
" 6 model_year 398 non-null int64 \n",
" 7 origin 398 non-null int64 \n",
" 8 car_name 398 non-null object \n",
"dtypes: float64(4), int64(3), object(2)\n",
"memory usage: 28.1+ KB\n"
]
}
],
"source": [
"df.info()"
]
},
{
"cell_type": "markdown",
"id": "6a4028ed-eda3-4c50-aed0-d9503d41a8e1",
"metadata": {},
"source": [
2022-08-01 09:32:07 -04:00
"No nulls, but why is horsepower not a number?"
2022-07-21 16:31:53 -04:00
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "58fa2876-4ccb-4ef5-bc16-d25b74efb457",
"metadata": {
"execution": {
2022-08-01 09:32:07 -04:00
"iopub.execute_input": "2022-08-01T00:19:00.323107Z",
"iopub.status.busy": "2022-08-01T00:19:00.322921Z",
"iopub.status.idle": "2022-08-01T00:19:00.333299Z",
"shell.execute_reply": "2022-08-01T00:19:00.332860Z",
"shell.execute_reply.started": "2022-08-01T00:19:00.323092Z"
2022-07-21 16:31:53 -04:00
},
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"array(['130.0', '165.0', '150.0', '140.0', '198.0', '220.0', '215.0',\n",
" '225.0', '190.0', '170.0', '160.0', '95.00', '97.00', '85.00',\n",
" '88.00', '46.00', '87.00', '90.00', '113.0', '200.0', '210.0',\n",
" '193.0', '?', '100.0', '105.0', '175.0', '153.0', '180.0', '110.0',\n",
" '72.00', '86.00', '70.00', '76.00', '65.00', '69.00', '60.00',\n",
" '80.00', '54.00', '208.0', '155.0', '112.0', '92.00', '145.0',\n",
" '137.0', '158.0', '167.0', '94.00', '107.0', '230.0', '49.00',\n",
" '75.00', '91.00', '122.0', '67.00', '83.00', '78.00', '52.00',\n",
" '61.00', '93.00', '148.0', '129.0', '96.00', '71.00', '98.00',\n",
" '115.0', '53.00', '81.00', '79.00', '120.0', '152.0', '102.0',\n",
" '108.0', '68.00', '58.00', '149.0', '89.00', '63.00', '48.00',\n",
" '66.00', '139.0', '103.0', '125.0', '133.0', '138.0', '135.0',\n",
" '142.0', '77.00', '62.00', '132.0', '84.00', '64.00', '74.00',\n",
" '116.0', '82.00'], dtype=object)"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.horsepower.unique()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "2d99ea58-ca51-4461-a127-c6b389b056a1",
"metadata": {
"execution": {
2022-08-01 09:32:07 -04:00
"iopub.execute_input": "2022-08-01T00:19:00.334093Z",
"iopub.status.busy": "2022-08-01T00:19:00.333926Z",
"iopub.status.idle": "2022-08-01T00:19:00.347963Z",
"shell.execute_reply": "2022-08-01T00:19:00.347305Z",
"shell.execute_reply.started": "2022-08-01T00:19:00.334077Z"
2022-07-21 16:31:53 -04:00
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>mpg</th>\n",
" <th>cylinders</th>\n",
" <th>displacement</th>\n",
" <th>horsepower</th>\n",
" <th>weight</th>\n",
" <th>acceleration</th>\n",
" <th>model_year</th>\n",
" <th>origin</th>\n",
" <th>car_name</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>32</th>\n",
" <td>25.0</td>\n",
" <td>4</td>\n",
" <td>98.0</td>\n",
" <td>?</td>\n",
" <td>2046.0</td>\n",
" <td>19.0</td>\n",
" <td>71</td>\n",
" <td>1</td>\n",
" <td>ford pinto</td>\n",
" </tr>\n",
" <tr>\n",
" <th>126</th>\n",
" <td>21.0</td>\n",
" <td>6</td>\n",
" <td>200.0</td>\n",
" <td>?</td>\n",
" <td>2875.0</td>\n",
" <td>17.0</td>\n",
" <td>74</td>\n",
" <td>1</td>\n",
" <td>ford maverick</td>\n",
" </tr>\n",
" <tr>\n",
" <th>330</th>\n",
" <td>40.9</td>\n",
" <td>4</td>\n",
" <td>85.0</td>\n",
" <td>?</td>\n",
" <td>1835.0</td>\n",
" <td>17.3</td>\n",
" <td>80</td>\n",
" <td>2</td>\n",
" <td>renault lecar deluxe</td>\n",
" </tr>\n",
" <tr>\n",
" <th>336</th>\n",
" <td>23.6</td>\n",
" <td>4</td>\n",
" <td>140.0</td>\n",
" <td>?</td>\n",
" <td>2905.0</td>\n",
" <td>14.3</td>\n",
" <td>80</td>\n",
" <td>1</td>\n",
" <td>ford mustang cobra</td>\n",
" </tr>\n",
" <tr>\n",
" <th>354</th>\n",
" <td>34.5</td>\n",
" <td>4</td>\n",
" <td>100.0</td>\n",
" <td>?</td>\n",
" <td>2320.0</td>\n",
" <td>15.8</td>\n",
" <td>81</td>\n",
" <td>2</td>\n",
" <td>renault 18i</td>\n",
" </tr>\n",
" <tr>\n",
" <th>374</th>\n",
" <td>23.0</td>\n",
" <td>4</td>\n",
" <td>151.0</td>\n",
" <td>?</td>\n",
" <td>3035.0</td>\n",
" <td>20.5</td>\n",
" <td>82</td>\n",
" <td>1</td>\n",
" <td>amc concord dl</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" mpg cylinders displacement horsepower weight acceleration \\\n",
"32 25.0 4 98.0 ? 2046.0 19.0 \n",
"126 21.0 6 200.0 ? 2875.0 17.0 \n",
"330 40.9 4 85.0 ? 1835.0 17.3 \n",
"336 23.6 4 140.0 ? 2905.0 14.3 \n",
"354 34.5 4 100.0 ? 2320.0 15.8 \n",
"374 23.0 4 151.0 ? 3035.0 20.5 \n",
"\n",
" model_year origin car_name \n",
"32 71 1 ford pinto \n",
"126 74 1 ford maverick \n",
"330 80 2 renault lecar deluxe \n",
"336 80 1 ford mustang cobra \n",
"354 81 2 renault 18i \n",
"374 82 1 amc concord dl "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[df.horsepower == '?']"
]
},
{
"cell_type": "markdown",
"id": "498d069d-b95e-43d6-bd3d-4b707fdd9635",
"metadata": {},
"source": [
2022-08-01 09:32:07 -04:00
"I'll fill in what I can with what I can find online"
2022-07-21 16:31:53 -04:00
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "e53a2eaf-a8f9-4d7e-bf8b-07a125cf6f06",
"metadata": {
"execution": {
2022-08-01 09:32:07 -04:00
"iopub.execute_input": "2022-08-01T00:19:00.348907Z",
"iopub.status.busy": "2022-08-01T00:19:00.348680Z",
"iopub.status.idle": "2022-08-01T00:19:00.352582Z",
"shell.execute_reply": "2022-08-01T00:19:00.351931Z",
"shell.execute_reply.started": "2022-08-01T00:19:00.348891Z"
2022-07-21 16:31:53 -04:00
},
"tags": []
},
"outputs": [],
"source": [
"# 1971 pinto kent I4\n",
"df.at[32,'horsepower'] = '75.0'\n",
"# 1974 maverick 200 I6\n",
"df.at[126,'horsepower'] = '85.0'\n",
"# 1980 renault lecar deluxe 85ci I4\n",
"df.at[330,'horsepower'] = '53.5'\n",
"# 1980 ford mustang cobra\n",
"# they seem confused between 2 different models\n",
"# 1981 renault 18i\n",
"df.at[354,'horsepower'] = '81.5'\n",
"#1982 AMC concord dl 151\n",
"df.at[374,'horsepower'] = '90'"
]
},
{
"cell_type": "markdown",
"id": "68d959c5-9628-437f-8f3f-0b4c7002b1f0",
"metadata": {},
"source": [
"We'll ignore the mustang because it's too far off from realistic, it looks like they got confused between two different models.\n",
"\n",
"Anyway, drop all '?' horsepower"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "10400330-e6aa-43e0-910f-f97869c23d0f",
"metadata": {
"execution": {
2022-08-01 09:32:07 -04:00
"iopub.execute_input": "2022-08-01T00:19:00.353597Z",
"iopub.status.busy": "2022-08-01T00:19:00.353430Z",
"iopub.status.idle": "2022-08-01T00:19:00.360958Z",
"shell.execute_reply": "2022-08-01T00:19:00.359990Z",
"shell.execute_reply.started": "2022-08-01T00:19:00.353582Z"
2022-07-21 16:31:53 -04:00
},
"tags": []
},
"outputs": [],
"source": [
"df.drop(df[df.horsepower == '?'].index,inplace=True)\n",
"df['horsepower'] = df.horsepower.astype(float)\n",
"df.reset_index(inplace=True,drop=True)"
]
},
{
"cell_type": "markdown",
"id": "b2afc76d-c428-4b81-9882-5ea19ecd04bb",
"metadata": {},
"source": [
"And set to floats, like the rest"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "e0fd9a7b-6cdf-4346-8c8d-6c5f36e167f6",
"metadata": {
"execution": {
2022-08-01 09:32:07 -04:00
"iopub.execute_input": "2022-08-01T00:19:00.365129Z",
"iopub.status.busy": "2022-08-01T00:19:00.364725Z",
"iopub.status.idle": "2022-08-01T00:19:00.373554Z",
"shell.execute_reply": "2022-08-01T00:19:00.372817Z",
"shell.execute_reply.started": "2022-08-01T00:19:00.365100Z"
2022-07-21 16:31:53 -04:00
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 397 entries, 0 to 396\n",
"Data columns (total 9 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 mpg 397 non-null float64\n",
" 1 cylinders 397 non-null int64 \n",
" 2 displacement 397 non-null float64\n",
" 3 horsepower 397 non-null float64\n",
" 4 weight 397 non-null float64\n",
" 5 acceleration 397 non-null float64\n",
" 6 model_year 397 non-null int64 \n",
" 7 origin 397 non-null int64 \n",
" 8 car_name 397 non-null object \n",
"dtypes: float64(5), int64(3), object(1)\n",
"memory usage: 28.0+ KB\n"
]
}
],
"source": [
"df.info()"
]
},
{
"cell_type": "markdown",
"id": "097c4cec-eb77-46f6-8eef-89eb7c47b425",
"metadata": {},
"source": [
"Looks good"
]
},
{
"cell_type": "markdown",
"id": "151e5f1b-6409-4972-9c79-a26d132eedf5",
"metadata": {},
"source": [
"### Min/Max to check range"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "769f33e7-2f2e-46e8-b6dd-8f8fb79d13b7",
"metadata": {
"execution": {
2022-08-01 09:32:07 -04:00
"iopub.execute_input": "2022-08-01T00:19:00.374606Z",
"iopub.status.busy": "2022-08-01T00:19:00.374357Z",
"iopub.status.idle": "2022-08-01T00:19:00.379646Z",
"shell.execute_reply": "2022-08-01T00:19:00.379055Z",
"shell.execute_reply.started": "2022-08-01T00:19:00.374591Z"
2022-07-21 16:31:53 -04:00
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"mpg\n",
"Min: 9.0 \n",
"Max: 46.6\n",
"\n",
"cylinders\n",
"Min: 3 \n",
"Max: 8\n",
"\n",
"displacement\n",
"Min: 68.0 \n",
"Max: 455.0\n",
"\n",
"horsepower\n",
"Min: 46.0 \n",
"Max: 230.0\n",
"\n",
"weight\n",
"Min: 1613.0 \n",
"Max: 5140.0\n",
"\n",
"acceleration\n",
"Min: 8.0 \n",
"Max: 24.8\n",
"\n",
"model_year\n",
"Min: 70 \n",
"Max: 82\n",
"\n",
"origin\n",
"Min: 1 \n",
"Max: 3\n",
"\n"
]
}
],
"source": [
"for col in df.columns[:-1]:\n",
" print(f'''{col}\n",
"Min: {df[col].min()} \n",
"Max: {df[col].max()}\n",
"''')"
]
},
{
"cell_type": "markdown",
"id": "59641984-e266-4eaa-a90d-af266cb95936",
"metadata": {},
"source": [
"All of this makes sense"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "7bac1a71-53d2-4081-b566-244bccd3a3c6",
"metadata": {
"execution": {
2022-08-01 09:32:07 -04:00
"iopub.execute_input": "2022-08-01T00:19:00.380599Z",
"iopub.status.busy": "2022-08-01T00:19:00.380414Z",
"iopub.status.idle": "2022-08-01T00:19:00.387779Z",
"shell.execute_reply": "2022-08-01T00:19:00.387145Z",
"shell.execute_reply.started": "2022-08-01T00:19:00.380583Z"
2022-07-21 16:31:53 -04:00
},
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"array(['datsun pl510', 'amc gremlin', 'chevrolet chevelle malibu',\n",
" 'chevrolet impala', 'ford galaxie 500', 'plymouth fury iii',\n",
" 'pontiac catalina', 'amc matador', 'amc hornet', 'ford maverick',\n",
" 'plymouth duster', 'chevrolet vega', 'ford pinto',\n",
" 'toyota corolla 1200', 'ford gran torino', 'ford gran torino (sw)',\n",
" 'amc matador (sw)', 'opel manta', 'toyota corona', 'fiat 128',\n",
" 'chevrolet nova', 'ford ltd', 'volkswagen dasher', 'datsun 710',\n",
" 'audi 100ls', 'peugeot 504', 'saab 99le', 'opel 1900',\n",
" 'dodge colt', 'chevrolet chevelle malibu classic',\n",
" 'plymouth valiant', 'honda civic', 'volkswagen rabbit',\n",
" 'toyota corolla', 'toyota mark ii', 'chevrolet caprice classic',\n",
" 'chevrolet chevette', 'honda civic cvcc', 'chevrolet malibu',\n",
" 'chevrolet monte carlo landau', 'buick estate wagon (sw)',\n",
" 'ford country squire (sw)', 'oldsmobile cutlass salon brougham',\n",
" 'vw rabbit', 'chevrolet citation', 'amc concord', 'dodge aspen',\n",
" 'datsun 210', 'subaru dl', 'buick skylark', 'plymouth reliant',\n",
" 'subaru', 'mazda 626', 'buick century', 'pontiac phoenix',\n",
" 'honda accord'], dtype=object)"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[df.car_name.duplicated()].car_name.unique()"
]
},
{
"cell_type": "markdown",
"id": "81b8d5a5-d323-4a70-b951-b2fe4fb1e35f",
"metadata": {},
"source": [
"There are some duplicate car names, honestly I wish there were more. If I had a bunch of data with lots of duplicate car names it'd actually be easier to predict MPG I imagine, I'll say more on this later but there are some big factors that aren't represented here."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "87715776-3634-4ca7-bbb4-e04633fe4791",
"metadata": {
"execution": {
2022-08-01 09:32:07 -04:00
"iopub.execute_input": "2022-08-01T00:19:00.389256Z",
"iopub.status.busy": "2022-08-01T00:19:00.388803Z",
"iopub.status.idle": "2022-08-01T00:19:00.417680Z",
"shell.execute_reply": "2022-08-01T00:19:00.417117Z",
"shell.execute_reply.started": "2022-08-01T00:19:00.389227Z"
2022-07-21 16:31:53 -04:00
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>mpg</th>\n",
" <th>cylinders</th>\n",
" <th>displacement</th>\n",
" <th>horsepower</th>\n",
" <th>weight</th>\n",
" <th>acceleration</th>\n",
" <th>model_year</th>\n",
" <th>origin</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>397.000000</td>\n",
" <td>397.000000</td>\n",
" <td>397.000000</td>\n",
" <td>397.000000</td>\n",
" <td>397.000000</td>\n",
" <td>397.000000</td>\n",
" <td>397.000000</td>\n",
" <td>397.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>23.514358</td>\n",
" <td>5.458438</td>\n",
" <td>193.560453</td>\n",
" <td>104.123426</td>\n",
" <td>2970.589421</td>\n",
" <td>15.571285</td>\n",
" <td>76.000000</td>\n",
" <td>1.574307</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>7.825846</td>\n",
" <td>1.701577</td>\n",
" <td>104.366796</td>\n",
" <td>38.396800</td>\n",
" <td>847.903955</td>\n",
" <td>2.760431</td>\n",
" <td>3.696846</td>\n",
" <td>0.802549</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>9.000000</td>\n",
" <td>3.000000</td>\n",
" <td>68.000000</td>\n",
" <td>46.000000</td>\n",
" <td>1613.000000</td>\n",
" <td>8.000000</td>\n",
" <td>70.000000</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>17.500000</td>\n",
" <td>4.000000</td>\n",
" <td>104.000000</td>\n",
" <td>75.000000</td>\n",
" <td>2223.000000</td>\n",
" <td>13.800000</td>\n",
" <td>73.000000</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>23.000000</td>\n",
" <td>4.000000</td>\n",
" <td>151.000000</td>\n",
" <td>92.000000</td>\n",
" <td>2800.000000</td>\n",
" <td>15.500000</td>\n",
" <td>76.000000</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>29.000000</td>\n",
" <td>8.000000</td>\n",
" <td>262.000000</td>\n",
" <td>125.000000</td>\n",
" <td>3609.000000</td>\n",
" <td>17.200000</td>\n",
" <td>79.000000</td>\n",
" <td>2.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>46.600000</td>\n",
" <td>8.000000</td>\n",
" <td>455.000000</td>\n",
" <td>230.000000</td>\n",
" <td>5140.000000</td>\n",
" <td>24.800000</td>\n",
" <td>82.000000</td>\n",
" <td>3.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" mpg cylinders displacement horsepower weight \\\n",
"count 397.000000 397.000000 397.000000 397.000000 397.000000 \n",
"mean 23.514358 5.458438 193.560453 104.123426 2970.589421 \n",
"std 7.825846 1.701577 104.366796 38.396800 847.903955 \n",
"min 9.000000 3.000000 68.000000 46.000000 1613.000000 \n",
"25% 17.500000 4.000000 104.000000 75.000000 2223.000000 \n",
"50% 23.000000 4.000000 151.000000 92.000000 2800.000000 \n",
"75% 29.000000 8.000000 262.000000 125.000000 3609.000000 \n",
"max 46.600000 8.000000 455.000000 230.000000 5140.000000 \n",
"\n",
" acceleration model_year origin \n",
"count 397.000000 397.000000 397.000000 \n",
"mean 15.571285 76.000000 1.574307 \n",
"std 2.760431 3.696846 0.802549 \n",
"min 8.000000 70.000000 1.000000 \n",
"25% 13.800000 73.000000 1.000000 \n",
"50% 15.500000 76.000000 1.000000 \n",
"75% 17.200000 79.000000 2.000000 \n",
"max 24.800000 82.000000 3.000000 "
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.describe()"
]
},
{
"cell_type": "markdown",
"id": "90fe9344-fe47-4503-be59-74d9d38cf1d3",
"metadata": {},
"source": [
"Everything looks proportional"
]
},
2022-08-01 09:32:07 -04:00
{
"cell_type": "code",
"execution_count": 11,
"id": "3f68b5d6-15c7-4fe0-aa49-a04ab90c4efa",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T00:19:00.418521Z",
"iopub.status.busy": "2022-08-01T00:19:00.418342Z",
"iopub.status.idle": "2022-08-01T00:19:00.645493Z",
"shell.execute_reply": "2022-08-01T00:19:00.644587Z",
"shell.execute_reply.started": "2022-08-01T00:19:00.418505Z"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAWAAAAFwCAYAAACGt6HXAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAAsTAAALEwEAmpwYAAAXzElEQVR4nO3dfbRddX3n8fcnCYgVVCiBhpCIDyxHllOwKzoanPqAzsSHCl2jRKs2dWiDy+poZergQ6c6bWcY2zp22qklRSSitSDqgOjQYpSqCxcYBQoMVFyIISQmAUUBHTTkO3+cHb0rvffmJJx9fyf3vF9rnXX2429/7865n/zu75y9T6oKSdLcW9C6AEmaVAawJDViAEtSIwawJDViAEtSIwawJDViAOthS/JXSX5vRG0tT3J/koXd/FVJfnMUbXft/Z8ka0bV3j4c9w+T3J3kO3N9bI0vA1izSnJHkh8luS/JvUmuTvL6JD997VTV66vqD4Zs6wWzbVNVm6rq0Kp6aAS1vzvJR/Zo/0VVtf7htr2PdSwDzgJOqKpfmGb9c5NUkk/usfzEbvlVU5ZVkge6/6TuSvK+3f9ZdetfmeSabpvt3fQbkqTHH1H7yQDWMH6lqg4DHgecA/wn4IOjPkiSRaNuc0w8DrinqrbPss0OYGWSn5+ybA3wjWm2PbGqDgVOAX4N+C2AJGcBfwb8MfALwNHA64GTgYMf7g+h0TOANbSq+n5VXQasBtYkeSpAkguS/GE3fWSSy7ve8neTfCnJgiQXAsuBT3e9t7clOa7r0Z2RZBPw+SnLpobxE5Ncm+T7SS5NckR3rOcm2Ty1xt297CSrgHcAq7vj3dCt/+mQRlfXu5J8u+stfjjJY7p1u+tYk2RTN3zwzpnOTZLHdPvv6Np7V9f+C4ArgWO6Oi6YoYkfA/8beGXX3kLgdOCjs/x73Ap8CXhqV/d/Ad5QVZdU1X01cF1VvbqqHpypHbVjAGufVdW1wGbgX0+z+qxu3WIGPbB3DHap1wKbGPSmD62q907Z5znAU4B/O8Mhfx3498AxwE7gfw5R4xXAfwUu6o534jSb/Ub3eB7wBOBQ4C/22ObZwJMZ9Db/c5KnzHDIPwce07XznK7m11XV54AXAVu6On5jlrI/3O0Hg3NxM7Blpo2TnMDg3+A64FnAI4BLZ2lfY8YA1v7aAhwxzfKfAEuAx1XVT6rqS7X3G468u6oeqKofzbD+wqq6qaoeAH4POH3quOfD8GrgfVV1e1XdD7wdeOUeve/3VNWPquoG4AbgnwV5V8tq4O1dz/MO4E+B1+5LMVV1NXBEkiczCOIPz7Dp15N8D/g0cB7wIeBI4O6q2jmlrqu7v0R+lOSX96UWzQ0DWPtrKfDdaZb/MfBN4O+T3J7k7CHaunMf1n8bOIhB4Dxcx3TtTW17EYOe+25TP7XwQwa95D0dyWCMdc+2lu5HTRcCb2TQK//UDNv8UlUdXlVPrKp3VdUu4B7gyKn/eVTVyqp6bLfO3/Ux5D+K9lmSpzMIly/vua7rAZ5VVU8AfgV4a5JTdq+eocm99ZCXTZlezqCXfTfwAPBzU+payGDoY9h2tzB4g2xq2zuBbXvZb093dzXt2dZd+9gODAL4DcBnq+qH+7DfV4AHgVP345hqxADW0JI8OslLgb8FPlJVN06zzUuTPKn72NMPgIe6BwyC7Qn7cejXJDkhyc8xeKPpku5jat8ADknykiQHAe9iMA662zbguKkfmdvDx4DfSfL4JIfyszHjnTNsP62ulouBP0pyWJLHAW8FPjL7ntO29S0GY8gzvuE3w373Au8B/jLJy5Mc2r0JeBLwqH2tQ3PDANYwPp3kPgZDAe8E3ge8boZtjwc+B9zPoFf2l1V1VbfuvwHv6sYl/+M+HP9C4AIGwwGHAP8BBp/KYNBbPI9Bb/MBBm8A7vbx7vmeJF+fpt3zu7a/CHwL+H/Am/ahrqne1B3/dgZ/GfxN1/4+q6ovV9
WMb77Nst97GQT/24DtDP4DOpfBxwav3p9a1K94Q3ZJasMesCQ1YgBLUiMGsCQ1YgBLUiMHxM1PVq1aVVdccUXrMiRpf017N7oDogd89913ty5BkkbugAhgSZqPDGBJasQAlqRGDGBJasQAlqRGDGBJasQAlqRGDGBJasQAlqRGDGBJasQAlqRGDGBJasQAlqRGDOB5bOmy5SQZ6WPpsuWtfyxp3jgg7ges/bNl852sPne0X4Z70ZkrR9qeNMnsAUtSIwawJDViAEtSIwawJDXSawAneWySS5LcmuSWJM9KckSSK5Pc1j0f3mcNkjSu+u4B/xlwRVX9C+BE4BbgbGBDVR0PbOjmJWni9BbASR4N/DLwQYCq+nFV3QucCqzvNlsPnNZXDZI0zvrsAT8B2AF8KMl1Sc5L8ijg6KraCtA9HzXdzknWJtmYZOOOHTt6LFOS2ugzgBcBvwR8oKqeBjzAPgw3VNW6qlpRVSsWL17cV42S1EyfAbwZ2FxV13TzlzAI5G1JlgB0z9t7rEGSxlZvAVxV3wHuTPLkbtEpwP8FLgPWdMvWAJf2VYMkjbO+7wXxJuCjSQ4GbgdexyD0L05yBrAJeEXPNUjSWOo1gKvqemDFNKtO6fO4knQg8Eo4SWrEAJakRgxgSWrEAJakRgxgSWrEAJakRgxgSWrEAJakRgxgSWrEAJakRgxgSWrEAJakRgxgSWrEAJakRgxgSWrEAJakRgxgSWrEAJakRgxgSWrEAJakRgxgSWrEAJakRgxgSWrEAJakRgxgSWrEAJakRgxgSWrEAJakRgzgMbB02XKSjPwhabwtal2AYMvmO1l97tUjb/eiM1eOvE1Jo2MPWJIaMYAlqREDWJIaMYAlqREDWJIaMYAlqREDWJIaMYDVXF8Xoixdtrz1jybNygsx1JwXomhS9RrASe4A7gMeAnZW1YokRwAXAccBdwCnV9X3+qxDksbRXAxBPK+qTqqqFd382cCGqjoe2NDNS9LEaTEGfCqwvpteD5zWoAZJaq7vAC7g75N8LcnabtnRVbUVoHs+arodk6xNsjHJxh07dvRcpiTNvb7fhDu5qrYkOQq4Msmtw+5YVeuAdQArVqyovgqUpFZ67QFX1ZbueTvwKeAZwLYkSwC65+191iBJ46q3AE7yqCSH7Z4G/g1wE3AZsKbbbA1waV81SNI463MI4mjgU903MywC/qaqrkjyVeDiJGcAm4BX9FiDJI2t3gK4qm4HTpxm+T3AKX0dV5IOFF6KLEmNGMCS1IgBLEmNGMCS1IgBLEmNGMCS1IgBLEmNGMCS1IgBLEmNGMCS1IgBLEmNGMCS1IgBLEmNGMD7aOmy5SQZ6UPSZOr7K4nmnS2b72T1uVePtM2Lzlw50vYkHRjsAUtSIwawJDViAEtSIwawJDViAEtSIwawJDViAEtSIwawJDViAEtSIwawJDViAEtSIwawJDViAEtSIwawJDViAEtSIwawJDViAEtSIwawJDViAEtSIwawJDViAEtSIwaw5q8Fi0gy0sfSZctb/1SaR/xaes1fu3ay+tyrR9rkRWeuHGl7mmy994CTLExyXZLLu/kjklyZ5Lbu+fC+a5CkcTQXQxBvBm6ZMn82sKGqjgc2dPOSNHF6DeAkxwIvAc6bsvhUYH03vR44rc8aJGlc9T0G/H7gbcBhU5YdXVVbAapqa5KjptsxyVpgLcDy5b7xMTa6N7YkPXy9BXCSlwLbq+prSZ67r/tX1TpgHcCKFStqtNVpv/nGljQyffaATwZeluTFwCHAo5N8BNiWZEnX+10CbO+xBkkaW72NAVfV26vq2Ko6Dngl8Pmqeg1wGbCm22wNcGlfNUjSOGtxIcY5wAuT3Aa8sJuXpIkzJxdiVNVVwFXd9D3AKXNxXEkaZ16KLEmNGMCS1IgBLEmNGMCS1IgBLEmNGMCS1IgBLEmNGMCS1IgBLEmNGMCS1IgBLEmNGMCS1IgBLEmNGMDSvui+kmmUj6XL/MqtST
Unt6OU5g2/kkkjZA9YkhoxgCWpkaECOMnJwyyTJA1v2B7wnw+5TJI0pFnfhEvyLGAlsDjJW6esejSwsM/CJGm+29u
"text/plain": [
"<Figure size 360x360 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.displot(x=df.mpg)\n",
"plt.title('Distribution of MPG')\n",
"plt.show();"
]
},
{
"cell_type": "markdown",
"id": "506e5033-b3da-4624-bbc5-44f3d159d9e1",
"metadata": {},
"source": [
"Most MPG is around 20"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "4c705ec7-9106-4f3c-a65b-5081fe1ded59",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T00:19:00.647104Z",
"iopub.status.busy": "2022-08-01T00:19:00.646709Z",
"iopub.status.idle": "2022-08-01T00:19:00.732581Z",
"shell.execute_reply": "2022-08-01T00:19:00.732044Z",
"shell.execute_reply.started": "2022-08-01T00:19:00.647075Z"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAggAAAHFCAYAAACXYgGUAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAeF0lEQVR4nO3de5TU5X348c9w2YsIG7kuCCJq1HgjEbQBPBFEbAiHRD0aDRKXENNKhKK2aWtyEs3FYG0kNccYzcULTdDmopZK1mrCxRpMhCiFpKmpOQjmAMGz0bCAS2F5fn/4Y8rysLDC4uwOr9c5c87ud2a+8zz7sDtvvt+Z3UJKKQUAwB66lHoAAEDHIxAAgIxAAAAyAgEAyAgEACAjEACAjEAAADICAQDICAQAICMQYD8eeOCBKBQKLS79+vWLsWPHxuOPP17q4RUdf/zxMW3atLd8v23btsUtt9wSS5YsafcxvfzyyzFp0qTo3bt3FAqFuP7661u97fHHHx+FQiHGjh27z+vnzZtX/PrvOdZbbrmlxdpUVFTEsGHDYvbs2fH6669n+1m1alV8/OMfjxNPPDGqq6ujuro63vnOd8Zf/uVfxooVKw5twlBmupV6ANAZ3H///XHqqadGSik2btwYd911V0yePDkWLFgQkydPLvXwDtq2bdvi85//fEREq0/OB+uGG26IX/ziF3HfffdFbW1tDBw4cL+379mzZzz99NPxu9/9Lk488cQW1913333Rq1ev2Lx58z7v+8QTT0RNTU00NjbGj3/847jzzjvjueeei2XLlkWhUIiIiHvvvTdmzpwZp5xySsyePTtOP/30KBQK8Zvf/CYeeuihOOecc+Kll17KHhuOVAIB2uCMM86IkSNHFj9///vfH8ccc0w89NBDnToQDqdf/epXce6558bFF1/cptufd955sXr16rjvvvvi1ltvLW7/3e9+F08//XRcc8018a1vfWuf9x0xYkT07ds3IiImTJgQDQ0N8c///M+xbNmyGDNmTPzsZz+LT37ykzFp0qT44Q9/GBUVFcX7XnDBBXHdddfFD37wg6iurj74CUOZcYoBDkJVVVVUVFRE9+7dW2z/4x//GJ/85Cfj2GOPjYqKijjhhBPiM5/5TGzfvj0iIpqamuI973lPnHTSSfGnP/2peL+NGzdGbW1tjB07NpqbmyMiYtq0aXH00UfHr3/96xg/fnz06NEj+vXrFzNnzoxt27YdcIzr1q2LqVOnRv/+/aOysjLe9a53xR133BG7du2KiDdPAfTr1y8iIj7/+c8XD9Mf6FTFgfa7ZMmSKBQK8dJLL0V9fX1xvy+//PJ+99ulS5e4+uqr48EHHyzuK+LNowdDhgyJCy+88IBz3u29731vRESsXbs2IiK+/OUvR9euXePee+9tEQd7uvzyy2PQoEFtfgwodwIB2qC5uTl27twZO3bsiN///vdx/fXXx9atW2PKlCnF2zQ1NcW4ceNi3rx5ceONN8bChQtj6tSpcfvtt8ell14aEW+Gxfe///3YtGlTTJ8+PSIidu3aFVdddVWklOKhhx6Krl27Fve5Y8eO+MAHPhDjx4+Pxx57LGbOnBn33ntvXHHFFfsd76uvvhqjR4+OJ598Mr74xS/GggUL4sILL4y/+Zu/iZkzZ0ZExMCBA+OJJ56IiIiPf/zj8eyzz8azzz4bn/3sZw9pv2effXY8++yzUVtbG2PGjCnu90CnGCIipk+fHuvXr49///d/L37dH3zwwZg2bVp06dL2H1cvvfRSRET069cvmpubY/HixTFy5Mg2jQH4/xLQqvvvvz9FRHaprKxMd999d4vb3nPPPSki0ve///0W2//hH/4hRUR68skni9v+5V/+JUVE+qd/+qf0uc99LnXp0qXF9SmlVFdXlyIi3XnnnS2233rrrSki0jPPPFPcNnTo0FRXV1f8/O///u9TRKRf/OIXLe47Y8aMVCgU0osvvphSSunVV19NEZ
FuvvnmNn092rrf3WOaNGlSm/a7523PP//8dNlll6WUUlq4cGEqFAppzZo16Qc/+EGKiLR48eLi/W6++eYUEWnjxo1px44d6bXXXkvf/e53U3V1dRoyZEh644030saNG1NEpCuvvDJ73J07d6YdO3YUL7t27WrTeOFI4AgCtMG8efNi+fLlsXz58qivr4+6urq47rrr4q677ireZtGiRdGjR4+47LLLWtx39yH7n/70p8VtH/7wh2PGjBnxqU99Kr70pS/Fpz/96ZgwYcI+H/uqq65q8fnuoxaLFy9udbyLFi2K0047Lc4999xsLCmlWLRo0YEn/Tbud0/Tp0+PBQsWRENDQ3znO9+JcePGxfHHH7/f+9TW1kb37t3jmGOOialTp8bZZ58dTzzxRFRVVe33fiNGjIju3bsXL3fcccchjx/KhRcpQhu8613vyl6kuHbt2vjbv/3bmDp1arzjHe+IhoaGqK2tLb5qfrf+/ftHt27doqGhocX26dOnxze+8Y2oqKiIv/qrv9rn43br1i369OnTYlttbW1ERLa/PTU0NOzzSXX3Ofb93Xd/Dtd+93TZZZfFrFmz4qtf/Wr827/9WzzwwAMHvM9PfvKTqKmpie7du8fgwYNbfM369u0b1dXVxdcj7Gn+/Pmxbdu22LBhQ3zwgx885LFDOXEEAQ7SWWedFW+88Ub89re/jYiIPn36xB/+8IdIKbW43aZNm2Lnzp3FV9lHRGzdujU++tGPxsknnxzV1dVxzTXX7PMxdu7cmT3pbty4sfh4renTp09s2LAh275+/fqIiBZjeSsO1373dNRRR8WVV14Zc+bMiR49ehRfv7E/w4cPj5EjR8bw4cOzr0vXrl3jggsuiBUrVmRjP+2002LkyJFx5plnHvK4odwIBDhIK1eujIgovhNg/PjxsWXLlnjsscda3G7evHnF63e79tprY926dfHII4/Ed77znViwYEF89atf3efjfO9732vx+fz58yNi/7+3YPz48fFf//Vf8fzzz2djKRQKMW7cuIiIqKysjIiIN954Yz8zfev7PVQzZsyIyZMnx+c+97kDniZoi5tuuimam5vj2muvjR07drTDCKH8OcUAbfCrX/0qdu7cGRFvHkZ/5JFH4qmnnopLLrkkhg0bFhERV199dXz961+Purq6ePnll+PMM8+MZ555Jr785S/HBz7wgeLb9L797W/Hd7/73bj//vvj9NNPj9NPPz1mzpwZf/d3fxdjxoxpcX6/oqIi7rjjjtiyZUucc845sWzZsvjSl74UEydOjPPOO6/V8d5www0xb968mDRpUnzhC1+IoUOHxsKFC+Puu++OGTNmxMknnxwRb/5yoqFDh8a//uu/xvjx46N3797Rt2/fVs/5t3W/h+rd7353FlqHYsyYMfH1r389Zs2aFWeffXb8xV/8RZx++unRpUuX2LBhQ/zoRz+KiIhevXq122NCp1fiF0lCh7avdzHU1NSkd7/73Wnu3Lmpqampxe0bGhrStddemwYOHJi6deuWhg4dmm666abi7VatWpWqq6tbvOMgpZSamprSiBEj0vHHH59ee+21lNKb72Lo0aNHWrVqVRo7dmyqrq5OvXv3TjNmzEhbtmxpcf+938WQUkpr165NU6ZMSX369Endu3dPp5xySvrHf/zH1Nzc3OJ2P/nJT9J73vOeVFlZmSIi28/e2rrfg30XQ2v29y6GV199tU2Ps3LlyvSxj30sDRs2LFVWVqaqqqp00kknpauvvjr99Kc/bdM+4EhRSGmvE6ZAhzBt2rT44Q9/GFu2bCn1UIAjkNcgAAAZgQAAZJxiAAAyjiAAABmBAABkBAIAkDnoX5S0a9euWL9+ffTs2TP73fMAQMeUUorGxsYYNGjQfv+M+kEHwvr162PIkCEHe3cAoIReeeWVGDx4cKvXH3Qg9OzZs/gAfj0pAHQOmzdvjiFDhhSfx1tz0IGw+7RCr169BAIAdDIHenmAFykCABmBAABkBA
IAkBEIAEBGIAAAGYEAAGQEAgCQEQgAQEYgAAAZgQAAZAQCAJARCABARiAAABmBAABkBAIAkBEIAEBGIAAAGYEAAGQ
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.boxplot(x=df.mpg)\n",
"plt.title('Boxplot of MPG')\n",
"plt.show();"
]
},
{
"cell_type": "markdown",
"id": "6ab5a1cd-177f-4dcb-bd79-e9ad395d51d7",
"metadata": {},
"source": [
"There's one value considered an outlier:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "655067f4-4c89-4e65-b65a-c1a1e4158535",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T00:19:00.733588Z",
"iopub.status.busy": "2022-08-01T00:19:00.733352Z",
"iopub.status.idle": "2022-08-01T00:19:00.742775Z",
"shell.execute_reply": "2022-08-01T00:19:00.741517Z",
"shell.execute_reply.started": "2022-08-01T00:19:00.733572Z"
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>mpg</th>\n",
" <th>cylinders</th>\n",
" <th>displacement</th>\n",
" <th>horsepower</th>\n",
" <th>weight</th>\n",
" <th>acceleration</th>\n",
" <th>model_year</th>\n",
" <th>origin</th>\n",
" <th>car_name</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>322</th>\n",
" <td>46.6</td>\n",
" <td>4</td>\n",
" <td>86.0</td>\n",
" <td>65.0</td>\n",
" <td>2110.0</td>\n",
" <td>17.9</td>\n",
" <td>80</td>\n",
" <td>3</td>\n",
" <td>mazda glc</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" mpg cylinders displacement horsepower weight acceleration \\\n",
"322 46.6 4 86.0 65.0 2110.0 17.9 \n",
"\n",
" model_year origin car_name \n",
"322 80 3 mazda glc "
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[df.mpg > 45]"
]
},
{
"cell_type": "markdown",
"id": "f89f5906-7b78-4268-933a-ccf02e151b85",
"metadata": {},
"source": [
"I'm going to leave this in because it's a real value. I guess it appears as an outlier because the data set is so small"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "2720ba7e-272d-4cf7-810c-0a7ec7ad2e58",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T00:19:00.744397Z",
"iopub.status.busy": "2022-08-01T00:19:00.744001Z",
"iopub.status.idle": "2022-08-01T00:19:00.899639Z",
"shell.execute_reply": "2022-08-01T00:19:00.899044Z",
"shell.execute_reply.started": "2022-08-01T00:19:00.744367Z"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAeoAAAH+CAYAAABTKk23AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAA9hAAAPYQGoP6dpAAA5yElEQVR4nO3deVhWdf7/8dctyg0YoIBsI4uaW+CS2qi0COGGS6aWWzWaS864pCmXZVZCX5OxJnNG07JJ1Fy/Mz81zRa3tEWbMRx3MysVa0DSFEQJt/P7o4v72y2ggrfcH+z5uK5zXZzP53M+532OXL445z73fdssy7IEAACMVMXdBQAAgNIR1AAAGIygBgDAYAQ1AAAGI6gBADAYQQ0AgMEIagAADEZQAwBgMIIaAACDEdSodObPny+bzeZYvLy8FBoaqoSEBKWlpSknJ6fYNikpKbLZbGXaz7lz55SSkqLNmzeXabuS9hUdHa1u3bqVaZ5rWbJkiWbMmFFin81mU0pKikv352obN25Uq1atVL16ddlsNq1ateqq448fP65nnnlGTZo00W233SYvLy/Vr19fY8aM0aFDh8q8/0GDBik6OtqpLTo6WoMGDSrzXFcTHx+v+Ph4l86J35aq7i4AKK/09HQ1atRIFy5cUE5Ojj777DNNmzZNf/nLX7R8+XK1b9/eMXbo0KHq3LlzmeY/d+6cUlNTJalM/9GWZ1/lsWTJEu3du1djx44t1rdt2zbVrl37ptdQXpZlqU+fPmrQoIFWr16t6tWrq2HDhqWO//e//61u3brJsiyNGjVKbdu2laenpw4ePKhFixbp97//vU6dOnXDda1cuVJ+fn43PA/gSgQ1Kq3Y2Fi1atXKsd67d2899dRTuueee9SrVy8dOnRIISEhkqTatWvf9OA6d+6cfHx8KmRf19KmTRu37v9a/vvf/+qnn35Sz549lZiYeNWxeXl56tGjh7y8vLR161ancxsfH6/hw4frn//8p0vquvPOO10yz81iWZZ+/vlneXt7u7sUVCBufeOWEhkZqVdffVVnzpzRm2++6Wgv6Xb0pk2bFB8fr8DAQHl7eysyMlK9e/fWuXPndOTIEdWqVUuSlJqa6rjNXnRbtGi+HTt26KGHHlLNmjVVr169UvdVZOXKlWratKm8vLxUt25d/e1vf3PqL7qtf+TIEaf2zZs3y2azOW7Dx8fHa+3atTp69KjTywBFSrr1vXfvXvXo0UM1a9aUl5eXmjdvrgULFpS4n6VLl2rSpEkKDw+Xn5+f2rdvr4MHD5Z+4n/ls88+U2Jionx9feXj46O4uDitXbvW0Z+SkuII26efflo2m63YLehfe+utt5Sdna2XX3651D+AHnroIUnSO++8I5vNpm3bthUb8+KLL6patWr673//W+q+rrz1XZbzYVmWXn75ZUVFRcnLy0stWrTQBx98UOJ+8vLylJycrDp16sjT01O/+93vNHbsWJ09e9ZpnM1m06hRo/TGG2+ocePGstvtjn+zOXPmqFmzZrrtttvk6+urRo0a6dlnny312FB5cUWNW06XLl3k4eGhTz75pNQxR44cUdeuXXXvvfdq3rx5qlGjhn744Qd9+OGHOn/+vMLCwvThhx+qc+fOGjJkiIYOHSpJjvAu0qtXL/Xr109//OMfi/0ne6WdO3dq7NixSklJUWhoqBYvXqwxY8bo/PnzSk5OLtMxzp49W0888YS+/fZbrVy58prjDx48qLi4OAUHB+tvf/ubAgMDtWjRIg0aNEjHjx/XhAkTnMY/++yzuvvuu/X3v/9deXl5evrpp9W9e3cdOHBAHh4epe5ny5Yt6tChg5o2baq3335bdrtds2fPVvfu3bV06VL17dtXQ4cOVbNmzdSrVy+NHj1aAwYMkN1uL3XOdevWycPDQ927d7/mcfbt21cTJkzQ66+/rrZt2zraL168qDfffFM9e/ZUeHj4Nee50vWcj9TUVKWmpmrIkCF66K
GHdOzYMQ0bNkyXLl1yuq1/7tw5tWvXTt9//72effZZNW3aVPv27dMLL7ygPXv2aMOGDU5/dK1atUqffvqpXnjhBYWGhio4OFjLli3TiBEjNHr0aP3lL39RlSpV9M0332j//v1lPjZUAhZQyaSnp1uSrO3bt5c6JiQkxGrcuLFjffLkydavf93/+c9/WpKsnTt3ljrHjz/+aEmyJk+eXKyvaL4XXnih1L5fi4qKsmw2W7H9dejQwfLz87POnj3rdGyHDx92Gvfxxx9bkqyPP/7Y0da1a1crKiqqxNqvrLtfv36W3W63MjMzncYlJSVZPj4+1unTp53206VLF6dx//u//2tJsrZt21bi/oq0adPGCg4Ots6cOeNou3jxohUbG2vVrl3bunz5smVZlnX48GFLkvXKK69cdT7LsqxGjRpZoaGh1xxXZPLkyZanp6d1/PhxR9vy5cstSdaWLVscbQMHDix2/qKioqyBAwc61q/3fJw6dcry8vKyevbs6TTu888/tyRZ7dq1c7SlpaVZVapUKfb7W/Q7+f777zvaJFn+/v7WTz/95DR21KhRVo0aNa59MnBL4NY3bknWNb5mvXnz5vL09NQTTzyhBQsW6LvvvivXfnr37n3dY2NiYtSsWTOntgEDBigvL087duwo1/6v16ZNm5SYmKiIiAin9kGDBuncuXPFbhU/8MADTutNmzaVJB09erTUfZw9e1b/+te/9NBDD+m2225ztHt4eOixxx7T999/f923z2/En/70J0m/3DIvMmvWLDVp0kT33Xdfuea81vnYtm2bfv75Zz3yyCNO4+Li4hQVFeXU9t577yk2NlbNmzfXxYsXHUunTp2cXt4ocv/996tmzZpObb///e91+vRp9e/fX++++65OnDhRruNC5UBQ45Zz9uxZnTx58qq3OOvVq6cNGzYoODhYI0eOVL169VSvXj399a9/LdO+wsLCrntsaGhoqW0nT54s037L6uTJkyXWWnSOrtx/YGCg03rRremCgoJS93Hq1ClZllWm/VyPyMhI/fjjj9d8aaFISEiI+vbtqzfffFOXLl3S7t279emnn2rUqFFl3neRa52PouO62r9xkePHj2v37t2qVq2a0+Lr6yvLsoqFbknn87HHHtO8efN09OhR9e7dW8HBwWrdurXWr19f7mOEuQhq3HLWrl2rS5cuXfMtVffee6/WrFmj3NxcffHFF2rbtq3Gjh2rZcuWXfe+yvLe7Ozs7FLbioLAy8tLklRYWOg07kavmAIDA5WVlVWsvejBqqCgoBuaX5Jq1qypKlWquHw/nTp10qVLl7RmzZrr3mbMmDE6duyY3n33Xc2aNUs1atQodrXrSkX/flf7Ny4SFBSkJk2aaPv27SUuzz//vNP40n7HHn/8cW3dulW5ublau3atLMtSt27drnrXA5UTQY1bSmZmppKTk+Xv76/hw4df1zYeHh5q3bq1Xn/9dUly3Ia+nqvIsti3b5927drl1LZkyRL5+vqqRYsWkuR4+nn37t1O41avXl1sPrvdft21JSYmatOmTcWeeF64cKF8fHxc8nau6tWrq3Xr1lqxYoVTXZcvX9aiRYtUu3ZtNWjQoMzzDhkyRKGhoZowYYJ++OGHEsesWLHCab1ly5aKi4vTtGnTtHjxYg0aNEjVq1cv876vV5s2beTl5aXFixc7tW/durVYcHbr1k3ffvutAgMD1apVq2LL1Z6AL0n16tWVlJSkSZMm6fz589q3b9+NHg4Mw1PfqLT27t3reH0vJydHn376qdLT0+Xh4aGVK1cWe0L719544w1t2rRJXbt2VWRkpH7++WfNmzdPkhwflOLr66uoqCi9++67SkxMVEBAgIKCgsr8H2mR8PBwPfDAA0pJSVFYWJgWLVqk9evXa9q0afLx8ZEk3XXXXWrYsKGSk5N18eJF1axZUytXrtRnn31WbL4mTZpoxYoVmjNnjlq2bKkqVao4va/81yZPnqz33ntPCQkJeuGFFx
QQEKDFixdr7dq1evnll+Xv71+uY7pSWlqaOnTooISEBCUnJ8vT01OzZ8/W3r17tXTp0jJ/Opwk+fv7691331W3bt1
"text/plain": [
"<Figure size 500x500 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.displot(x=df.cylinders)\n",
"plt.title('Distribution of Cylinders')\n",
"plt.show();"
]
},
{
"cell_type": "markdown",
"id": "7d99f70d-728f-4df3-a178-023617e642af",
"metadata": {},
"source": [
"4 Cylinder engines outnumber the others by a lot. It would be nice if we had more info to go off of, particularly if 6 cylinders could be split between inline and V configurations. It's less important for the others cause while inline-8s and V4s exist they're so uncommon in cars of this vintage that we can assume they don't exist. They'll get different fuel economy but not by enough to sway things at the level of accuracy we're at. Inline-6 vs V6 though I think there could be something to see there and it could improve accuracy slightly"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "8fecb4c7-cdda-4922-8018-3e5160f6ecf1",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T00:19:00.900607Z",
"iopub.status.busy": "2022-08-01T00:19:00.900368Z",
"iopub.status.idle": "2022-08-01T00:19:01.039878Z",
"shell.execute_reply": "2022-08-01T00:19:01.039298Z",
"shell.execute_reply.started": "2022-08-01T00:19:00.900591Z"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAeoAAAH+CAYAAABTKk23AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAA9hAAAPYQGoP6dpAAA5aUlEQVR4nO3de1xVVf7/8fdR4QiKeOeiAmp4y2vqmFaDV7xrw1Q66qRNNjZeEs00xymxmcHS0Zgyc5wpdXLMpiYbJ8vEG1ZqP0XNy6jppGIGEYaCQiCwfn/04Hw9AiqInkW+no/Hfjzaa6+992edRbzd52zOdhhjjAAAgJUqeboAAABQMoIaAACLEdQAAFiMoAYAwGIENQAAFiOoAQCwGEENAIDFCGoAACxGUAMAYDGCGrfE8uXL5XA4XEvVqlUVGBioHj16aO7cuUpNTS2yT0xMjBwOR6nOk5WVpZiYGG3durVU+xV3rrCwMA0aNKhUx7mWVatWKS4urthtDodDMTEx5Xq+8rZp0yZ16tRJ1apVk8Ph0HvvvVdsv5MnT7rNt5eXl+rUqaPOnTtrypQpOnToUJF9tm7dKofDUeq5u16FNS1fvvymHL+iW7x4Ma+NpQhq3FLLli3Tjh07FB8fr1deeUXt27fXCy+8oJYtW2rjxo1ufceOHasdO3aU6vhZWVmaM2dOqX/Zl+VcZXG1oN6xY4fGjh1702soK2OMHnroIXl5eWnt2rXasWOHIiIirrrPpEmTtGPHDiUkJOiNN97Q/fffr7Vr16pdu3aaP3++W9+77rpLO3bs0F133XUzh4ESENT2quLpAnB7ad26tTp16uRa//nPf64pU6bo3nvvVVRUlI4dO6aAgABJUsOGDdWwYcObWk9WVpZ8fX1vybmu5e677/bo+a/l66+/1nfffaef/exn6tWr13XtExIS4jauAQMGaOrUqYqKitL06dPVunVr9e/fX5JUo0YN618DwBO4oobHhYSEaMGCBcrMzNRf/vIXV3txb0dv3rxZ3bt3V506deTj46OQkBD9/Oc/V1ZWlk6ePKl69epJkubMmeN623XMmDFux9uzZ48eeOAB1apVS02bNi3xXIXWrFmjtm3bqmrVqmrSpIleeuklt+2Fb+ufPHnSrf3Kt3K7d++udevW6dSpU25vCxcq7q3vgwcPaujQoapVq5aqVq2q9u3ba8WKFcWe580339SsWbMUHBysGjVqqHfv3jp69GjJL/xlPvnkE/Xq1Ut+fn7y9fVVt27dtG7dOtf2mJgY1z9kZsyYIYfDobCwsOs69pV8fHz02muvycvLy+2quri3vr/88ksNHz5cwcHBcjqdCggIUK9evbRv3z5Xn8KPKK41T8U5fvy4HnnkEYWHh8vX11cNGjTQ4MGDdeDAgSJ9z507pyeffFJNmjSR0+lU/fr1NWDAAB05csTVJzc3V3/4wx/UokULOZ1O1atXT4888oi+/fZbt2MV1vz++++rQ4cO8vHxUcuWLfX+++9L+uFnqmXLlqpWrZp+8pOfaPfu3UXq2b17t4YMGaLatWuratWq6tChg/75z3+69Sn82dyyZYt+85vfqG7duqpTp46ioqL09ddfu9Vz6NAhJSQkuH4uyzq/KH9cUcMKAwYMUOXKlbVt27YS+5w8eVIDBw7Ufffdp9dff101a9bUmTNntH79euXm5iooKEjr169Xv3799Oijj7reRi4M70JRUVEaPny4Hn/8cV28ePGqde3bt0/R0dGKiYlRYGCg/vGPf2jy5MnKzc3VtGnTSjXGxYsX69e//rX+97//ac2aNdfsf/ToUXXr1k3169fXSy+9pDp16mjlypUaM2aMvvnmG02fPt2t/29/+1vdc889+tvf/qaMjAzNmDFDgwcP1uHDh1W5cuUSz5OQkKA+ffqobdu2eu211+R0OrV48WINHjxYb775poYNG6axY8eqXbt2ioqK0qRJkzRixAg5nc5Sjf9ywcHB6t
ixo7Zv3668vDxVqVL8r6IBAwYoPz9f8+bNU0hIiNLS0rR9+3adO3fOrV9Z5+nrr79WnTp19Pzzz6tevXr67rvvtGLFCnXp0kV79+5V8+bNJUmZmZm69957dfLkSc2YMUNdunTRhQsXtG3bNiUnJ6tFixYqKCjQ0KFD9fHHH2v69Onq1q2bTp06pdmzZ6t79+7avXu3fHx8XOf+/PPPNXPmTM2aNUv+/v6aM2eOoqKiNHPmTG3atEmxsbFyOByaMWOGBg0apBMnTrj237Jli/r166cuXbpoyZIl8vf31+rVqzVs2DBlZWW5/nFaaOzYsRo4cKBWrVql06dP66mnntKoUaO0efNmST/8Y/SBBx6Qv7+/Fi9eLEk3NL8oZwa4BZYtW2YkmV27dpXYJyAgwLRs2dK1Pnv2bHP5j+g777xjJJl9+/aVeIxvv/3WSDKzZ88usq3weM8++2yJ2y4XGhpqHA5HkfP16dPH1KhRw1y8eNFtbCdOnHDrt2XLFiPJbNmyxdU2cOBAExoaWmztV9Y9fPhw43Q6TVJSklu//v37G19fX3Pu3Dm38wwYMMCt3z//+U8jyezYsaPY8xW6++67Tf369U1mZqarLS8vz7Ru3do0bNjQFBQUGGOMOXHihJFk5s+ff9XjXW/fYcOGGUnmm2++cRtH4euVlpZmJJm4uLirnut656mwpmXLlpV4rLy8PJObm2vCw8PNlClTXO3PPfeckWTi4+NL3PfNN980ksy//vUvt/Zdu3YZSWbx4sVuNfv4+JivvvrK1bZv3z4jyQQFBblqNsaY9957z0gya9eudbW1aNHCdOjQwVy6dMntXIMGDTJBQUEmPz/fGPN/P5vjx4936zdv3jwjySQnJ7va7rzzThMREVHi+OA5vPUNa5hrPBq9ffv28vb21q9//WutWLFCX375ZZnO8/Of//y6+955551q166dW9uIESOUkZGhPXv2lOn812vz5s3q1auXGjVq5NY+ZswYZWVlFbn5bciQIW7rbdu2lSSdOnWqxHNcvHhRn332mR544AFVr17d1V65cmX98pe/1FdffXXdb5+X1rXmu3bt2mratKnmz5+vhQsXau/evSooKCi2b1nnKS8vT7GxsWrVqpW8vb1VpUoVeXt769ixYzp8+LCr34cffqhmzZqpd+/eJR7r/fffV82aNTV48GDl5eW5lvbt2yswMLDIDY7t27dXgwYNXOstW7aU9MNHJL6+vkXaC+fx+PHjOnLkiEaOHOkaQ+EyYMAAJScnF5mzsvxswB4ENaxw8eJFnT17VsHBwSX2adq0qTZu3Kj69etrwoQJatq0qZo2bao///nPpTpXUFDQdfcNDAwsse3s2bOlOm9pnT17tthaC1+jK89fp04dt/XCty6zs7NLPEd6erqMMaU6T3k5deqUnE6nateuXex2h8OhTZs2qW/fvpo3b57uuusu1atXT0888YQyMzPd+pZ1nqZOnapnnnlG999/v/7zn//os88+065du9SuXTu31+3bb7+95s2G33zzjc6dOydvb295eXm5LSkpKUpLS3Prf+W4vb29r9r+/fffu84jSdOmTStynvHjx0tSkXOV5WcD9uAzalhh3bp1ys/PV/fu3a/a77777tN9992n/Px87d69Wy+//LKio6MVEBCg4cOHX9e5SvO32SkpKSW2Ff7yq1q1qiQpJyfHrd+VvyxLq06dOkpOTi7SXngTUN26dW/o+JJUq1YtVapU6aaf50pnzpxRYmKiIiIiSvx8WpJCQ0P12muvSZK++OIL/fOf/1RMTIxyc3O1ZMkSV7/rmafirFy5Ug8//LBiY2Pd2tPS0lSzZk3Xer169fTVV19ddUyFN2qtX7++2O1+fn5X3f96Fc7HzJkzFRUVVWyfws/W8ePAFTU8LikpSdOmTZO/v7/GjRt3XftUrlxZXbp00SuvvCJJrrc3y/tK4dChQ/r888/d2latWiU/Pz/X3/sW3h27f/9+t35r164tcj
yn03ndtfXq1UubN292uztXkv7+97/L19e3XP6UqVq1aurSpYveffddt7oKCgq0cuVKNWzYUM2aNbvh81wuOztbY8e
"text/plain": [
"<Figure size 500x500 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.displot(x=df.displacement)\n",
"plt.title('Distribution of Displacement')\n",
"plt.show();"
]
},
{
"cell_type": "markdown",
"id": "3235dc9c-d7d4-4795-8fd2-1d208be182c7",
"metadata": {},
"source": [
"Most engines in the data are smaller since most of our engines are 4 cylinders. The 3 groups seen here are the split between 4, 6, and 8 cylinders because they all come in generally the same sizes for automotive applications"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "b7b4c6b4-c46c-4f61-9e2a-1243ccc39d0a",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T00:19:01.041100Z",
"iopub.status.busy": "2022-08-01T00:19:01.040804Z",
"iopub.status.idle": "2022-08-01T00:19:01.123396Z",
"shell.execute_reply": "2022-08-01T00:19:01.122989Z",
"shell.execute_reply.started": "2022-08-01T00:19:01.041075Z"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAgsAAAHFCAYAAAB8VbqXAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAmjklEQVR4nO3deXSV9Z3H8c8NSW42iAQCSQSSIA0iEGGI2ogtCFZ2tAyi1Qq4HUQWQdRKGQdwGVE2Fyy21iLSGcJIAREImyzWBYosA6JFqGxTNmVEAphAku/80ZOnXJL8WBK4yeX9Oifn5D7Pc5/7+92f8b65S+IzMxMAAEA5woI9AAAAULURCwAAwIlYAAAATsQCAABwIhYAAIATsQAAAJyIBQAA4EQsAAAAJ2IBAAA4EQuoUt5++235fL6Ar8TERLVv314LFiwI9vA8aWlp6t+//3lf78SJExozZoxWrVpV6WPatWuXunXrpoSEBPl8Pg0bNqzcY9PS0rz7NywsTPHx8WrWrJn69u2rpUuXlnkdn8+nMWPGVPq4Tx/Thdynl4NFixZd1PseOJvwYA8AKMu0adN09dVXy8x04MABTZkyRT169ND8+fPVo0ePYA/vgp04cUJjx46VJLVv375Szz18+HCtXbtWf/jDH5SUlKTk5GTn8W3bttWECRMkSceOHdO2bduUk5OjTp066V//9V81c+ZMRUREeMd/+umnatCgQaWOGedm0aJFev311wkGBA2xgCqpRYsWysrK8i537txZtWvX1syZM6t1LFxMn3/+ua6//nrdfvvt53T8FVdcoR//+Mfe5VtuuUWDBg3SmDFjNHbsWP3bv/2bXnzxRW//6ccCuLzwMgSqhaioKEVGRgb8S1eS/u///k+PPPKIrrzySkVGRqpx48YaNWqUCgoKJEn5+flq3bq1mjRpou+//9673oEDB5SUlKT27durqKhIktS/f3/FxcVp69at6tixo2JjY5WYmKjBgwfrxIkTZx3jnj179Mtf/lL16tWT3+9Xs2bNNHHiRBUXF0v6x8sEiYmJkqSxY8d6LwOc7an3s5131apV8vl82rFjh3Jzc73z7tq165zu2zONGTNGzZs315QpU5Sfn+9tP/NliBMnTujxxx9Xenq6oqKilJCQoKysLM2cOdM7piL3aX5+vkaMGKFWrVopPj5eCQkJys7O1nvvvVfq2OLiYr322mtq1aqVoqOjvRCaP39+wHGzZs1Sdna2YmNjFRcXp06dOmnjxo0Bx5SM+a9//as6deqk2NhYJScna9y4cZKkNWvW6KabblJsbKwyMjI0ffr0UuM5cOCABgwYoAYNGigyMlLp6ekaO3asCgsLvWN27doln8+nCRMmaNKkSUpPT1dcXJyys7O1Zs2agPG8/vrr3hpUdH2BC2JAFTJt2jSTZGvWrLFTp07ZyZMnbe/evTZ06FALCwuzxYsXe8f+8MMPlpmZabGxsTZhwgRbunSpPf300xYeHm5du3b1jvvqq6+sZs2a1qtXLzMzKyoqsg4dOli9evVs37593nH9+vWzyMhIa9SokT3//PO2dOlSGzNmjIWHh1v37t0Dxpmammr9+vXzLh86dMiuvPJKS0xMtDfeeMMWL15sgwcPNkk2cOBAMzPLz8+3xYsXmyR74IEH7NNPP7VPP/3UduzYUe79cS7n/f777+3TTz+1pKQka9u2rXfe/Pz8cs+bmppq3bp1K3f/U089ZZLsz3/+s7dNko0ePdq7PGDAAIuJibFJkybZypUrbcGCBTZu3Dh77bXXKuU+PXLkiPXv399mzJhhK1assMWLF9vjjz9uYWFhNn369IDr3nvvvebz+ezBBx+09957z3Jzc+3555+3V155xTvm+eefN5/PZ/fff78tWLDA5syZY9nZ2RYbG2tbt24tNeZmzZrZK6+8YsuWLbP77rvPJNnIkSMtIyPD3nrrLVuyZIl1797dJNlnn33mXX///v3WsGFDS0
1Ntd/+9re2fPlye/bZZ83v91v//v2943bu3GmSLC0tzTp37mzz5s2zefPmWcuWLa127dp25MgRMzPbsWOH9e7d2yR5a3u29QUqG7GAKqUkFs788vv99pvf/Cbg2DfeeMMk2X//938HbH/xxRdNki1dutTbNmvWLJNkL7/8sv37v/+7hYWFBew3+8eDhKSABxizfzzISLKPPvrI23bmA1vJg+vatWsDrjtw4EDz+Xy2bds2MzP75ptvSj3oupzreUvG5AqA053t2KlTp5okmzVrlrftzHG3aNHCbr/9duftVOQ+PVNhYaGdOnXKHnjgAWvdurW3/cMPPzRJNmrUqHKvu2fPHgsPD7chQ4YEbM/Ly7OkpCTr06dPqTH/6U9/8radOnXKEhMTTZJt2LDB23748GGrUaOGPfbYY962AQMGWFxcnO3evTvgtiZMmGCSvDApiYWWLVtaYWGhd9xf/vIXk2QzZ870tg0aNMj4tx2CiZchUCW98847WrdundatW6fc3Fz169dPgwYN0pQpU7xjVqxYodjYWPXu3TvguiVP63/wwQfetj59+mjgwIF64okn9Nxzz+nXv/61fvazn5V52/fcc0/A5bvvvluStHLlynLHu2LFCl1zzTW6/vrrS43FzLRixYqzT/oSnvdszOysx1x//fXKzc3VU089pVWrVumHH34o99gLuU8l6d1331Xbtm0VFxen8PBwRURE6K233tKXX37pHZObmytJGjRoULnnWbJkiQoLC9W3b18VFhZ6X1FRUWrXrl2pT6f4fD517drVuxweHq4mTZooOTlZrVu39rYnJCSoXr162r17t7dtwYIFuvnmm5WSkhJwW126dJEkrV69OuC2unXrpho1aniXMzMzJSngnECw8QZHVEnNmjUr9QbH3bt368knn9Qvf/lLXXHFFTp8+LCSkpLk8/kCrluvXj2Fh4fr8OHDAdvvv/9+TZ06VZGRkRo6dGiZtxseHq46deoEbEtKSpKkUuc73eHDh5WWllZqe0pKylmv63Kxzns2JQ9UJbdTlldffVUNGjTQrFmz9OKLLyoqKkqdOnXS+PHj9aMf/cg77kLv0zlz5qhPnz6644479MQTTygpKUnh4eGaOnWq/vCHP3jHffPNN6pRo4Z3zrIcPHhQknTdddeVuT8sLPDfTTExMYqKigrYFhkZqYSEhFLXjYyMDHhvx8GDB/X++++Xen9NiW+//Tbg8pn3jd/vlyRnfAGXGrGAaiMzM1NLlizRV199peuvv1516tTR2rVrZWYBwXDo0CEVFhaqbt263rbjx4/r3nvvVUZGhg4ePKgHH3ywzDfKFRYW6vDhwwH/Az9w4ICk0v9TP12dOnW0f//+Utv37dsnSQFjOR8X67wuZqb3339fsbGxAcF2ptjYWI0dO1Zjx47VwYMHvWcZevToob/+9a/ecRd6n/7xj39Uenq6Zs2aFbC+JW9eLZGYmKiioiIdOHCg3I+LltxPs2fPVmpqqmP2FVe3bl1lZmbq+eefL3O/K8CAqoqXIVBtbNq0SZK8TxR07NhRx44d07x58wKOe+edd7z9JR5++GHt2bNHc+bM0VtvvaX58+dr8uTJZd7Of/7nfwZc/q//+i9J7t+L0LFjR33xxRfasGFDqbH4fD7dfPPNks7/X43net7KNHbsWH3xxRd69NFHS/3rujz169dX//799Ytf/ELbtm0r9UmHC7lPfT6fIiMjA0LhwIEDpSKv5On9qVOnlnuuTp06KTw8XH/729+UlZVV5ldl6d69uz7//HNdddVVZd7OhcQCzzYg2HhmAVXS559/7n3M7PDhw5ozZ46WLVumn//850pPT5ck9e3bV6+//rr69eunXbt2qWXLlvroo4/0H//xH+ratatuueUWSdLvf/97/fGPf9S0adPUvHlzNW/eXIMHD9avfvUrtW3bNuD9AJGRkZo4caKOHTum6667Tp988omee+45denSRTfddFO54x0+fLjeeecdde
vWTc8884xSU1O1cOFC/eY3v9HAgQOVkZEhSapZs6ZSU1P13nvvqWPHjkpISFDdunXLfKnhfM57IY4cOeJ9RO/48eP
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.boxplot(x=df.displacement)\n",
"plt.title('Boxplot of Displacement')\n",
"plt.show();"
]
},
{
"cell_type": "markdown",
"id": "78eff4eb-407f-4d7b-9c7d-42fde78b4cf4",
"metadata": {},
"source": [
"Again most engines are on the smaller side of the spectrum ranging from around 100ci to around 260ci which is representative of the market"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "0235d7ee-8882-490c-b209-8d2d2fbacee5",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T00:19:01.124186Z",
"iopub.status.busy": "2022-08-01T00:19:01.124021Z",
"iopub.status.idle": "2022-08-01T00:19:01.256006Z",
"shell.execute_reply": "2022-08-01T00:19:01.255440Z",
"shell.execute_reply.started": "2022-08-01T00:19:01.124159Z"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAeoAAAH+CAYAAABTKk23AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAzHElEQVR4nO3de1TVVf7/8dcR4QgKeEFuKYqGecFbYipZYCpdtJs1XdSWfZsau2ha41RmJbZK0iZjyrKvM2W2+lrNlJYzNo14gTKpTMVbaPYNxUpClACFMGH//vDH+XoEVBA4G3g+1vqs5dmffT6f9z47e/k5Z5/zcRhjjAAAgJVaeLoAAABQPYIaAACLEdQAAFiMoAYAwGIENQAAFiOoAQCwGEENAIDFCGoAACxGUAMAYDGCGk3Cm2++KYfD4dpatWql0NBQjRgxQklJScrNza30nMTERDkcjhqdp7i4WImJiUpNTa3R86o6V9euXTV27NgaHedsli1bpuTk5Cr3ORwOJSYm1un56tratWsVExOj1q1by+Fw6MMPP6zU59ChQ2rRooXuu+++SvumTZsmh8OhmTNnVtr3+9//Xl5eXsrPzz/neir+u9q3b19NhiFJSk1NlcPh0Pvvv3/WvmeaN4CgRpOyZMkSpaenKyUlRa+88ooGDBigefPmqVevXlqzZo1b37vvvlvp6ek1On5xcbHmzJlT46Cuzblq40z/w09PT9fdd99d7zXUljFGt9xyi7y9vbVy5Uqlp6crLi6uUr+OHTuqT58+Wr9+faV9qampat26dbX7BgwYoHbt2p1zTWPGjFF6errCwsJqNpgaIqhxJgQ1mpTo6GgNHTpUl112mW666Sa9+OKL2r59u1q3bq1x48bp559/dvXt1KmThg4dWq/1FBcXN9i5zmbo0KHq1KmTR2s4k59++klHjhzRjTfeqJEjR2ro0KHVhuqIESO0Z88e5eTkuNqOHDmiHTt26L777tPmzZtVVFTk2vfDDz/o+++/14gRI2pUU8eOHTV06FA5nc7aDQqoAwQ1mryIiAi98MILKioq0n//93+72qt6O3rdunWKj49Xhw4d5Ovrq4iICN10000qLi7Wvn371LFjR0nSnDlzXG+z33nnnW7H27Jli26++Wa1a9dO3bt3r/ZcFVasWKF+/fqpVatW6tatm1566SW3/dW9/Vrx1mrF1X18fLxWrVql/fv3u30MUKGqt7537typ66+/Xu3atVOrVq00YMAALV26tMrzvPPOO5o1a5bCw8MVEBCgUaNGac+ePdW/8KfYsGGDRo4cKX9/f/n5+Sk2NlarVq1y7U9MTHT9I+LRRx+Vw+FQ165dqz1eReCe+s5GWlqaWrZsqRkzZkiSPvvsM9e+iivsU4N6zZo1GjlypAICAuTn56dLL71Ua9eudTtPVa+9MUZz585Vly5d1KpVK8XExCglJUXx8fGKj4+vVOtvv/12xtftbPMGENRoFq655hp5eXnp008/rbbPvn37NGbMGPn4+OiNN97QJ598oueee06tW7fW8ePHFRYWpk8++UTSyc8709PTlZ6erieffNLtOOPGjdOFF16of/zjH3rttdfOWFdGRoamT5+uhx56SCtWrFBsbKymTZumP//5zzUe46uvvqpLL71UoaGhrtrO9Hb7nj17FBsbq127dumll17S8uXL1bt3b915552aP39+pf6PP/649u/fr7/97W9avHix9u7dq2uvvVZlZWVnrCstLU1XXHGFCgoK9Prrr+udd96Rv7+/rr32Wr333nuSTn40sHz5cknS1KlTlZ6erhUrVlR7zLi4OLVo0cLtLe7169crJiZGISEhGjRokFuIr1+/Xl5eXrrsssskSW+//bYSEhIUEBCgpUuX6u9//7vat2+vK6+8slJYn27WrFmaNWuWrrrqKn300Ue69957dffdd+vbb7+tsv/ZXreazhuaIQM0AUuWLDGSzKZNm6rtExISYnr16u
V6PHv2bHPqX4H333/fSDIZGRnVHuPQoUNGkpk9e3alfRXHe+qpp6rdd6ouXboYh8NR6XyjR482AQEB5tixY25jy8rKcuu3fv16I8msX7/e1TZmzBjTpUuXKms/ve7bbrvNOJ1Ok52d7dbv6quvNn5+fuaXX35xO88111zj1u/vf/+7kWTS09OrPF+FoUOHmuDgYFNUVORqO3HihImOjjadOnUy5eXlxhhjsrKyjCTz/PPPn/F4FQYMGGB69Ojhety3b1/z2GOPGWOMeeSRR0xMTIxrX2RkpLnkkkuMMcYcO3bMtG/f3lx77bVuxysrKzP9+/d39TOm8mt/5MgR43Q6za233ur23PT0dCPJxMXFudpq8rqdad4ArqjRbJiz3Hp9wIAB8vHx0R/+8ActXbpU33//fa3Oc9NNN51z3z59+qh///5ubePHj1dhYaG2bNlSq/Ofq3Xr1mnkyJHq3LmzW/udd96p4uLiSld11113ndvjfv36SZL2799f7TmOHTumL7/8UjfffLPatGnjavfy8tIdd9yhH3744ZzfPj/diBEj9O233+qnn37S4cOHtXPnTtdbz3Fxcdq6dasKCgqUnZ2trKws19veGzdu1JEjRzRp0iSdOHHCtZWXl+uqq67Spk2bdOzYsSrP+cUXX6i0tFS33HKLW/vQoUOrfau+Nq8bcCqCGs3CsWPHdPjwYYWHh1fbp3v37lqzZo2Cg4P1wAMPqHv37urevbv+8pe/1OhcNVkhHBoaWm3b4cOHa3Temjp8+HCVtVa8Rqefv0OHDm6PKxZYlZSUVHuO/Px8GWNqdJ5zdern1KmpqfLy8tKll14qSRo+fLikk59Tn/75dMWCwptvvlne3t5u27x582SM0ZEjR6o8Z0WtISEhlfZV1SbV7nUDTtXS0wUADWHVqlUqKyurcrHPqS677DJddtllKisr09dff62XX35Z06dPV0hIiG677bZzOldNFgKdumr59LaK/8G3atVKklRaWurWLy8v75zPU5UOHTro4MGDldp/+uknSVJQUNB5HV+S2rVrpxYtWtTLeS6//HJ5eXkpNTVVTqdTF198seuqPSAgQAMGDND69et15MgRtWzZ0hXiFed7+eWXq12Jf7bQPfXbAxVycnLOuAAOqC2uqNHkZWdna8aMGQoMDNTkyZPP6TleXl4aMmSIXnnlFUlyvQ1d11dDu3bt0rZt29zali1bJn9/f1188cWS5Pqf//bt2936rVy5stLxnE7nOdc2cuRIrVu3zhWYFd566y35+fnVydfJWrdurSFDhmj58uVudZWXl+vtt99Wp06d1KNHj1odOzAwUAMHDnRdUZ/+j7C4uDitX79eqampuuSSS1whfumll6pt27b65ptvFBMTU+Xm4+NT5TmHDBkip9PpWgRX4Ysvvjivt7JrMm9ofriiRpOyc+dO12eOubm5+uyzz7RkyRJ5eXlpxYoVrq9XVeW1117TunXrNGbMGEVEROjXX3/VG2+8IUkaNWqUJMnf319dunTRRx99pJEjR6p9+/YKCgqq9ZVUeHi4rrvuOiUmJiosLExvv/22UlJSNG/ePPn5+UmSBg8erIsuukgzZszQiRMn1K5dO61YsUIbNmyodLy+fftq+fLlWrRokQYNGqQWLVooJiamynPPnj1b//rXvzRixAg99dRTat++vf7nf/5Hq1at0vz58xUYGFirMZ0uKSlJo0eP1ogRIzRjxgz5+Pjo1Vdf1c6dO/XOO++c11eRRowYoeeff14Oh0Pz5s1z2xcXF6cXX3xRxhhNmDDB1d6mTRu9/PLLmjRpko4cOaKbb75ZwcHBOnTokLZt26ZDhw5p0aJFVZ6vffv2evjhh5WUlKR27drpxhtv1A8//KA5c+YoLCxMLVrU7tqnJvOGZsiza9mAulGxOrdi8/HxMcHBwSYuLs7MnTvX5ObmVnrO6Sux09PTzY033mi6dOlinE6n6dChg4mLizMrV650e96aNWvMwIEDjd
PpNJLMpEmT3I536NChs57LmJOrvseMGWPef/9906dPH+Pj42O6du1qFixYUOn53377rUlISDABAQGmY8eOZurUqWb
"text/plain": [
"<Figure size 500x500 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.displot(x=df.weight)\n",
"plt.title('Distribution of Weight')\n",
"plt.show();"
]
},
{
"cell_type": "markdown",
"id": "b8bbd1af-1c95-4fde-916e-e757f95b2f04",
"metadata": {},
"source": [
"Weight is a major player in fuel economy as it takes more energy to move a heavy car. An inefficient engine moving less weight than a highly efficient engine can end up burning more fuel but generally less weight means higher mpg. Most cars here are around 2000lbs, which makes sense because that's about the weight of a typical commuter/economy car from the 70s. They didn't have as much stuff packed into the interior that we have today so they're lighter. That's why some of these MPG numbers may seem high, but they're real"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "762d3940-4f1e-40e1-9fef-67073f33791b",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T00:19:01.256902Z",
"iopub.status.busy": "2022-08-01T00:19:01.256753Z",
"iopub.status.idle": "2022-08-01T00:19:01.325922Z",
"shell.execute_reply": "2022-08-01T00:19:01.325506Z",
"shell.execute_reply.started": "2022-08-01T00:19:01.256887Z"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAg0AAAHFCAYAAABxS8rQAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAi4ElEQVR4nO3deXRU5f3H8c8NWYEQEQNJMAIuKJRVEAGpCeKGaF1OW6vWg6KtIAHcrVJLPEcPtlardrO2FsXWqlVBXFBcEiwmIAKRRUXQAGkhoBEhEROBfH9/+Mstk0W+1MCEyft1Ts4J99658zzzGObtzB0SmJkJAABgL+KiPQAAAHBwIBoAAIAL0QAAAFyIBgAA4EI0AAAAF6IBAAC4EA0AAMCFaAAAAC5EAwAAcCEagP/3yCOPKAiCiK/09HTl5ubqhRdeiPbwQt27d9dll122z7fbsWOH8vPzVVhY2OxjWrduncaMGaNDDz1UQRDommuuafS4Pn36qFevXg22z5o1S0EQaNiwYQ32PfbYYwqCQHPmzNmn8QRBoEceecR9mz0FQaC8vLy9HldUVKT8/Hx9/vnn/9P9AAcbogGoZ8aMGSouLlZRUZEeeughtWnTRuecc46ef/75aA/tW9mxY4duv/32/RIN1157rRYtWqS//vWvKi4u1rXXXtvocSNHjtQHH3yg8vLyiO2FhYVq166d3nnnHVVWVjbYFxcXp5NPPtk9nszMTBUXF2vMmDH7Ppl9UFRUpNtvv51oQKtBNAD19OnTR0OHDtWwYcN0/vnn64UXXlBSUpL+8Y9/RHtoLdbKlSs1ZMgQnXfeeRo6dKi6devW6HEjR46UpAbhUlhYqCuvvFJBEGjBggUN9g0cOFCHHHKIezxJSUkaOnSo0tPT92keAL4Z0QDsRXJyshITE5WQkBCx/bPPPtPVV1+trl27KjExUUceeaSmTp2qmpoaSVJ1dbUGDhyoo48+Wtu2bQtvV15eroyMDOXm5mr37t2SpMsuu0zt27fXqlWrNGrUKLVr107p6enKy8vTjh079jrGDRs26Mc//rE6d+6spKQk9erVS/fcc49qa2slff1yfd0T6O233x6+/bK3tzn2dt7CwkIFQaC1a9dq7ty54XnXrVvX6Plyc3MVBEFENFRUVGjFihUaM2aMBg0apIKCgnBfWVmZPv744zA2JGnNmjW6+OKLI8b0+9//PuJ+mnp74rnnnlO/fv2UlJSkI488Uvfff7/y8/MVBEGj433sscfUq1cvtW3bVv379494myo/P1833nijJKlHjx7h3PfHKzlAi2EAzMxsxowZJskWLlxoO3futK+++srKysps8uTJFhcXZy+//HJ47Jdffmn9+vWzdu3a2a9//WubN2+e3XbbbRYfH29nnXVWeNyHH35oqampdsEFF5iZ2e7du+2UU06xzp0728aNG8Pjxo4da4mJiXbEEUfYnXfeafPmzbP8/HyLj4+3s88+O2Kc3bp1s7Fjx4Z/3rJli3Xt2tXS09PtwQcftJdfftny8vJMkk2YMMHMzKqrq+3ll182SXbFFVdYcXGxFRcX29q1a5t8PDzn3bZtmxUXF1tGRoaddNJJ4Xmrq6ubPG///v2tZ8+e4Z+feeYZi4+Pt6qqKrv55pvthBNOCPc9+uijJslefPFFMzNbtWqVpaWlWd++fW3mzJk2b948u/766y0uLs7y8/PD25WWlpokmzFjRrht7ty5FhcXZ7m5uTZr1iz75z//aSeeeKJ1797d6v9VKMm6d+9uQ4YMsaeeespeeukly83Ntfj4ePvoo4/MzKysrMwmTZpkkuzZZ58N575t27Ym5w4c7IgG4P/VRUP9r6SkJPvDH/4QceyDDz5okuypp56K2P7LX/7SJNm8efPCbU8++aRJsvvuu89+8YtfWFxcXMR+s6+jQZLdf//9EdvvvPNOk2QLFiwIt9WPhp/97GcmyRYtWhRx2wkTJlgQBLZ69WozM/
vkk09Mkk2bNs31eHjPWzemMWPGuM57zTXXmKQwmiZNmmRDhw41M7OXXnrJ2rRpEz7xXn755damTRvbvn27mZmdccYZdvjhhzd4Ys7Ly7Pk5GT77LPPzKzxaDjhhBMsOzvbampqwm2VlZXWqVOnRqOhS5cu4f2amZWXl1tcXJxNnz493Hb33XebJCstLXXNHTjY8fYEUM/MmTO1ePFiLV68WHPnztXYsWM1ceJE/e53vwuPeeONN9SuXTt9//vfj7ht3cv9r7/+erjthz/8oSZMmKAbb7xRd9xxh2699Vaddtppjd73JZdcEvHniy++WJIiXrKv74033lDv3r01ZMiQBmMxM73xxht7n/QBPG/96xoKCwuVm5srSRoxYoQk6c033wz3DR48WKmpqaqurtbrr7+u888/X23bttWuXbvCr7POOkvV1dVauHBho/f5xRdf6J133tF5552nxMTEcHv79u11zjnnNDnO1NTU8M9dunRR586dtX79+v9p3kAsIBqAenr16qXBgwdr8ODBOvPMM/WnP/1Jp59+um666abwKvmKigplZGQ0eC+8c+fOio+PV0VFRcT2cePGaefOnYqPj9fkyZMbvd/4+Hh16tQpYltGRkZ4f02pqKhQZmZmg+1ZWVl7ve032V/nzcnJUVxcnAoKClRRUaGVK1cqJydHkpSamqqBAweqsLBQGzZsUGlpaRgZFRUV2rVrl377298qISEh4uuss86SJH366aeN3ufWrVtlZurSpUuDfY1tk9RgLaSvL7D88ssv/6d5A7EgPtoDAA4G/fr10yuvvKIPP/xQQ4YMUadOnbRo0SKZWUQ4bNmyRbt27dJhhx0Wbvviiy906aWXqmfPntq8ebOuvPJKPffccw3uY9euXaqoqIh4sqr7aGJjT2B1OnXqpE2bNjXYvnHjRkmKGMu+2F/nTUtLC8Og7uOUJ510Urg/JydHBQUF6tu3r6T/vjLRsWNHtWnTRpdeeqkmTpzY6Ll79OjR6PaOHTsqCAJt3ry5wb76H/8E0DReaQAcSkpKJCn8BMKoUaNUVVWl2bNnRxw3c+bMcH+d8ePHa8OGDXr22Wf18MMPa86cOfrNb37T6P38/e9/j/jz448/Lknhy/eNGTVqlN577z0tXbq0wViCIAifdJOSkiTJ/X/K3vP+L0aOHKk1a9bo8ccf16BBgyLeBsjJyVFJSYlmz56thISEMCjatm2rkSNHatmyZerXr1/4atCeX03FVbt27TR48GDNnj1bX331Vbi9qqrqW/3DXfv6mAIHveheUgG0HHUXQs6YMSO8Ev6FF16wcePGmSQ7//zzw2PrPj2Rmppq9957r7366qs2bdo0S0hIiPj0xJ///OcGF+Tl5eVZQkJCxAWG3/TpidGjR0eMs6lPT2RkZNhDDz1kr7zyik2ePNmCILCrr766wW2PPfZYe+WVV2zx4sXfeAHfvp7XeyGkmdmLL75okiwIArvxxhsj9m3dutXi4uIsCAI76aSTIvatWrXKOnbsaEOGDLEZM2ZYQUGBzZkzx+69914bOXJkeJzn0xNPP/20nXjiidatWzcLgiDifiTZxIkTG4y7/mNfUFBgkuyqq66yoqIiW7x4ccTFk0CsIRqA/9fYpyfS0tJswIABdu+99zb4GGFFRYWNHz/eMjMzLT4+3rp162a33HJLeNzy5cstJSUl4knG7OuPPw4aNMi6d+9uW7duNbOvo6Fdu3a2fPlyy83NtZSUFDv00ENtwoQJVlVVFXH7+k9cZmbr16+3iy++2Dp16mQJCQl27LHH2t133227d++OOO61116zgQMHWlJSkklqcJ76vOfd12jYvn27xcfHmyR74YUXGuwfMGCASbKpU6c22FdaWmrjxo2zrl27WkJCgqWnp9vw4cPtjjvuiDimfjSYmc2aNcv69u0bBtpdd91lkydPto4dO0Yc540GM7NbbrnFsrKyLC4uziRZQUGB+3EADjaBmdkBf3kDQITLLrtMTz
/9tKqqqqI9lFZl586dGjBggLp27ap58+ZFezhAi8eFkABajSuuuEKnnXaaMjMzVV5ergcffFDvv/++7r///mgPDTg
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.boxplot(x=df.weight)\n",
"plt.title('Boxplot of Weight')\n",
"plt.show();"
]
},
{
"cell_type": "markdown",
"id": "6cfba8d4-ac12-44bb-8917-22ed65f513b2",
"metadata": {},
"source": [
"Nothing out of the ordinary here"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "acd58565-0abe-4245-b505-81fa56a0e106",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T00:19:01.326907Z",
"iopub.status.busy": "2022-08-01T00:19:01.326552Z",
"iopub.status.idle": "2022-08-01T00:19:01.473926Z",
"shell.execute_reply": "2022-08-01T00:19:01.473358Z",
"shell.execute_reply.started": "2022-08-01T00:19:01.326892Z"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAeoAAAH+CAYAAABTKk23AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAA9hAAAPYQGoP6dpAAA8uUlEQVR4nO3de1yUZf7/8fd4Gg4CeWQgBakINU+VflXKxUNSluZmbZlpuB1WV7PM7We5bol9WyzbzC3NshS11qw2NXctE/PQATXyUGpoJxUriSATFASF6/dHP+bnyEHBgbnQ1/PxmMejue7r/szn9p7pzT1zz9wOY4wRAACwUj1fNwAAACpGUAMAYDGCGgAAixHUAABYjKAGAMBiBDUAABYjqAEAsBhBDQCAxQhqAAAsRlCj1ixYsEAOh8N98/Pzk8vlUp8+fTRt2jRlZWWVWScxMVEOh6NKj5Ofn6/ExEStX7++SuuV91ht2rTRwIEDq1TndBYvXqyZM2eWu8zhcCgxMdGrj+dtH3zwgbp27arAwEA5HA4tX778tOvs2LFDDodDDRs21MGDB2u+yTNQneeWN6WmpioxMVG//vprmWW9e/dW7969a70n2ImgRq1LTk7Wxo0blZKSotmzZ6tLly566qmn1K5dO61Zs8Zj7j333KONGzdWqX5+fr6mTp1a5aCuzmNVR2VBvXHjRt1zzz013kN1GWN06623qmHDhlqxYoU2btyouLi40673yiuvSJJOnDihRYsW1XSbdUJqaqqmTp1ablC/8MILeuGFF2q/KVipga8bwPmnQ4cO6tq1q/v+zTffrAcffFBXX321hgwZoq+//lqhoaGSpFatWqlVq1Y12k9+fr4CAgJq5bFOp0ePHj59/NP58ccf9csvv+imm25Sv379zmidwsJC/etf/1Lnzp2VnZ2t+fPn6+GHH67hTmtf6fPIG9q3b++VOjg3cEQNK0REROiZZ55RXl6eXnrpJfd4eW9Prl27Vr1791azZs3k7++viIgI3XzzzcrPz9e+ffvUokULSdLUqVPdb7OPHDnSo97WrVt1yy23qEmTJrr44osrfKxSy5YtU6dOneTn56eLLrpIzz33nMfy0rf19+3b5zG+fv16ORwO99F97969tXLlSu3fv9/jY4BS5b31vXPnTg0ePFhNmjSRn5+funTpooULF5b7OK+//romT56s8PBwBQcH65prrtGePXsq/oc/yccff6x+/fopKChIAQEBio2N1cqVK93LExMT3X/IPPzww3I4HGrTps1p6y5fvlw5OTm65557lJCQoK+++koff/xxmXmFhYV6/PHH1a5dO/n5+alZs2bq06ePUlNT3XNKSkr0/PPPq0uXLvL399cFF1ygHj16aMWKFR613njjDfXs2VOBgYFq3Lixrr32Wm3btu2M/h3OZN2RI0eqcePG2rFjh+Lj4xUUFOT+wyUlJUWDBw9Wq1at5Ofnp0suuUSjRo1Sdna2x7/l//k//0eSFBUV5X4enPw8OfWt719++UVjxozRhRdeqEaNGumiiy7S5MmTVVhY6DHP4XDovvvu06uvvqp27dopICBAnTt31n//+98z2n7Yh6CGNa6//nrVr19fH374YYVz9u3bpxtuuEGNGjXS/PnztWrVKj355JMKDAxUUVGRwsLCtGrVKknS3XffrY0bN2rjxo169NFHPeoMGTJEl1xyid566y29+OKLlfa1fft2jR8/Xg8++KCWLVum2NhYPfDAA/rHP/5R5W184YUXdNVVV8nlcrl7q+zt9j179ig2Nla7du3Sc889p6VLl6p9+/YaOXKkpk+fXmb+X//6V+3fv1+vvPKK5s6dq6+//lqDBg1ScXFxpX1t2LBBffv21eHDhzVv3jy9/vrrCgoK0qBBg/TGG29I+u2jgaVLl0qSxo0bp40bN2rZsmWn3eZ58+bJ6XTqjjvu0F133SWHw6F58+Z5zDlx4oQGDBig//3f/9XAgQ
O1bNkyLViwQLGxscrIyHDPGzlypB544AF169ZNb7zxhpYsWaIbb7zR4w+kpKQk3X777Wrfvr3efPNNvfrqq8rLy1OvXr305ZdfVtprVdYtKirSjTfeqL59++qdd97R1KlTJUnffvutevbsqTlz5mj16tV67LHHtHnzZl199dU6fvy4+99y3LhxkqSlS5e6nwdXXHFFuX0dO3ZMffr00aJFizRhwgStXLlSw4cP1/Tp0zVkyJAy81euXKlZs2bp8ccf19tvv62mTZvqpptu0nfffVfp9sNSBqglycnJRpJJS0urcE5oaKhp166d+/6UKVPMyU/Tf//730aS2b59e4U1fv75ZyPJTJkypcyy0nqPPfZYhctOFhkZaRwOR5nH69+/vwkODjZHjx712La9e/d6zFu3bp2RZNatW+ceu+GGG0xkZGS5vZ/a99ChQ43T6TQZGRke8wYMGGACAgLMr7/+6vE4119/vce8N99800gyGzduLPfxSvXo0cO0bNnS5OXlucdOnDhhOnToYFq1amVKSkqMMcbs3bvXSDJPP/10pfVK7du3z9SrV88MHTrUPRYXF2cCAwNNbm6ue2zRokVGknn55ZcrrPXhhx8aSWby5MkVzsnIyDANGjQw48aN8xjPy8szLpfL3Hrrre6xU/d3VdZNSEgwksz8+fMr2XpjSkpKzPHjx83+/fuNJPPOO++4lz399NPlPmeM+e3fKC4uzn3/xRdfNJLMm2++6THvqaeeMpLM6tWr3WOSTGhoqMe/b2ZmpqlXr56ZNm1apf3CThxRwyrmNJdH79Klixo1aqQ//elPWrhwYbWPEG6++eYznnvZZZepc+fOHmPDhg1Tbm6utm7dWq3HP1Nr165Vv3791Lp1a4/xkSNHKj8/v8zR+I033uhxv1OnTpKk/fv3V/gYR48e1ebNm3XLLbeocePG7vH69etrxIgR+v7778/47fNTJScnq6SkRHfddZd77K677tLRo0fdR+qS9N5778nPz89j3qnee+89SdLYsWMrnPP+++/rxIkTuvPOO3XixAn3zc/PT3FxcZWeYFiddct7HmVlZWn06NFq3bq1GjRooIYNGyoyMlKSlJ6eXuHjV2bt2rUKDAzULbfc4jFe+pHOBx984DHep08fBQUFue+HhoaqZcuWlT4PYC9OJoM1jh49qpycHHXs2LHCORdffLHWrFmj6dOna+zYsTp69Kguuugi3X///XrggQfO+LHCwsLOeK7L5apwLCcn54zrVEdOTk65vYaHh5f7+M2aNfO473Q6JUkFBQUVPsahQ4dkjKnS45yJkpISLViwQOHh4bryyivdZzdfc801CgwM1Lx589xnuP/8888KDw9XvXoVHzv8/PPPql+/frn7o9RPP/0kSerWrVu5yyurX9V1AwICFBwc7DFWUlKi+Ph4/fjjj3r00UfVsWNHBQYGqqSkRD169Kh0P1QmJydHLperzDkULVu2VIMGDU77PJB+ey5U9/HhWwQ1rLFy5UoVFxef9vujvXr1Uq9evVRcXKzPPvtMzz//vMaPH6/Q0FANHTr0jB6rKt+fzczMrHCs9H+Ifn5+klTmxJ6TTyCqjmbNmpX7veMff/xRktS8efOzqi9JTZo0Ub169bz+OGvWrHEfwZUXHJs2bdKXX36p9u3bq0WLFvr4449VUlJSYZi2aNFCxcXFyszMrPAPrdI+//3vf7uPYs9UVdct7zm0c+dOff7551qwYIESEhLc4998802VejlVs2bNtHnzZhljPB43KytLJ06c8MrzAPbirW9YISMjQw899JBCQkI0atSoM1qnfv366t69u2bPni1J7rehz+Qosip27dqlzz//3GNs8eLFCgoKcp/8U3r28xdffOEx79SzkUv7O9Pe+vXrp7Vr17oDs9SiRYsUEBDgla9zBQYGqnv37lq6dKlHXyUlJXrttdfUqlUrXXrppVWuO2/ePNWrV0/Lly/XunXrPG6vvvqqJGn+/PmSpAEDBu
jYsWNasGBBhfUGDBggSZozZ06Fc6699lo1aNBA3377rbp27VrurSbWLVUaoqXPwVInf5OhVFWep/369dORI0fK/Lh
"text/plain": [
"<Figure size 500x500 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.displot(x=df.acceleration)\n",
"plt.title('Distribution of Acceleration')\n",
"plt.show();"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "dd0e9f92-fe24-4de5-ba3c-e0cdb9767e0d",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T00:19:01.474821Z",
"iopub.status.busy": "2022-08-01T00:19:01.474614Z",
"iopub.status.idle": "2022-08-01T00:19:01.558118Z",
"shell.execute_reply": "2022-08-01T00:19:01.557560Z",
"shell.execute_reply.started": "2022-08-01T00:19:01.474806Z"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAggAAAHFCAYAAACXYgGUAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAp3ElEQVR4nO3deXQUZaL+8aeTkJUQJBCzEAIqBJBFJKBsEy6MbIIi3otKhAQRjiA6go6iM0BwBRnwOuM44BWDjOC4sIgiCMoyamImyqoTQBxWgyJRCFsCSb+/P/yloXnT2UjoQL6fc/ocuurd+u3qqidV1bTDGGMEAABwDh9vDwAAANQ8BAQAAGAhIAAAAAsBAQAAWAgIAADAQkAAAAAWAgIAALAQEAAAgIWAAAAALAQE1Fjz58+Xw+FwezRq1Eg9e/bUBx984O3huTRt2lQpKSkVrnfy5EmlpqZq/fr1VT6mPXv26Oabb1aDBg3kcDj00EMPlVnnzJkzioyMlMPh0LvvvlvlY6qM9evXy+FwVMsclUdOTo5SU1O1efNma11qaqocDsfFHxRwkRAQUOOlpaUpIyND6enpeuWVV+Tr66tBgwbp/fff9/bQLsjJkyc1bdq0ajn4TZgwQZmZmXrttdeUkZGhCRMmlFnngw8+0I8//ihJmjdvXpWP6VKUk5OjadOmlRgQ7r33XmVkZFz8QQEXiZ+3BwCUpU2bNkpISHA979evn6644gq9+eabGjRokBdHVnN9/fXX6ty5swYPHlzuOvPmzZO/v78SExO1evVqHThwQI0bN66+QXrBqVOnFBgYWCV/+Tdu3Piymx/gXJxBwCUnMDBQ/v7+qlOnjtvyn3/+WePGjVNMTIz8/f111VVX6Q9/+IMKCgokSfn5+erQoYOuueYaHT161FXvhx9+UGRkpHr27KmioiJJUkpKiurWratvvvlGvXv3VkhIiBo1aqTx48fr5MmTZY5x3759uvvuuxUREaGAgAC1atVKs2bNktPplPTrJYBGjRpJkqZNm+a6hFLWpYqy2i0+Jb9r1y6tXLnS1e6ePXtKbTcnJ0erVq3SoEGD9Pvf/15Op1Pz588vseyiRYvUpUsX1a1bV3Xr1tV1111nnXFYtWqVevfurbCwMAUHB6tVq1Z67rnn3Mp8+eWXuuWWW9SgQQMFBgaqQ4cOevvtt0sdZ0XqFl+iWr16te655x41atRIwcHBKigo0K5duzRy5Eg1b95cwcHBiomJ0aBBg7Rt2zZX/fXr16tTp06SpJEjR7rmMjU1VVLJlxicTqeef/55tWzZUgEBAYqIiNCIESN04MABt3I9e/ZUmzZtlJWVpR49eig4OFhXXXWVpk+f7novAa8zQA2VlpZmJJkvvvjCnDlzxpw+fdrs37/fPPjgg8bHx8esWrXKVfbUqVOmXbt2JiQkxPzpT38yq1evNpMnTzZ+fn5mwIABrnI7d+40oaGhZsiQIcYYY4qKikyvXr1MRESEycnJcZVLTk42/v7+pkmTJuaZZ54xq1evNqmpqcbPz88MHDjQbZxxcXEmOTnZ9fzQoUMmJibGNGrUyMyZM8esWrXKjB8/3kgyY8eONcYYk5+fb1atWmUkmVGjRpmMjAyTkZFhdu3a5XE+ytPu0aNHTUZGhomMjDTdunVztZufn1/qXD/zzDNGklmxYoVxOp0mLi7ONGvWzDidTrdykydPNpLMkCFDzDvvvGNWr15tZs+ebSZPnuwq8+qrrxqHw2F69uxpFi1aZD7++GPz8ssvm3HjxrnKrF271vj7+5sePXqYt956y6xatcqkpKQYSSYtLc1Vbt26dUaSWbduXYXrFm8/MTExZsyYMWblypXm3XffNYWFhWbDhg3m4YcfNu+++67ZsGGDWbp0qRk8eLAJCgoy27dvd81lcRt//OMfXXO5f/9+Y4wxU6dONefvQseMGWMkmfHjx5tVq1aZOXPmmEaNGpnY2Fjz008/ucolJiaa8P
Bw07x5czNnzhyzZs0aM27cOCPJvP7666W+V8DFQkBAjVW8cz7/ERAQYF5++WW3snPmzDGSzNtvv+22fMaMGUaSWb16tWvZW2+9ZSSZ//3f/zVTpkwxPj4+buuN+TUgSDIvvvii2/LiA+lnn33mWnZ+QJg0aZKRZDIzM93qjh071jgcDrNjxw5jjDE//fSTkWSmTp1arvkob7vFY7r55pvL1a7T6TTXXHONiYmJMYWFhcaYswe/Tz75xFXuP//5j/H19TVJSUke2zp27JipV6+e6d69uxUuztWyZUvToUMHc+bMGbflAwcONFFRUaaoqMgYU3JAKG/d4u1nxIgRZc5BYWGhOX36tGnevLmZMGGCa3lWVpYVPIqdHxCys7ONJLcgZIwxmZmZRpJ54oknXMsSExNLfC9bt25t+vbtW+Z4gYuBSwyo8RYsWKCsrCxlZWVp5cqVSk5O1v3336+XXnrJVWbt2rUKCQnRf//3f7vVLT5l/8knn7iWDR06VGPHjtXvf/97Pf3003riiSd00003ldh3UlKS2/Nhw4ZJktatW+dxvGvXrlXr1q3VuXNnayzGGK1du7bsF30R292wYYN27dql5ORk+fr6Sjp7Sv21115zlVuzZo2Kiop0//33e2wrPT1deXl5GjdunMfr/Lt27dL27dtdc1tYWOh6DBgwQAcPHtSOHTuqrO7tt99utVNYWKhnn31WrVu3lr+/v/z8/OTv769vv/1W2dnZpcyWZ8XbxPmXiTp37qxWrVq5bYOSFBkZab2X7dq10969eyvVP1DVCAio8Vq1aqWEhAQlJCSoX79+mjt3rvr06aNHH31UR44ckSTl5ua6vqJ3roiICPn5+Sk3N9dt+T333KMzZ87Iz89PDz74YIn9+vn5KTw83G1ZZGSkqz9PcnNzFRUVZS2Pjo4us25pqqvd4vsHbrvtNh05ckRHjhxRWFiYunfvrsWLF7vm+KeffpKkUm/MK0+Z4m9KPPLII6pTp47bY9y4cZKkw4cPV1ndkuZs4sSJmjx5sgYPHqz3339fmZmZysrKUvv27XXq1CmPYy9N8fx7eo/Of3/O37YkKSAgoNL9A1WNbzHgktSuXTt99NFH2rlzpzp37qzw8HBlZmbKGOMWEg4dOqTCwkI1bNjQtezEiRMaPny4WrRooR9//FH33nuv3nvvPauPwsJC5ebmuu3If/jhB0kl79yLhYeH6+DBg9bynJwcSXIbS0VUR7tHjx7V4sWLJcl1Q975Fi1apHHjxrluqjxw4IBiY2NLLHtuGU+Kx/n4449ryJAhJZaJj4+vsrolncl44403NGLECD377LNuyw8fPqz69et7HHtpireJgwcPWgEpJyen0u874C2cQcAlqfh76cUHpN69e+v48eNatmyZW7kFCxa41he77777tG/fPi1ZskTz5s3T8uXL9cILL5TYz8KFC92eL1q0SNKvd6F70rt3b/373//Wxo0brbE4HA7913/9l6Rf/1qUVO6/GMvbbkUsWrRIp06d0lNPPaV169ZZj4YNG7ouM/Tp00e+vr7629/+5rG9rl27KiwsTHPmzJExpsQy8fHxat68ubZs2eI6M3T+IzQ0tMrrnsvhcLjmv9iKFSv0/fffuy2ryHvUq1cvSb+Gj3NlZWUpOzvbbRsELgWcQUCN9/XXX6uwsFDSr6dxlyxZojVr1ui2225Ts2bNJEkjRozQX//6VyUnJ2vPnj1q27atPvvsMz377LMaMGCAfvvb30qSXn31Vb3xxhtKS0vTtddeq2uvvVbjx4/XY489pm7durldE/b399esWbN0/PhxderUSenp6Xr66afVv39/de/e3eN4J0yYoAULFujmm2/Wk08+qbi4OK1YsUIvv/yyxo4dqxYtWkiSQkNDFRcXp/fee0+9e/dWgwYN1LBhQzVt2vSC2q2IefPm6YorrtAjjzyiwMBAa/2IESM0e/ZsbdmyRe3bt9cTTzyhp556SqdOndJdd92lsLAw/f
vf/9bhw4c1bdo01a1bV7NmzdK9996r3/72txo9erSuvPJK7dq1S1u2bHHdNzJ37lz1799fffv2VUpKimJiYvTzzz8
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.boxplot(x=df.acceleration)\n",
"plt.title('Boxplot of Acceleration')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "1087d45a-dc2f-47f9-be25-f1f28669c2ad",
"metadata": {},
"source": [
"I'm not sure what acceleration is supposed to measure; I assume it's the 0-60 mph time in seconds. While it's interesting, I don't think it contributes much to predicting MPG, and the distribution looks less than ideal anyway: most values sit close together and there are a few outliers."
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "291e8653-3d29-4fb5-8105-2fea779cc5ed",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T00:19:01.559014Z",
"iopub.status.busy": "2022-08-01T00:19:01.558801Z",
"iopub.status.idle": "2022-08-01T00:19:01.693534Z",
"shell.execute_reply": "2022-08-01T00:19:01.692969Z",
"shell.execute_reply.started": "2022-08-01T00:19:01.558999Z"
},
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 500x500 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.displot(x=df.model_year)\n",
"plt.title('Distribution of Model Year')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "f7addd67-2fba-4539-88e3-347164d3cfd7",
"metadata": {},
"source": [
"Interestingly, model year has 3 peaks. I'm not going to use it as a feature, though not because of the peaks: the data only spans 12 years (1970-1982). Model year could be a great proxy for technology, but it won't work here because there isn't enough data and there were no real leaps in technology during these years anyway. It would also hurt predictions on unseen data: a model year outside the 1970-1982 range of the training set could throw the prediction wildly off."
]
},
{
"cell_type": "markdown",
"id": "042416c1-0e56-4269-96c8-6926392e11e7",
"metadata": {},
"source": [
"### Save"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "b3b42cca-6960-4d06-b7c4-1570f09e9fe0",
"metadata": {
"execution": {
"iopub.execute_input": "2022-08-01T00:19:01.695975Z",
"iopub.status.busy": "2022-08-01T00:19:01.695735Z",
"iopub.status.idle": "2022-08-01T00:19:01.701574Z",
"shell.execute_reply": "2022-08-01T00:19:01.700997Z",
"shell.execute_reply.started": "2022-08-01T00:19:01.695960Z"
},
"tags": []
},
"outputs": [],
"source": [
"df.to_csv('data/clean.csv', index=False)"
]
},
{
"cell_type": "markdown",
"id": "59524851-efe5-4042-8eee-d67038a13a77",
"metadata": {},
"source": [
"[EDA](eda.ipynb)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}