
Boston House Price Dataset Training and Testing in Python notebook using sklearn and Excel

Python: The Revolutionary Path of Coding
Views:
22
Upload date:
02.12.2023 12:43
Duration:
00:58:38
Category:
Education

Description

https://sites.google.com/view/vinegarhill-datalabs/introduction-to-machine-learning/random-forest-and-ols

Excel Reproduction of OLS estimation for full sample training and testing
A standard approach in machine learning for assessing accuracy is to split the data into training and testing sets. Here we apply this approach to the Boston House Price dataset. Train/test is a method for measuring the accuracy of a model and checking its robustness, first in-sample and then out-of-sample. The Python code used, from scikit-learn, is:
# Splitting to training and testing data

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = 0.3, random_state = 4)

This code applies a train/test split, dividing the complete 506 rows of the Boston House Price dataset into a training set (70%) and a testing set (30%). You fit the model on the training set and evaluate it on the testing set. Training the model means estimating it, which we initially do here using Ordinary Least Squares. Working with both a Google Colab notebook and an Excel spreadsheet, we replicate the splitting, training, and testing steps from the Colab. Each step is reproduced in the spreadsheet, and the R-squared values are replicated and presented as measures of accuracy. The Excel link can be found on the Vinegar Hill portal:
https://sites.google.com/view/vinegarhill-datalabs/introduction-to-machine-learning/random-forest-and-ols
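
The workflow described above can be sketched end to end as follows. This is a minimal illustration, not the notebook from the video. Because recent scikit-learn releases no longer ship the Boston data, it assumes a synthetic stand-in with the same shape (506 rows, 13 features). The random data, the coefficients, and the helper name r_squared are all hypothetical. Only the train_test_split call is taken from the description.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in for the Boston data (506 rows, 13 features).
rng = np.random.default_rng(0)
n_rows, n_features = 506, 13
X = rng.normal(size=(n_rows, n_features))
true_coefs = rng.normal(size=n_features)  # hypothetical coefficients
y = 22.5 + X @ true_coefs + rng.normal(scale=2.0, size=n_rows)

# Same split call as in the description: 70% training, 30% testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=4
)

# Ordinary Least Squares, fitted on the training rows only.
model = LinearRegression()
model.fit(X_train, y_train)


def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, the arithmetic a spreadsheet would reproduce."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


r2_train = r_squared(y_train, model.predict(X_train))  # in-sample
r2_test = r_squared(y_test, model.predict(X_test))     # out-of-sample

print(f"training rows: {len(X_train)}, testing rows: {len(X_test)}")
print(f"R^2 in-sample:     {r2_train:.4f}")
print(f"R^2 out-of-sample: {r2_test:.4f}")
```

With 506 rows and test_size = 0.3, scikit-learn rounds the test share up, so the split yields 152 testing rows and 354 training rows. Computing R-squared by hand rather than calling model.score mirrors the spreadsheet replication: the two sums of squares are exactly the columns one would build in Excel.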
