Suppose you are a data scientist at Machine Learning Pythons LCC. You would like to create and evaluate a few models to predict the bike rental count for a given day, and then select the model that performs best on your dataset. To get credit for this question, you must implement the solution in Python using Pandas, NumPy, and Sklearn. The bike sharing dataset is in the files attached to this assignment (bike_sharing.csv). For this question, you need to do the following:
1. Write Python code to load the dataset into a Pandas DataFrame and then save it to an Excel file (i.e., bike_sharing.xlsx). The file must be saved in the working directory (the same folder your code is saved in).
2. Explore the data by printing information about the columns/variables of the dataset. Then plot the relationship between every pair of variables.
3. Split your dataset into two partitions: 75% of the given dataset for training and 25% for testing.
4. Fit a linear regression model on the training split. The model must predict the bike rental count for a given day. You should include the following in your code:
a. Check for missing values (i.e., null and NaN values) and show how many missing values there are in each column and in each row of the dataset. (Remember: this step must be applied before splitting the dataset.)
b. Remove the rows with missing values from the dataset before using it as input to your model. (Remember: this step must be applied before splitting the dataset.)
c. Scale your training dataset. (Remember: do not scale the target (label) variable. This takes two sub-steps, both applied after you split the data into train and test. Step i: calculate the scaling parameters using the training dataset. Step ii: use the parameters calculated in step i to scale both the training and the testing datasets.) After that, save your training and testing datasets to CSV files, one file per split (e.g., bike_test.csv, bike_train.csv).
d. Show the training mean squared error (MSE), i.e., the error of the model's predictions on the training dataset.
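
One possible way to lay out the steps above is sketched below. It is a minimal outline, not a reference solution, and it rests on a few assumptions the assignment does not state: the target column is assumed to be named `cnt`, only numeric columns are used as features, `StandardScaler` is picked as the scaling method, and the function names (`load_and_export`, `explore`, `report_and_drop_missing`, `split_scale_fit`, `run`) are made up for illustration. Saving to Excel also assumes an Excel writer such as openpyxl is installed, and the pairwise plot assumes matplotlib is available.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

TARGET = "cnt"  # assumption: name of the rental-count column in bike_sharing.csv


def load_and_export(csv_path="bike_sharing.csv", xlsx_path="bike_sharing.xlsx"):
    """Step 1: load the CSV and save a copy as Excel in the working directory."""
    df = pd.read_csv(csv_path)
    df.to_excel(xlsx_path, index=False)  # requires an Excel writer, e.g. openpyxl
    return df


def explore(df):
    """Step 2: print column information and plot every pair of numeric variables."""
    import matplotlib.pyplot as plt  # imported here so the other steps run without it

    df.info()
    print(df.describe())
    pd.plotting.scatter_matrix(df.select_dtypes(include="number"), figsize=(12, 12))
    plt.show()


def report_and_drop_missing(df):
    """Steps 4a and 4b: count missing values, then drop incomplete rows."""
    print("Missing values per column:\n", df.isna().sum())
    print("Missing values per row:\n", df.isna().sum(axis=1))
    return df.dropna()


def split_scale_fit(df, target=TARGET, seed=0):
    """Steps 3, 4, 4c, 4d: split 75/25, scale the features, fit, return training MSE."""
    X = df.drop(columns=[target]).select_dtypes(include="number")  # numeric features only
    y = df[target]  # the target is never scaled
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    scaler = StandardScaler().fit(X_train)  # step i: parameters from the training split only
    # step ii: apply the same parameters to both splits
    X_train_s = pd.DataFrame(scaler.transform(X_train), columns=X.columns, index=X_train.index)
    X_test_s = pd.DataFrame(scaler.transform(X_test), columns=X.columns, index=X_test.index)
    train = X_train_s.assign(**{target: y_train})
    test = X_test_s.assign(**{target: y_test})
    model = LinearRegression().fit(X_train_s, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train_s))
    return model, train, test, train_mse


def run(csv_path="bike_sharing.csv"):
    """Run all steps in the order the assignment lists them."""
    df = load_and_export(csv_path)
    explore(df)
    df = report_and_drop_missing(df)  # done before the split, as required
    model, train, test, train_mse = split_scale_fit(df)
    train.to_csv("bike_train.csv", index=False)
    test.to_csv("bike_test.csv", index=False)
    print("Training MSE:", train_mse)
    return model
```

Calling `run()` from the folder that contains bike_sharing.csv would execute the whole sequence. If the rental count column has a different name in the attached file, `TARGET` needs to be changed accordingly.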
