Example 2: Regression

Here’s an example of performing regression with the nnet package in R on a synthetic dataset.

Step 1: Prepare the Data

In this step, a sample dataset for regression is created by generating two random predictor variables and a numeric response variable. The data is then combined into a data frame. The response stays numeric (it is not converted to a factor), because this is a regression task.

R




# Step 1: Prepare the Data
set.seed(123)
 
# Generating random data
n <- 200  # Number of observations
x1 <- rnorm(n, mean = 0, sd = 1)
x2 <- rnorm(n, mean = 0, sd = 1)
 
# Generating response variable based on a linear relationship with noise
y <- 2*x1 + 3*x2 + rnorm(n, mean = 0, sd = 0.5)
 
# Combining the features and response variable into a data frame
my_data <- data.frame(x1, x2, y)


Here’s an explanation of each line of code:

  1. set.seed(123): Sets the seed for reproducibility. By setting a seed, the generated random numbers will be the same each time the code is run.
  2. n <- 200: Specifies the number of observations. In this case, we have 200 observations.
  3. x1 <- rnorm(n, mean = 0, sd = 1): Generates random values for the predictor variable x1 from a normal distribution. The rnorm() function generates n random numbers with a mean of 0 and a standard deviation of 1.
  4. x2 <- rnorm(n, mean = 0, sd = 1): Generates random values for the predictor variable x2 using the same approach as in the previous line.
  5. y <- 2*x1 + 3*x2 + rnorm(n, mean = 0, sd = 0.5): Generates the response variable y based on a linear relationship with noise. It combines the predictor variables x1 and x2 with weights of 2 and 3, respectively. The rnorm() function generates random noise with a mean of 0 and a standard deviation of 0.5.
  6. my_data <- data.frame(x1, x2, y): Combines the predictor variables x1, x2, and the response variable y into a data frame called my_data. Each variable becomes a column in the data frame.

These steps generate a synthetic dataset with two predictor variables x1 and x2, and a response variable y that follows a linear relationship with some added noise.
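As a quick sanity check on the synthetic data, an ordinary least-squares fit should recover coefficients close to the weights 2 and 3 used to generate y. The sketch below is only an illustration: the names check_data and fit are hypothetical and are not used elsewhere in this example.

```r
# Recreate the synthetic data from Step 1 (same seed and same calls)
set.seed(123)
n <- 200
x1 <- rnorm(n, mean = 0, sd = 1)
x2 <- rnorm(n, mean = 0, sd = 1)
y <- 2*x1 + 3*x2 + rnorm(n, mean = 0, sd = 0.5)
check_data <- data.frame(x1, x2, y)  # hypothetical name, for illustration only

# Inspect the structure: three numeric columns, 200 rows
str(check_data)

# A plain linear model should estimate slopes near 2 (x1) and 3 (x2)
fit <- lm(y ~ x1 + x2, data = check_data)
print(round(coef(fit), 2))
```

If the estimated slopes are far from 2 and 3, the data was probably not generated as intended.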

Step 2: Split the Data

The dataset is split into training and testing sets using the createDataPartition() function from the caret package. The data is divided based on a specified proportion, and the resulting subsets are assigned to train_data and test_data.

R




# Step 2: Split the Data
library(caret)
 
set.seed(123)
split <- createDataPartition(my_data$y, p = 0.7, list = FALSE)
train_data <- my_data[split, ]
test_data <- my_data[-split, ]


Here’s an explanation of each line of code:

  1. library(caret): Loads the caret package, which provides functions for data splitting and preprocessing.
  2. set.seed(123): Sets the seed for reproducibility. This ensures that the data split will be the same each time the code is run.
  3. split <- createDataPartition(my_data$y, p = 0.7, list = FALSE): Uses the createDataPartition() function from the caret package to select the rows for the training set. The function takes the response variable my_data$y and the proportion of data to place in the training set (p = 0.7, leaving roughly 30% for testing). Because y is numeric, the sampling is done within groups based on percentiles of y, so both subsets cover a similar range of response values. The list = FALSE argument returns the row indices as a one-column matrix rather than a list.
  4. train_data <- my_data[split, ]: Creates the training data by subsetting my_data with the row indices stored in split.
  5. test_data <- my_data[-split, ]: Creates the testing data by subsetting my_data with the negated indices, which selects every row not included in split.

By splitting the data into training and testing sets, we can use the training data to train the model and evaluate its performance on the testing data.
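For comparison, the same 70/30 idea can be sketched in base R with sample(). This is an alternative, not what the example above uses: sample() draws rows purely at random, whereas createDataPartition() samples within percentile groups of y. The names demo_data, train_idx, demo_train, and demo_test are hypothetical.

```r
# Recreate the synthetic data so this block runs on its own
set.seed(123)
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 2*x1 + 3*x2 + rnorm(n, sd = 0.5)
demo_data <- data.frame(x1, x2, y)

# Draw 70% of the row indices at random, without replacement
set.seed(123)
train_idx <- sample(seq_len(nrow(demo_data)), size = round(0.7 * nrow(demo_data)))

demo_train <- demo_data[train_idx, ]   # rows chosen for training
demo_test  <- demo_data[-train_idx, ]  # all remaining rows

cat("Training rows:", nrow(demo_train), "\n")
cat("Testing rows: ", nrow(demo_test), "\n")
```

With 200 observations this gives 140 training rows and 60 testing rows, and no row appears in both subsets.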

Step 3: Model Training

The nnet package is loaded, and a neural network model is trained using the nnet() function. The model is trained using the training data, and the formula specifies the relationship between the response variable and predictor variables. Additional parameters like the number of hidden nodes and maximum iterations are set.

R




# Step 3: Model Training
library(nnet)
 
# Train the neural network for regression
model <- nnet(y ~ x1 + x2, data = train_data, size = 5, maxit = 1000, linout = TRUE)


Output:

# weights:  21
initial value 1733.825618
iter 10 value 325.524695
iter 20 value 125.208619
iter 30 value 38.444341
iter 40 value 31.529450
iter 50 value 30.506822
iter 60 value 28.372888
iter 70 value 27.656570
iter 80 value 27.594377
iter 90 value 27.562448
iter 100 value 27.517448
iter 110 value 27.495839
iter 120 value 27.338752
iter 130 value 27.241432
iter 140 value 27.226825
iter 150 value 27.164398
iter 160 value 27.148384
iter 170 value 27.107991
iter 180 value 27.104630
iter 190 value 27.100312
iter 200 value 27.087177
iter 210 value 27.055133
iter 220 value 27.043583
iter 230 value 27.036930
iter 240 value 27.029102
iter 250 value 27.026189
iter 260 value 27.025619
iter 270 value 27.025371
iter 280 value 27.024260
iter 290 value 27.023630
iter 300 value 27.021395
iter 310 value 27.019450
iter 320 value 27.018149
iter 330 value 27.017640
iter 340 value 27.012681
iter 350 value 26.998627
iter 360 value 26.994418
iter 370 value 26.993394
iter 380 value 26.990601
iter 390 value 26.989293
iter 400 value 26.989118
iter 410 value 26.986376
iter 420 value 26.985672
iter 430 value 26.983334
iter 440 value 26.981814
iter 450 value 26.981003
iter 460 value 26.978700
iter 470 value 26.978033
iter 480 value 26.977101
iter 490 value 26.975884
iter 500 value 26.973281
iter 510 value 26.972808
iter 520 value 26.967412
iter 530 value 26.965805
iter 540 value 26.964680
iter 550 value 26.962324
iter 560 value 26.961902
iter 570 value 26.961771
iter 580 value 26.961526
iter 590 value 26.958141
iter 600 value 26.953665
iter 610 value 26.950821
iter 620 value 26.949362
iter 630 value 26.945554
iter 640 value 26.940451
final value 26.940071
converged

Here’s an explanation of the code:

  1. library(nnet): Loads the nnet package, which provides functions for training neural network models.
  2. model <- nnet(y ~ x1 + x2, data = train_data, size = 5, maxit = 1000, linout = TRUE): Trains the neural network model for regression using the nnet() function.
    • y ~ x1 + x2: Specifies the formula representing the relationship between the response variable y and the predictor variables x1 and x2.
    • data = train_data: Specifies the training dataset from which the model will learn.
    • size = 5: Specifies the number of nodes in the single hidden layer of the neural network. In this case, the hidden layer will have 5 nodes.
    • maxit = 1000: Specifies the maximum number of iterations the optimizer may run (the default is 100). Training stops earlier if the optimizer converges, as it does here after about 640 iterations; otherwise it stops when this limit is reached.
    • linout = TRUE: Sets the output function of the neural network to be linear, indicating that the model is trained for regression instead of classification. By default, the nnet() function assumes a logistic output function for classification tasks, but here we explicitly set it to linear for regression.

The trained model is stored in the model object and can be used for making predictions on new data.
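The "# weights:  21" line in the training output can be reproduced by counting parameters: each of the 5 hidden nodes has 2 input weights plus a bias, and the output node has 5 weights plus a bias. The sketch below assumes the default of no skip-layer connections; toy and toy_model are hypothetical names, and the small dataset is a stand-in rather than the data used above.

```r
library(nnet)

n_inputs  <- 2  # x1 and x2
n_hidden  <- 5  # size = 5
n_outputs <- 1  # a single numeric response

# (inputs + bias) per hidden node, plus (hidden nodes + bias) per output node
expected_weights <- (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs
print(expected_weights)  # 3 * 5 + 6 * 1 = 21

# Fit a small stand-in model and compare with the fitted weight vector
set.seed(123)
toy <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
toy$y <- 2*toy$x1 + 3*toy$x2 + rnorm(50, sd = 0.5)

toy_model <- nnet(y ~ x1 + x2, data = toy, size = n_hidden,
                  maxit = 50, linout = TRUE, trace = FALSE)
print(length(toy_model$wts))
```

Setting trace = FALSE suppresses the iteration log, which is convenient once the model is known to converge.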

Step 4: Model Evaluation

Predictions are made on the test data using the trained model and the predict() function. The Root Mean Squared Error (RMSE) between the predicted and actual values is then calculated with the caret::RMSE() function. The RMSE serves as a measure of the model’s performance in regression tasks.

R




# Step 4: Model Evaluation
# Making predictions on the test data
predictions <- predict(model, newdata = test_data)
 
# Evaluating the model's performance
library(caret)
 
# Calculate the Root Mean Squared Error (RMSE)
rmse <- caret::RMSE(predictions, test_data$y)
 
# Print the RMSE
print(paste("Root Mean Squared Error (RMSE):", rmse))


Here’s an explanation of each line of code:

  1. predictions <- predict(model, newdata = test_data): Uses the trained model to make predictions on the test data. The predict() function takes the trained model and the new data (test_data) as input and returns the predicted values based on the model.
  2. library(caret): Loads the caret package, which provides functions for model evaluation and performance metrics.
  3. rmse <- caret::RMSE(predictions, test_data$y): Calculates the Root Mean Squared Error (RMSE) between the predicted values (predictions) and the actual response values (test_data$y) using the caret::RMSE() function.
  4. print(paste("Root Mean Squared Error (RMSE):", rmse)): Prints the calculated RMSE value. The paste() function is used to concatenate the string “Root Mean Squared Error (RMSE):” with the actual RMSE value (rmse), and the result is printed to the console.

Output:

[1] "Root Mean Squared Error (RMSE): 0.572527937783828"

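The RMSE is a common metric for evaluating regression models. It is the square root of the mean of the squared residuals (the differences between the predicted and actual values), so it is expressed in the same units as y and penalizes large errors more heavily than small ones. Here, the RMSE of about 0.57 is close to the standard deviation of the noise added in Step 1 (0.5), which suggests that the network has captured most of the underlying linear relationship.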
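To make the definition concrete, the RMSE can be computed by hand in base R. The values below are made up for illustration, and actual, predicted, errors, and manual_rmse are hypothetical names.

```r
# Toy values, chosen only to make the arithmetic easy to follow
actual    <- c(1, 2, 3)
predicted <- c(1.5, 2, 2.5)

# Residuals: 0.5, 0.0, -0.5
errors <- predicted - actual

# Square, average, then take the square root
manual_rmse <- sqrt(mean(errors^2))
print(manual_rmse)  # sqrt(0.5 / 3), roughly 0.408
```

The same expression, sqrt(mean((predictions - test_data$y)^2)), is an equivalent way to obtain the RMSE without caret.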

In this way, regression can be performed using the nnet package in R.

To conclude, we have learned how to perform classification and regression using the nnet package in R.



Neural Networks Using the R nnet Package

A neural network is a computational model inspired by the structure and function of the human brain. It consists of interconnected nodes, called neurons, organized into layers. The network receives input data, processes it through multiple layers of neurons, and produces an output or prediction.

The basic building block of a neural network is the neuron, which represents a computational unit. Each neuron takes input from other neurons or from the input data, performs a computation, and produces an output. The output of a neuron is typically determined by applying an activation function to the weighted sum of its inputs.
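The computation of a single neuron can be sketched in a few lines of R. The inputs, weights, and bias below are arbitrary illustrative values, and the sigmoid is just one possible activation function.

```r
# Sigmoid activation: maps any real number into the interval (0, 1)
sigmoid <- function(z) 1 / (1 + exp(-z))

inputs <- c(1, 2)         # values arriving from the previous layer
w      <- c(0.5, -0.25)   # one weight per input (illustrative values)
bias   <- 0

# Weighted sum of the inputs plus the bias: 0.5*1 + (-0.25)*2 + 0 = 0
z <- sum(w * inputs) + bias

# Apply the activation function to get the neuron's output
output <- sigmoid(z)
print(output)  # sigmoid(0) = 0.5
```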

A neural network typically consists of three types of layers:

  1. Input Layer: This layer receives the input data and passes it to the next layer. Each neuron in the input layer corresponds to a feature or attribute of the input data.
  2. Hidden Layers: These layers are placed between the input and output layers and perform computations on the data. Each neuron in a hidden layer takes input from the neurons in the previous layer and produces an output that is passed to the neurons in the next layer. Hidden layers enable the network to learn complex patterns and relationships in the data.
  3. Output Layer: This layer produces the final output or prediction of the neural network. The number of neurons in the output layer depends on the nature of the problem. For example, in a binary classification problem, there may be one neuron representing the probability of one class and another neuron representing the probability of the other class. In a regression problem, there may be a single neuron representing the predicted numerical value.
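The three layers can be illustrated with a hand-written forward pass for a network with 2 inputs, 5 hidden neurons, and 1 linear output, the same shape as the regression model in Example 2. This is a sketch with random, untrained weights; W_hidden, b_hidden, W_out, b_out, and forward are hypothetical names and do not mirror how nnet stores its parameters.

```r
sigmoid <- function(z) 1 / (1 + exp(-z))

set.seed(1)
W_hidden <- matrix(rnorm(5 * 2), nrow = 5, ncol = 2)  # 5 hidden neurons, 2 inputs each
b_hidden <- rnorm(5)                                  # one bias per hidden neuron
W_out    <- matrix(rnorm(1 * 5), nrow = 1, ncol = 5)  # 1 output neuron, 5 inputs
b_out    <- rnorm(1)

forward <- function(x) {
  # Hidden layer: weighted sums followed by a non-linear activation
  hidden <- sigmoid(W_hidden %*% x + b_hidden)
  # Output layer: linear (no activation), as is usual for regression
  as.numeric(W_out %*% hidden + b_out)
}

x_input <- c(0.3, -1.2)   # input layer: one value per feature
print(forward(x_input))
```

Because the weights are random, the printed value is meaningless as a prediction; training is what makes the output useful.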

Neural Network Using R nnet package

During training, the neural network adjusts the weights and biases associated with each neuron to minimize the difference between the predicted output and the true output. This is achieved using an optimization algorithm, such as gradient descent, which iteratively updates the weights and biases based on the error or loss between the predicted and actual outputs.
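The update rule can be illustrated with plain gradient descent on a model with no hidden layer, fitted to data generated the same way as in Example 2. This is a simplified sketch: the learning rate and step count are arbitrary choices, and nnet itself uses the BFGS optimizer rather than this basic loop.

```r
# Synthetic data: y depends on x1 and x2 with weights 2 and 3, plus noise
set.seed(123)
n <- 200
X <- cbind(x1 = rnorm(n), x2 = rnorm(n))
y <- as.numeric(X %*% c(2, 3)) + rnorm(n, sd = 0.5)

w <- c(0, 0)           # start from zero weights
learning_rate <- 0.1   # illustrative step size

for (step in 1:500) {
  pred <- as.numeric(X %*% w)                      # current predictions
  grad <- 2 * as.numeric(t(X) %*% (pred - y)) / n  # gradient of the mean squared error
  w <- w - learning_rate * grad                    # move against the gradient
}

print(round(w, 2))  # should end up close to the true weights 2 and 3
```

Each pass computes the error on the current predictions and nudges the weights in the direction that reduces it, which is the same principle a neural network applies to all of its weights and biases.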

The choice of activation function for the neurons is important, as it introduces non-linearity into the network. Common activation functions include the sigmoid function, ReLU (Rectified Linear Unit), and softmax. The activation function determines the output range of a neuron and affects the network’s ability to model complex relationships.
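The three activation functions mentioned above are short enough to write out directly in base R. These are minimal illustrative definitions, not functions exported by the nnet package.

```r
# Sigmoid: squashes each value into (0, 1)
sigmoid <- function(z) 1 / (1 + exp(-z))

# ReLU: keeps positive values and replaces negative values with 0
relu <- function(z) pmax(0, z)

# Softmax: turns a vector of scores into probabilities that sum to 1
softmax <- function(z) {
  shifted <- exp(z - max(z))  # subtracting the max avoids overflow
  shifted / sum(shifted)
}

z <- c(-2, 0, 2)
print(sigmoid(z))
print(relu(z))     # 0 0 2
print(softmax(z))
```

Sigmoid and ReLU act on each value independently, while softmax normalizes across the whole vector, which is why it is typically used in the output layer of multi-class classifiers.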

Neural networks can be applied to a wide range of tasks, including classification, regression, image recognition, natural language processing, and more. They have shown great success in many domains, but their performance depends on the quality and size of the training data, the network architecture, and the appropriate selection of hyperparameters.
