Coursera: Machine Learning-Andrew NG (Week 3) Quiz - Regularization - codemummy |online technical computer science platform.

These solutions are for reference only.

try to solve on your own

but if you get stuck in between than you can refer these solutions

there are different set of questions ,

we have provided the variations in particular question at the end.

read questions carefully before marking

-----------------------------------------------------------------------------------------

   Adding a new feature to the model always results in equal or better performance on the training set.   

  Introducing regularization to the model always results in equal or better performance on the training set.  

  Adding many new features to the model helps prevent overfitting on the training set.   

  Introducing regularization to the model always results in equal or better performance on examples not in the training set.   

EXPLANATION:

Adding a new feature to the model always results in equal or better performance on the training set. (True )
Adding many new features gives us more expressive models which are able to better fit our training set. If too many new features are added, this can lead to overfitting of the training set.

Introducing regularization to the model always results in equal or better performance on the training set. (False)
=>If we introduce too much regularization, we can underfit the training set and have worse performance on the training set.

Adding many new features to the model helps prevent overfitting on the training set. (False)
=>Adding many new features gives us more expressive models which are able to better fit our training set. If too many new features are added, this can lead to overfitting of the training set.

Introducing regularization to the model always results in equal or better performance on examples not in the training set.(False)
=>If we introduce too much regularization, we can underfit the training set and this can lead to worse performance even for examples not in the training set.

Question 2

        Suppose you ran logistic regression twice, once with \lambda =         0λ=0, and once with \lambda = 1λ=1.  One of the times, you got
        parameters \theta = 
θ=[81.4712.69​], and the other time you got
        \theta 
θ=[13.010.91​].  However, you forgot which value of
        \lambdaλ corresponds to which value of \thetaθ.  Which one do you
        think corresponds to \lambda = 1λ=1?

1 point

θ=[81.4712.69​]

\theta = 
θ=[13.010.91​]

EXPLANATION:

$θ = [13.01 0.91]$

When λ is set to 1, We use regularization to penalize large value of θ. Thus, the parameter, θ, obtained will in general have smaller values.

variation to the above 2nd question is provided at the end

   Using a very large value of \lambdaλ cannot hurt the performance of your hypothesis; the only reason we do not set \lambdaλ to be too large is to avoid numerical problems.  

  Consider a classification problem.  Adding regularization may cause your classifier to incorrectly classify some training examples (which it had correctly classified when not using regularization, i.e. when \lambda = 0λ=0).  

  Using too large a value of \lambdaλ can cause your hypothesis to overfit the data; this can be avoided by reducing \lambdaλ.   

  Because logistic regression outputs values 0 \leq h_\theta(x) \leq 10≤hθ​(x)≤1, its range of output values can only be "shrunk" slightly by regularization anyway, so regularization is generally not helpful for it.   

variation to the above 3 rd question is provided at the end

Figure:

Figure:

Figure:

Figure:

EXPLANATION:

The hypothesis follows the data points very closely and is highly complicated, indicating that it is overfitting the training set

Figure:

Figure:

Figure:

Figure:

EXPLANATION:
The hypothesis does not predict many data points well, and is thus underfitting the training set.

------------------------------------------------------------------------

variations in 3rd question:

   Using a very large value of \lambdaλ cannot hurt the performance of your hypothesis; the only reason we do not set \lambdaλ to be too large is to avoid numerical problems.  

  Because regularization causes J(\theta)J(θ) to no longer be convex, gradient descent may not always converge to the global minimum (when \lambda > 0λ>0, and when using an appropriate learning rate \alphaα).   

Because logistic regression outputs values 0 \leq h_\theta(x) \leq 10≤hθ​(x)≤1, its range of output values can only be "shrunk" slightly by regularization anyway, so regularization is generally not helpful for it.   

  Using too large a value of \lambdaλ can cause your hypothesis to underfit the data.   

EXPLANATION:

Using a very large value λ cannot hurt the performance of your hypothesis; the only reason we do not set to be too large is to avoid numerical problems. (False)
=>Using a very large value of λ can lead to underfitting of the training set.

Because regularization causes J(θ) to no longer be convex, gradient descent may not always converge to the global minimum (when λ > 0, and when using an appropriate learning rate α).(False)
=>Regularized logistic regression and regularized linear regression are both convex, and thus gradient descent will still converge to the global minimum.

Using too large a value of λ can cause your hypothesis to underfit the data.(True)
=>A large value of results in a large λ regularization penalty and thus a strong preference for simpler models which can underfit the data.

Because logistic regression outputs values 0 <= h0 <= 1, its range of output values can only be "shrunk" slighly by regularization anyway, so regularization is generally not helpful for it.(False)
=>None needed

variations in 2nd question:

Question 2

        Suppose you ran logistic regression twice, once with \lambda =         0λ=0, and once with \lambda = 1λ=1.  One of the times, you got
        parameters \theta = 
θ=[26.2965.41​], and the other time you got
        \theta 
θ=[2.751.32​].  However, you forgot which value of
        \lambdaλ corresponds to which value of \thetaθ.  Which one do you
        think corresponds to \lambda = 1λ=1?

1 point

θ=[26.2965.41​]

θ=[2.751.32​].

Question 2

        Suppose you ran logistic regression twice, once with \lambda =         0λ=0, and once with \lambda = 1λ=1.  One of the times, you got
        parameters \theta = 
θ=[74.81, 
       45.05], 
and the other time you got
θ=[1.37,
       0.51]  
  However, you forgot which value of
        \lambdaλ corresponds to which value of \thetaθ.  Which one do you
        think corresponds to \lambda = 1λ=1?

1 point

θ=[74.81,
       45.05]

θ=[1.37,
       0.51]  

---------------------------------------------------------------------------------

reference : coursera

Coursera: Machine Learning-Andrew NG (Week 3) Quiz - Regularization

Regularization

variations in 3rd question:

variations in 2nd question:

Top Coding Questions

Quantitative Aptitude

Popular Posts

Tags