Can the alpha, lambda values of a glmnet object output determine whether ridge or Lasso?


I fitted a glmnet model using caret's train(), where the trControl method is "cv" and the number of folds is 5. The bestTune values I obtained are alpha = 0.1 and lambda = 0.007688342. On inspecting the fitted object, I notice that the alpha values start from 0.1.

Can I infer that the method used is Lasso and not ridge because the alpha value is nonzero?

In general, can the values of alpha and lambda indicate which model is being used?

Tags: regression, generalized-linear-model, cross-validation, caret

asked 2 hours ago by red4life93

2 Answers
Absolutely! The $\alpha$ parameter can be adjusted to fit either a Lasso or a Ridge regression (or something in between). Recall that the loss function that Elastic Net minimizes is
$$\frac{1}{2N}\sum_{i=1}^{N}\left(y_i-\beta_0-x_i^T\beta\right)^2+\lambda\sum_{j=1}^{p}\left(\frac{1}{2}(1-\alpha)\beta_j^2+\alpha|\beta_j|\right).$$
Focus on the second sum (the one multiplied by $\lambda$). If you let $\alpha=1$, the first term inside this sum becomes $0$, and the whole function becomes exactly the function that Lasso minimizes (the Lasso loss function). If you let $\alpha=0$, the second term becomes $0$ and you are left with Ridge.

You can check the loss for Ridge and Lasso in this book and for elastic net in this paper.







answered 2 hours ago by Bananin
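
To make the two limiting cases concrete, here is a minimal Python sketch that evaluates only the penalty term $\lambda\sum_j\left(\frac{1}{2}(1-\alpha)\beta_j^2+\alpha|\beta_j|\right)$ from the formula in the answer. It is not glmnet or caret code: `elastic_net_penalty` is a hypothetical helper, and the coefficients and the value of lambda are made up purely to show the arithmetic.

```python
def elastic_net_penalty(beta, alpha, lam):
    """Return lam * sum(0.5 * (1 - alpha) * b**2 + alpha * |b|) over b in beta.

    Hypothetical helper for illustration only; it mirrors the penalty term
    of the elastic net objective and is not part of glmnet or caret.
    """
    ridge_part = 0.5 * (1.0 - alpha) * sum(b * b for b in beta)
    lasso_part = alpha * sum(abs(b) for b in beta)
    return lam * (ridge_part + lasso_part)


# Made-up coefficients and lambda, chosen only to make the sums easy to check.
beta = [1.0, -2.0, 0.5]
lam = 0.5

# alpha = 1: only the absolute-value term survives (Lasso penalty).
lasso_penalty = elastic_net_penalty(beta, alpha=1.0, lam=lam)
# alpha = 0: only the squared term survives (Ridge penalty).
ridge_penalty = elastic_net_penalty(beta, alpha=0.0, lam=lam)
# alpha = 0.1, as in the question: mostly the squared term, plus a small L1 part.
mixed_penalty = elastic_net_penalty(beta, alpha=0.1, lam=lam)

print(lasso_penalty, ridge_penalty, mixed_penalty)
```

With these numbers the sum of absolute values is 3.5 and the sum of squares is 5.25, so the three penalties are 0.5 * 3.5, 0.5 * 0.5 * 5.25, and 0.5 * (0.45 * 5.25 + 0.1 * 3.5) respectively.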











• This looks like a good answer, but can you edit to include citations for the hyperlinks? Over time, links die. – Sycorax, 2 hours ago

As far as I understand glmnet, $\alpha=0$ would actually be a ridge penalty, and $\alpha=1$ would be a Lasso penalty (rather than the other way around), and glmnet can fit both of those end cases.

The penalty with $\alpha=0.1$ would be fairly similar to the ridge penalty, but it is not the ridge penalty. If the search is not considering $\alpha$ below $0.1$, you can't necessarily infer much more than that just from the fact that the selected value sat at that endpoint. If you know that a slightly larger $\alpha$ was worse, then a wider range would likely have chosen a smaller $\alpha$, but that doesn't suggest it would have been $0$; I expect it would not. If the grid of values is coarse, it may well be that a value larger than $0.1$ would be better.

[You may want to check whether there was some other reason that $\alpha$ might have been at an endpoint; e.g. I seem to recall $\lambda$ got set to an endpoint in forecasting if coefficients for lambdaOpt were not saved.]







answered 2 hours ago by Glen_b (edited 2 hours ago)
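
One way to see why $\alpha=0.1$ is close to ridge without being ridge is the textbook special case of a single predictor scaled to unit variance. There, the penalized coefficient has a closed form: soft-threshold the unpenalized estimate by $\lambda\alpha$, then divide by $1+\lambda(1-\alpha)$. The Python sketch below assumes that special case only; `soft_threshold` and `elastic_net_1d` are hypothetical helpers, and the numbers are invented (lambda is far larger than the 0.0077 in the question so that the effect is visible).

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; return exactly 0.0 when |z| <= gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_1d(b_ols, alpha, lam):
    """Minimize 0.5*(beta - b_ols)**2 + lam*(0.5*(1 - alpha)*beta**2 + alpha*|beta|).

    One-predictor illustration with the predictor scaled to unit variance;
    this is an assumption of the sketch, not what glmnet does on a full
    design matrix.
    """
    return soft_threshold(b_ols, lam * alpha) / (1.0 + lam * (1.0 - alpha))


lam = 0.5
small, large = 0.04, 2.0  # made-up unpenalized estimates

ridge_small = elastic_net_1d(small, alpha=0.0, lam=lam)  # shrunk, but not zero
enet_small = elastic_net_1d(small, alpha=0.1, lam=lam)   # 0.04 <= 0.5 * 0.1, so exactly zero
ridge_large = elastic_net_1d(large, alpha=0.0, lam=lam)
enet_large = elastic_net_1d(large, alpha=0.1, lam=lam)   # close to the ridge value

print(ridge_small, enet_small, ridge_large, enet_large)
```

In this toy case the large coefficient is shrunk almost identically under $\alpha=0$ and $\alpha=0.1$, while the small one is set exactly to zero only when $\alpha>0$. That is the sense in which a small positive $\alpha$ behaves much like ridge but still carries a Lasso component.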



















