Lab session: Multivariate Analysis Applied to the Social Sciences (29/10/2018)

Objective: learn to fit and interpret linear regressions in Stata.

Preliminary steps

  1. Open the file PISA2012.dta, located under c:\Users\Alumno\Desktop\Analisis (the commands below change to its Datos subfolder).
  2. Turn off output pausing so that results scroll without stopping (set more off).
  3. Request the means of the variables to check for problems.
  4. We suspect the sample is not properly weighted, so we apply the weight [weight=peso].
  5. Once the weighted results look right, define a global macro ($p) so that [weight=peso] does not have to be typed in every regression. Stata treats it as an analytic weight (aweight), as the "(analytic weights assumed)" notes in the output show.
  6. Open the log file for the results (Practica6).

cd "c:\Users\Alumno\Desktop\Analisis\Datos"
use "PISA2012.dta", clear
set more off
summarize lectura-ciencias estatus edad
summarize lectura-ciencias estatus edad [weight=peso]
global p [weight=peso]
log using Practica6, replace
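
As a quick check on step 5, the sketch below shows what the global macro holds. Stata replaces $p with its contents before running a command, so a regression written with $p is read as if the weight had been typed out. Nothing here goes beyond the names already used above.

```stata
* Define the macro exactly as in the session above
global p [weight=peso]
* Show its contents: prints [weight=peso]
display "$p"
* Stata substitutes the text before running a command, so
*     regress mates $p
* is read as
*     regress mates [weight=peso]
```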

Examining the dependent variable

  1. It is advisable to examine the dependent variable first. A regression with no predictors returns its weighted mean as the constant (_cons).
. regress mates $p
(analytic weights assumed)
(sum of wgt is 25312.9756941)

      Source |       SS           df       MS      Number of obs   =    25,313
-------------+----------------------------------   F(0, 25312)     =      0.00
       Model |           0         0           .   Prob > F        =         .
    Residual |   193003106    25,312  7624.96469   R-squared       =    0.0000
-------------+----------------------------------   Adj R-squared   =    0.0000
       Total |   193003106    25,312  7624.96469   Root MSE        =    87.321

------------------------------------------------------------------------------
       mates |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   484.6267   .5488417   883.00   0.000      483.551    485.7025
------------------------------------------------------------------------------
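
The standard error of _cons in this no-predictor model can be reproduced by hand: it is the Root MSE divided by the square root of the number of observations. A minimal sketch, with both figures copied from the output above; the scalar names are illustrative and not part of the original session.

```stata
* Figures copied from the regress output above
scalar rmse_null = 87.321
scalar n_null    = 25313
* Standard error of the mean = Root MSE / sqrt(N)
scalar se_null = scalar(rmse_null)/sqrt(scalar(n_null))
* scalar() makes Stata read these names as scalars, not as variable abbreviations
display %9.6f scalar(se_null)
```

The result is close to the reported Std. Err. of .5488417; the small gap comes from the rounding of Root MSE in the table.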

Simple regression

  1. A simple regression involves, besides the dependent variable, one other variable that acts as the independent variable.
  2. After a regression, predicted values can be requested over a range of values of the independent variable (margins).
  3. Once that range has been requested, the regression plot can be drawn (marginsplot).
. regress mates edad $p
(analytic weights assumed)
(sum of wgt is 25312.9756941)

      Source |       SS           df       MS      Number of obs   =    25,313
-------------+----------------------------------   F(1, 25311)     =     33.93
       Model |  258410.612         1  258410.612   Prob > F        =    0.0000
    Residual |   192744696    25,311  7615.05653   R-squared       =    0.0013
-------------+----------------------------------   Adj R-squared   =    0.0013
       Total |   193003106    25,312  7624.96469   Root MSE        =    87.264

------------------------------------------------------------------------------
       mates |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        edad |   11.03439   1.894217     5.83   0.000     7.321616    14.74717
       _cons |   309.6336   30.04518    10.31   0.000     250.7433    368.5239
------------------------------------------------------------------------------

. regress mates estatus $p
(analytic weights assumed)
(sum of wgt is 25091.6302818)

      Source |       SS           df       MS      Number of obs   =    25,121
-------------+----------------------------------   F(1, 25119)     =   4639.22
       Model |    29415541         1    29415541   Prob > F        =    0.0000
    Residual |   159269938    25,119   6340.6162   R-squared       =    0.1559
-------------+----------------------------------   Adj R-squared   =    0.1559
       Total |   188685479    25,120  7511.36462   Root MSE        =    79.628

------------------------------------------------------------------------------
       mates |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     estatus |   33.35637   .4897293    68.11   0.000     32.39647    34.31627
       _cons |   492.0091   .5109143   963.00   0.000     491.0077    493.0105
------------------------------------------------------------------------------

. margins, at(estatus=(-3(1)+3))

Adjusted predictions                            Number of obs     =     25,121
Model VCE    : OLS

Expression   : Linear prediction, predict()

1._at        : estatus         =          -3

2._at        : estatus         =          -2

3._at        : estatus         =          -1

4._at        : estatus         =           0

5._at        : estatus         =           1

6._at        : estatus         =           2

7._at        : estatus         =           3

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |     391.94   1.465118   267.51   0.000     389.0683    394.8117
          2  |   425.2964   1.019014   417.36   0.000      423.299    427.2937
          3  |   458.6527   .6402161   716.40   0.000     457.3979    459.9076
          4  |   492.0091   .5109143   963.00   0.000     491.0077    493.0105
          5  |   525.3655   .7693243   682.89   0.000     523.8575    526.8734
          6  |   558.7218   1.184211   471.81   0.000     556.4007     561.043
          7  |   592.0782    1.64089   360.83   0.000      588.862    595.2944
------------------------------------------------------------------------------
. marginsplot

  Variables that uniquely identify margins: estatus
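
The margins table for the estatus model is just the fitted line evaluated at each requested value: _cons + coefficient x estatus. The sketch below rebuilds it from the two coefficients copied from the regress output above. The scalar names b0 and b1 are illustrative.

```stata
* Coefficients copied from "regress mates estatus $p" above
scalar b0 = 492.0091
scalar b1 = 33.35637
* Evaluate the fitted line at estatus = -3, -2, ..., 3
forvalues x = -3/3 {
    display "estatus = `x'   predicted mates = " %9.4f (scalar(b0) + scalar(b1)*(`x'))
}
```

The values should match the Margin column up to rounding (for example, 391.94 at estatus = -3). With the model still in memory, the same quantity can be written as _b[_cons] + _b[estatus]*(-3).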

Multiple regression

  1. Multiple regression works almost exactly like simple regression; the only difference is that more than one independent variable is entered at once.
. regress mates edad estatus $p
(analytic weights assumed)
(sum of wgt is 25091.6302818)

      Source |       SS           df       MS      Number of obs   =    25,121
-------------+----------------------------------   F(2, 25118)     =   2334.70
       Model |  29577883.1         2  14788941.5   Prob > F        =    0.0000
    Residual |   159107596    25,118  6334.40545   R-squared       =    0.1568
-------------+----------------------------------   Adj R-squared   =    0.1567
       Total |   188685479    25,120  7511.36462   Root MSE        =    79.589

------------------------------------------------------------------------------
       mates |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        edad |   8.780896   1.734507     5.06   0.000      5.38116    12.18063
     estatus |    33.3163   .4895534    68.05   0.000     32.35675    34.27585
       _cons |   352.7425   27.51435    12.82   0.000     298.8127    406.6722
------------------------------------------------------------------------------

. margins, at(estatus=(-3(1)+3))

Predictive margins                              Number of obs     =     25,121
Model VCE    : OLS

Expression   : Linear prediction, predict()

1._at        : estatus         =          -3

2._at        : estatus         =          -2

3._at        : estatus         =          -1

4._at        : estatus         =           0

5._at        : estatus         =           1

6._at        : estatus         =           2

7._at        : estatus         =           3

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   392.0526    1.46457   267.69   0.000     389.1819    394.9232
          2  |   425.3689   1.018616   417.60   0.000     423.3723    427.3654
          3  |   458.6852   .6399346   716.77   0.000     457.4309    459.9395
          4  |   492.0015   .5106662   963.45   0.000     491.0006    493.0024
          5  |   525.3178    .769005   683.11   0.000     523.8105    526.8251
          6  |   558.6341   1.183758   471.92   0.000     556.3139    560.9543
          7  |   591.9504   1.640281   360.88   0.000     588.7353    595.1654
------------------------------------------------------------------------------
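
With edad in the model, the header changes from "Adjusted predictions" to "Predictive margins": each margin now averages the prediction over the observed values of edad. In a linear model without interactions this equals the prediction at the weighted mean of edad, so that mean can be backed out from the table. A rough sketch using figures copied from the output above; the scalar names are illustrative, and the implied mean is derived here rather than reported in the session.

```stata
* Margin at estatus = 0 and coefficients, copied from the output above
scalar m_at0  = 492.0015
scalar b_cons = 352.7425
scalar b_edad = 8.780896
* At estatus = 0: margin = _cons + b_edad * mean(edad), so solve for the mean
scalar edad_bar = (scalar(m_at0) - scalar(b_cons))/scalar(b_edad)
display %9.3f scalar(edad_bar)
```

The implied mean of edad is about 15.86, which is plausible for PISA's target population of 15-year-olds.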

Multiple regression with qualitative independent variables

  1. Stata makes it easy to include qualitative independent variables: just prefix them with i.
  2. Note where the qualitative variable goes in the margins command: before the comma.
. regress mates edad estatus i.sexo $p
(analytic weights assumed)
(sum of wgt is 25091.6302818)

      Source |       SS           df       MS      Number of obs   =    25,121
-------------+----------------------------------   F(3, 25117)     =   1653.54
       Model |  31119370.7         3  10373123.6   Prob > F        =    0.0000
    Residual |   157566109    25,117  6273.28537   R-squared       =    0.1649
-------------+----------------------------------   Adj R-squared   =    0.1648
       Total |   188685479    25,120  7511.36462   Root MSE        =    79.204

------------------------------------------------------------------------------
       mates |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        edad |   8.910174   1.726139     5.16   0.000     5.526841    12.29351
     estatus |   33.23059   .4872165    68.20   0.000     32.27562    34.18557
      2.sexo |   15.66931   .9996029    15.68   0.000     13.71003    17.62859
       _cons |   342.7397   27.38872    12.51   0.000     289.0562    396.4232
------------------------------------------------------------------------------

. margins sexo, at(estatus=(-3(1)+3))

Predictive margins                              Number of obs     =     25,121
Model VCE    : OLS

Expression   : Linear prediction, predict()

1._at        : estatus         =          -3

2._at        : estatus         =          -2

3._at        : estatus         =          -1

4._at        : estatus         =           0

5._at        : estatus         =           1

6._at        : estatus         =           2

7._at        : estatus         =           3

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    _at#sexo |
        1 1  |   384.3572   1.537942   249.92   0.000     381.3428    387.3717
        1 2  |   400.0265   1.543706   259.13   0.000     397.0008    403.0523
        2 1  |   417.5878     1.1287   369.97   0.000     415.3755    419.8001
        2 2  |   433.2571   1.131722   382.83   0.000     431.0389    435.4754
        3 1  |   450.8184   .8108143   556.01   0.000     449.2292    452.4076
        3 2  |   466.4877   .8082829   577.13   0.000     464.9034     468.072
        4 1  |    484.049   .7180784   674.09   0.000     482.6415    485.4565
        4 2  |   499.7183   .7075358   706.28   0.000     498.3315    501.1051
        5 1  |   517.2796    .921202   561.53   0.000      515.474    519.0852
        5 2  |   532.9489   .9070019   587.59   0.000     531.1711    534.7267
        6 1  |   550.5102   1.286992   427.75   0.000     547.9876    553.0328
        6 2  |   566.1795   1.272579   444.91   0.000     563.6852    568.6738
        7 1  |   583.7408   1.714306   340.51   0.000     580.3806    587.1009
        7 2  |   599.4101   1.700301   352.53   0.000     596.0774    602.7428
------------------------------------------------------------------------------

. marginsplot

  Variables that uniquely identify margins: estatus sexo
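
Because sexo enters the model without an interaction, the two predicted lines are parallel: at every value of estatus, the gap between sexo = 2 and sexo = 1 equals the 2.sexo coefficient. A minimal check with figures copied from the two tables above; the scalar names are illustrative.

```stata
* Coefficient of 2.sexo, copied from the regress output above
scalar b_sexo2 = 15.66931
* Gap between sexo = 2 and sexo = 1 at estatus = -3 and at estatus = 3 (margins table)
scalar gap_low  = 400.0265 - 384.3572
scalar gap_high = 599.4101 - 583.7408
display %8.4f scalar(gap_low) "  " %8.4f scalar(gap_high) "  " %8.4f scalar(b_sexo2)
```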

Multiple regression with qualitative independent variables (more than two values)

  1. The procedure is the same as in the previous section. Each tipo coefficient is the difference from the omitted category (Público).
. regress mates edad estatus i.tipo $p
(analytic weights assumed)
(sum of wgt is 25091.6302818)

      Source |       SS           df       MS      Number of obs   =    25,121
-------------+----------------------------------   F(4, 25116)     =   1268.75
       Model |  31717434.1         4  7929358.54   Prob > F        =    0.0000
    Residual |   156968045    25,116  6249.72309   R-squared       =    0.1681
-------------+----------------------------------   Adj R-squared   =    0.1680
       Total |   188685479    25,120  7511.36462   Root MSE        =    79.055

------------------------------------------------------------------------------
       mates |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        edad |   8.737364     1.7229     5.07   0.000     5.360379    12.11435
     estatus |   30.66627   .5084744    60.31   0.000     29.66963     31.6629
             |
        tipo |
 Concertado  |   19.33684   1.213247    15.94   0.000      16.9588    21.71487
    Privado  |   23.08329   1.824699    12.65   0.000     19.50678    26.65981
             |
       _cons |   346.2176   27.33237    12.67   0.000     292.6445    399.7906
------------------------------------------------------------------------------

. margins tipo, at(estatus=(-3(1)+3))

Predictive margins                              Number of obs     =     25,121
Model VCE    : OLS

Expression   : Linear prediction, predict()

1._at        : estatus         =          -3

2._at        : estatus         =          -2

3._at        : estatus         =          -1

4._at        : estatus         =           0

5._at        : estatus         =           1

6._at        : estatus         =           2

7._at        : estatus         =           3

-------------------------------------------------------------------------------
              |            Delta-method
              |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
     _at#tipo |
   1#Público  |   392.7874   1.461863   268.69   0.000     389.9221    395.6528
1#Concertado  |   412.1243   1.876526   219.62   0.000     408.4462    415.8024
   1#Privado  |   415.8707   2.436074   170.71   0.000     411.0959    420.6456
   2#Público  |   423.4537   1.021853   414.40   0.000     421.4508    425.4566
2#Concertado  |   442.7905   1.475812   300.03   0.000     439.8978    445.6832
   2#Privado  |    446.537   2.092503   213.40   0.000     442.4356    450.6384
   3#Público  |     454.12   .6844096   663.52   0.000     452.7785    455.4614
3#Concertado  |   473.4568   1.162662   407.22   0.000     471.1779    475.7357
   3#Privado  |   477.2032   1.827505   261.12   0.000     473.6212    480.7853
   4#Público  |   484.7862   .6401111   757.35   0.000     483.5316    486.0409
4#Concertado  |   504.1231   1.021097   493.71   0.000     502.1216    506.1245
   4#Privado  |   507.8695   1.678711   302.54   0.000     504.5791    511.1599
   5#Público  |   515.4525   .9317512   553.21   0.000     513.6262    517.2788
5#Concertado  |   534.7893   1.118297   478.22   0.000     532.5974    536.9812
   5#Privado  |   538.5358   1.677337   321.07   0.000     535.2481    541.8235
   6#Público  |   546.1187   1.357818   402.20   0.000     543.4573    548.7802
6#Concertado  |   565.4556   1.405571   402.30   0.000     562.7006    568.2106
   6#Privado  |    569.202   1.823716   312.11   0.000     565.6275    572.7766
   7#Público  |    576.785   1.826547   315.78   0.000     573.2049    580.3652
7#Concertado  |   596.1219   1.793812   332.32   0.000     592.6059    599.6378
   7#Privado  |   599.8683   2.086987   287.43   0.000     595.7777    603.9589
-------------------------------------------------------------------------------

. marginsplot

  Variables that uniquely identify margins: estatus tipo
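
The regress table compares each school type only with the omitted category, Público. The difference between the two listed categories follows by subtraction, and it shows up as the same gap in the margins table. A short sketch with figures copied from the output above; the scalar names are illustrative.

```stata
* tipo coefficients copied from the regress output above
scalar b_concertado = 19.33684
scalar b_privado    = 23.08329
* Privado minus Concertado, holding edad and estatus fixed
scalar d_priv_conc = scalar(b_privado) - scalar(b_concertado)
display %8.4f scalar(d_priv_conc)
* Same gap from the margins table at estatus = -3 (rows 1#Privado and 1#Concertado)
display %8.4f (415.8707 - 412.1243)
```

The gap is about 3.75 points. The subtraction gives no standard error; with the model still in memory, pwcompare tipo, effects should report each pairwise difference together with its test.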

Multiple regression with independent variables and interactions

  1. Here the variables are joined with ##, which enters both main effects and their interaction. Quantitative variables must be prefixed with c.
. regress mates edad c.estatus##i.tipo $p
(analytic weights assumed)
(sum of wgt is 25091.6302818)

      Source |       SS           df       MS      Number of obs   =    25,121
-------------+----------------------------------   F(6, 25114)     =    849.41
       Model |  31831111.3         6  5305185.22   Prob > F        =    0.0000
    Residual |   156854368    25,114  6245.69435   R-squared       =    0.1687
-------------+----------------------------------   Adj R-squared   =    0.1685
       Total |   188685479    25,120  7511.36462   Root MSE        =     79.03

--------------------------------------------------------------------------------
         mates |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
          edad |   8.738245   1.722603     5.07   0.000     5.361843    12.11465
       estatus |   31.77387   .6148906    51.67   0.000     30.56865    32.97909
               |
          tipo |
   Concertado  |   19.00198   1.215395    15.63   0.000     16.61973    21.38422
      Privado  |   26.16467    2.01236    13.00   0.000     22.22033    30.10901
               |
tipo#c.estatus |
   Concertado  |  -2.059645   1.206315    -1.71   0.088    -4.424092    .3048023
      Privado  |   -8.00395   1.935939    -4.13   0.000     -11.7985   -4.209396
               |
         _cons |   346.6318    27.3295    12.68   0.000     293.0644    400.1992
--------------------------------------------------------------------------------

. margins tipo, at(estatus=(-3(1)+3))

Predictive margins                              Number of obs     =     25,121
Model VCE    : OLS

Expression   : Linear prediction, predict()

1._at        : estatus         =          -3

2._at        : estatus         =          -2

3._at        : estatus         =          -1

4._at        : estatus         =           0

5._at        : estatus         =           1

6._at        : estatus         =           2

7._at        : estatus         =           3

-------------------------------------------------------------------------------
              |            Delta-method
              |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
     _at#tipo |
   1#Público  |   389.8928   1.718523   226.88   0.000     386.5244    393.2612
1#Concertado  |   415.0737   3.372838   123.06   0.000     408.4628    421.6847
   1#Privado  |   440.0693    6.65171    66.16   0.000     427.0316    453.1071
   2#Público  |   421.6667   1.164111   362.22   0.000      419.385    423.9484
2#Concertado  |   444.7879   2.404186   185.01   0.000     440.0756    449.5003
   2#Privado  |   463.8392   4.895276    94.75   0.000     454.2442    473.4343
   3#Público  |   453.4405   .7163574   632.98   0.000     452.0364    454.8446
3#Concertado  |   474.5022   1.529068   310.32   0.000     471.5051    477.4992
   3#Privado  |   487.6092   3.228421   151.04   0.000     481.2813    493.9371
   4#Público  |   485.2144   .6537302   742.22   0.000     483.9331    486.4958
4#Concertado  |   504.2164   1.024624   492.10   0.000     502.2081    506.2247
   4#Privado  |   511.3791   1.903202   268.69   0.000     507.6487    515.1095
   5#Público  |   516.9883    1.04773   493.44   0.000     514.9347    519.0419
5#Concertado  |   533.9306   1.384025   385.78   0.000     531.2178    536.6434
   5#Privado  |    535.149   1.887365   283.54   0.000     531.4497    538.8484
   6#Público  |   548.7622   1.588803   345.39   0.000      545.648    551.8763
6#Concertado  |   563.6448     2.2215   253.72   0.000     559.2906    567.9991
   6#Privado  |   558.9189   3.200408   174.64   0.000     552.6459    565.1919
   7#Público  |    580.536    2.16957   267.58   0.000     576.2835    584.7885
7#Concertado  |   593.3591   3.179381   186.63   0.000     587.1273    599.5908
   7#Privado  |   582.6888   4.864522   119.78   0.000     573.1541    592.2236
-------------------------------------------------------------------------------

. marginsplot

  Variables that uniquely identify margins: estatus tipo
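
In the interaction model the estatus coefficient is the slope for the omitted category (Público) only. The slope for each other school type is that coefficient plus the corresponding tipo#c.estatus term. The sketch below works this out from the coefficients copied from the output above; the scalar names are illustrative.

```stata
* Coefficients copied from the interaction model above
scalar b_est  = 31.77387
scalar b_conc = -2.059645
scalar b_priv = -8.00395
* Slope of estatus within each school type
scalar s_pub  = scalar(b_est)
scalar s_conc = scalar(b_est) + scalar(b_conc)
scalar s_priv = scalar(b_est) + scalar(b_priv)
display "Público:    " %8.4f scalar(s_pub)
display "Concertado: " %8.4f scalar(s_conc)
display "Privado:    " %8.4f scalar(s_priv)
* Cross-check against the margins table: rows 2#Privado minus 1#Privado
display %8.4f (463.8392 - 440.0693)
```

The slopes come out at roughly 31.77, 29.71 and 23.77, which is why the lines in the plot are no longer parallel. With the model still in memory, margins tipo, dydx(estatus) should report the same three slopes along with their standard errors.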

Wrapping up

  1. Save all the commands that worked in a do-file (code or syntax file).
  2. Finish by closing the log file and saving the changes to the data.
. log close
. save pisa2012r.dta, replace
file pisa2012r.dta saved

. exit
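
For step 1, the do-file could look roughly like the sketch below. It only collects the commands already run in this session, in order. The opening capture log close line is an addition, not part of the original session; it closes any log left open by an earlier run. The file name for the do-file itself is up to you.

```stata
* Do-file sketch: commands from this session, in the order they were run
capture log close
cd "c:\Users\Alumno\Desktop\Analisis\Datos"
use "PISA2012.dta", clear
set more off
global p [weight=peso]
log using Practica6, replace

* Dependent variable on its own, then simple regressions
regress mates $p
regress mates edad $p
regress mates estatus $p
margins, at(estatus=(-3(1)+3))
marginsplot

* Multiple regression
regress mates edad estatus $p
margins, at(estatus=(-3(1)+3))

* Qualitative independent variables
regress mates edad estatus i.sexo $p
margins sexo, at(estatus=(-3(1)+3))
marginsplot
regress mates edad estatus i.tipo $p
margins tipo, at(estatus=(-3(1)+3))
marginsplot

* Interaction between estatus and tipo
regress mates edad c.estatus##i.tipo $p
margins tipo, at(estatus=(-3(1)+3))
marginsplot

* Close the log and save the data under a new name
log close
save pisa2012r.dta, replace
```

Running the file from top to bottom reproduces the whole log, which is the main reason to keep a do-file rather than retyping commands.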

End of the lab session.

  1. Upload the do-file to Studium, under the assignment “Primer fichero de Stata”.

Universidad de Salamanca. Grado de Sociología. Análisis Multivariable Aplicado a las Ciencias Sociales. Modesto Escobar