Summary of Common Stata Commands for Statistical Analysis
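I. Winsorizing extreme values
Range: extreme values are usually handled at the 1st and 99th percentiles. Values below the 1st percentile are set to the 1st-percentile value, and values above the 99th percentile are set to the 99th-percentile value.
1. Winsorizing a single variable in Stata:
In Stata 11.0, type "findit winsor" in the Command window; a window opens from which the winsor module can be installed. Alternatively, type the following in the Command window to install the winsor command:
    ssc install winsor
Once the module is installed, the winsor command can be called. Command format:
    winsor var1, gen(newvar) p(0.01)
The winsor command cannot process several variables in one call.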
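To make the 1%/99% rule concrete, the following is a minimal sketch that winsorizes one variable by hand using only built-in commands. The auto dataset shipped with Stata and the names price_w, cut_lo and cut_hi are assumptions chosen for illustration; they do not come from the winsor module.

```stata
* Hand-rolled winsorizing at the 1st and 99th percentiles (illustrative sketch)
sysuse auto, clear

* summarize, detail leaves the percentiles in r(p1) and r(p99)
summarize price, detail
scalar cut_lo = r(p1)
scalar cut_hi = r(p99)

* Copy the variable, then pull the tails in to the cut points
generate price_w = price
replace price_w = cut_lo if price < cut_lo
replace price_w = cut_hi if price > cut_hi & !missing(price)

* Compare the original and winsorized versions
summarize price price_w
```

The !missing() condition matters because Stata treats missing values as larger than any number, so without it missing observations would be overwritten with the upper cut point.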
2. Winsorizing several variables in one call:
Open http://personal.anderson.ucla.edu/judson.caskey/data.html, find winsorizeJ, right-click it, and save it to the ado/plus/ directory of your Stata installation. Command format:
    winsorizeJ var1 var2 var3, suffix(w)
This generates three new variables, var1w, var2w and var3w, winsorized by default at the top and bottom 1%. To change the cut points, write:
    winsorizeJ var1 var2 var3, suffix(w) cuts(5 95)
3. Winsorizing in Excel: (omitted)
Using the winsor2 command
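Introduction: winsor2 winsorizes or trims (if the trim option is specified) the variables in varlist at the percentiles specified by the option cuts(# #). By default, new variables are generated with a suffix ("_w" when winsorizing, "_tr" when trimming), which can be changed with the suffix() option. The replace option replaces the variables with their winsorized or trimmed ones.
Improvements over the winsor command:
(1) It can process several variables in one call;
(2) It can trim as well as winsorize;
(3) It adds a by() option, so winsorizing or trimming can be done within groups;
(4) It adds a replace option, which overwrites the original variables instead of generating new ones.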
Examples:
*- winsorize at (p1 p99), get new variable "wage_w"
. sysuse nlsw88, clear
. winsor2 wage
*- left-trimming at the 2nd percentile
. winsor2 wage, cuts(2 100) trim
*- winsorize variables by (industry south), overwrite the old variables
. winsor2 wage hours, replace by(industry south)
Usage:
1. Place winsor2.ado and winsor2.sthlp in the folder stata12\ado\base\w;
2. Type help winsor2 to view the help file.
II. Descriptive statistics
1. summarize
Command format: su, sum or summarize [varlist] [if] [in] [weight] [, options]
If no variable follows summarize (or sum), descriptive statistics are reported for every variable in the dataset.
Options:
detail reports more detailed statistics;
separator(n) draws a separator line after every n variables; n=0 suppresses the separator lines.
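The options above can be tried on the auto dataset that ships with Stata. This is only an illustrative sketch; the dataset and the variables price, mpg, weight, length and turn are assumptions, not part of the original notes.

```stata
* summarize with and without options (illustrative sketch)
sysuse auto, clear

* No varlist: every variable in the dataset is summarized
summarize

* detail adds percentiles, variance, skewness and kurtosis
summarize price mpg weight, detail

* A separator line after every 2 variables, then no separators at all
summarize price mpg weight length turn, separator(2)
summarize price mpg weight length turn, separator(0)

* Results are left in r(); here for the single variable just summarized
summarize price
display "mean price = " r(mean) ", sd = " r(sd)
```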
The summarize output table reports the number of observations, the mean, the standard deviation, the minimum and the maximum.
2. tabstat
Command format: tabstat [varlist] [if] [in] [weight] [, options]
Options:
stat(statname) sets the statistics to report;
col(stat) or c(s) transposes the table, so that the statistics appear in columns.
Statistics:
mean: mean
count/n: number of observations
sum: sum
max/min: maximum/minimum
range: range
sd: standard deviation
cv: coefficient of variation
semean: standard error of the mean
skewness: skewness
var: variance
kurtosis: kurtosis
median/p50: median
p#: #th percentile
Example: tabstat [varlist], stat(count mean sd median min max range) col(stat)
3. Exporting descriptive statistics to Word or Excel
Descriptive statistics produced with sum:
    logout, save(miaoshutongji) word replace: sum
Descriptive statistics produced with tabstat:
    logout, save(miaoshutongji) word replace: tabstat [varlist], stat(count mean sd median min max range) col(stat)
Statistics by group: bysort var:
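As a hedged illustration of the tabstat options and of grouped statistics, the sketch below uses the auto dataset shipped with Stata; the variable names and the grouping variable foreign are assumptions made for the example. The logout export itself is left out because it requires the community-contributed logout package.

```stata
* tabstat and grouped descriptive statistics (illustrative sketch)
sysuse auto, clear

* One row per variable, one column per statistic
tabstat price mpg weight, stat(count mean sd median min max range) col(stat)

* The same idea within groups, using tabstat's own by() option
tabstat price mpg weight, stat(mean sd) by(foreign)

* The bysort prefix repeats any command for each group
bysort foreign: summarize price mpg
```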
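III. Correlation analysis
(I) Correlation analysis
1. Pearson correlation coefficients. Command format: correlate (abbreviated cor or corr) [varlist] [if] [in] [weight] [, options]
2. Spearman correlation coefficients. Command format: spearman [varlist], stats(rho p)
3. In Stata, corr computes the covariance or correlation matrix of a set of variables;
4. pwcorr computes the pairwise correlation coefficients of a set of variables and can also test their significance; adding the sig option displays the significance level: pwcorr [varlist], sig
5. pcorr computes the partial correlation coefficients of the first variable with each of the others, holding the remaining variables constant, and tests their significance.
6. Command that reports Spearman and Pearson results in one table: corrtbl [varlist], corrvars([varlist])
In the output, the upper triangle holds the Spearman correlation coefficients and significance levels, and the lower triangle holds the Pearson coefficients and significance levels.
(II) Exporting the correlation table to Word or Excel
Example: logout, save(mytable) word replace: pwcorr_a price mpg rep78 headroom trunk, star1(0.01) star5(0.05) star10(0.1)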
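The built-in correlation commands listed above can be compared side by side. The following is a small sketch on the auto dataset shipped with Stata; the dataset and variables are assumptions, and corrtbl, pwcorr_a and logout are omitted because they are community-contributed and must be installed separately.

```stata
* Built-in correlation commands (illustrative sketch)
sysuse auto, clear

* Pearson correlation matrix, then the covariance matrix
correlate price mpg weight
correlate price mpg weight, covariance

* Pairwise correlations with p-values; star(0.05) flags significant ones
pwcorr price mpg weight, sig star(0.05)

* Spearman rank correlations with their p-values
spearman price mpg weight, stats(rho p)

* Partial correlations of price with mpg and with weight
pcorr price mpg weight
```

Note that correlate drops every observation with a missing value in any listed variable, while pwcorr uses all observations available for each pair, so the two can differ when data are missing.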
IV. Single-equation linear regression on cross-sectional data in Stata
Command format: regress (abbreviated reg) depvar indepvars [if] [in] [weight] [, options] (depvar is the dependent variable; indepvars are the independent variables)
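A minimal sketch of the regress syntax, assuming the auto dataset shipped with Stata; the choice of price as the dependent variable and of the regressors is purely illustrative.

```stata
* OLS regression and its stored results (illustrative sketch)
sysuse auto, clear

* depvar first, then the indepvars
regress price mpg weight foreign

* The same model restricted with an if condition
regress price mpg weight if foreign == 0

* Estimation results are kept in e() and in _b[]
display "R-squared      = " e(r2)
display "coefficient on mpg = " _b[mpg]
```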
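V. Testing for and handling heteroskedasticity
1. Testing for heteroskedasticity. Command format: hettest (estat hettest in current Stata versions)
2. Criterion for judging heteroskedasticity:
Judge by the p-value. If the p-value is below 0.05, the null hypothesis of constant variance is rejected and heteroskedasticity cannot be ruled out. In the example output, the p-value is 0.4584 > 0.05, so heteroskedasticity can be ruled out.
3. Handling heteroskedasticity. Command format: append ", r" or ", robust" to the reg command. A regression with robust standard errors does not display the adjusted R2 (adj-R2); to see it, type afterwards: di e(r2_a)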
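The test-then-correct sequence described above might look as follows. This is a sketch on the auto dataset shipped with Stata, with an illustrative model; the p-values it produces are unrelated to the 0.4584 quoted in the text.

```stata
* Heteroskedasticity test and robust standard errors (illustrative sketch)
sysuse auto, clear

* Fit by OLS, then run the Breusch-Pagan / Cook-Weisberg test
regress price mpg weight
estat hettest
display "p-value of the test = " r(p)

* Refit with heteroskedasticity-robust standard errors
regress price mpg weight, robust

* The adjusted R-squared is not shown in the header but is still stored
display "adjusted R-squared = " e(r2_a)
```

The robust option changes only the standard errors; the coefficient estimates are the same as in the first regression.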
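VI. Multicollinearity (high correlation among the independent variables)
Command format: vif
(I) Criteria for judging multicollinearity (both must hold at the same time):
1. The largest VIF is greater than 10;
2. The mean VIF is greater than 1.
(II) Correcting multicollinearity
1. Use stepwise regression. Command format: sw reg depvar indepvar, pr(0.05)
2. For a model with a quadratic term, "centering" keeps the quadratic term while mitigating multicollinearity to some extent. Define two new variables, the variable minus its mean and the square of that centered variable, as follows:
sum var
gen var1=var-r(mean)
gen var2=var1^2
Then run the regression with the new variables in place of the original ones.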
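To see what centering does to the VIFs, here is a hedged sketch on the auto dataset shipped with Stata. The variable weight and the generated names weight2, weight_c and weight_c2 are assumptions for illustration, and the stepwise line uses the prefix form of the command rather than the sw reg form quoted above.

```stata
* VIF before and after centering a quadratic term (illustrative sketch)
sysuse auto, clear

* Raw variable and its square: typically very high VIFs
generate weight2 = weight^2
regress price weight weight2
estat vif

* Center first, then square the centered variable
summarize weight
generate weight_c = weight - r(mean)
generate weight_c2 = weight_c^2
regress price weight_c weight_c2
estat vif

* Backward stepwise selection, removing terms with p >= 0.05
stepwise, pr(0.05): regress price mpg weight length turn
```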
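VII. Testing for and handling endogeneity (endogeneity means that an independent variable is related to the error term)
1. Testing for endogeneity: ovtest
Judge by the p-value. If the p-value is below 0.05, endogeneity cannot be ruled out. In the example output, the p-value is 0.4717 > 0.05, so endogeneity can be ruled out. (ovtest is the Ramsey RESET test, so it speaks only to omitted variables.)
2. Handling endogeneity: use the instrumental-variables method: ivreg (ivregress in current Stata versions)
The three sources of endogeneity are measurement error, omitted variables, and two-way causality.
1. Endogeneity of a variable.
This cannot be tested on its own. When suitable instruments are available it can be tested, namely with the Hausman test.
2. Exogeneity of the instruments.
This cannot be tested either. When there are many instruments, one can test whether some of them are not exogenous; this is the "overidentification" question.
3. Relevance of the instruments.
This is the "weak instruments" problem. It can be checked with the first-stage F statistic, and the partial R2 can also be used.
4. Estimation methods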
Stata offers several estimators: 2sls, 2sls with the small option, liml and gmm. Each suits a different situation: small suits small samples, liml suits weak instruments, and gmm suits heteroskedasticity.
[Example]
webuse hsng2
*Fit a regression via 2SLS, requesting small-sample statistics
ivregress 2sls rent pcturban (hsngval = faminc i.region), small
*Fit a regression using the LIML estimator
ivregress liml rent pcturban (hsngval = faminc i.region)
*Fit a regression via GMM using the default heteroskedasticity-robust weight matrix
ivregress gmm rent pcturban (hsngval = faminc i.region)
*Fit a regression via GMM using a heteroskedasticity-robust weight matrix, requesting nonrobust standard errors
ivregress gmm rent pcturban (hsngval = faminc i.region), vce(unadjusted)
*Tests
estat firststage, all forcenonrobust   // shows the first-stage F statistic and the partial R2
estat overid   // overidentification test
estat endogenous   // tests whether the instrumented regressor is endogenous
regress rent pcturban hsngval
est store m1
ivregress 2sls rent pcturban (hsngval = faminc i.region)
est store m2
hausman m2 m1   // endogeneity test: consistent estimator (IV) first, efficient estimator (OLS) second
VIII. Regression analysis of systems of linear equations
Command format: sureg (depvar1 varlist1) (depvar2 varlist2) ... (depvarN varlistN) [if] [in] [weight]
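A minimal sketch of the sureg syntax, assuming the auto dataset shipped with Stata. The three equations are illustrative only and are not taken from the original notes.

```stata
* Seemingly unrelated regression with three equations (illustrative sketch)
sysuse auto, clear

* Each parenthesis holds one equation: dependent variable first
sureg (price foreign weight length) ///
      (mpg foreign weight)          ///
      (displacement foreign weight)

* Test a cross-equation hypothesis: foreign has no effect in any equation
test foreign
```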
IX. Simultaneous equations
Command format: reg3 (depvar1 varlist1) (depvar2 varlist2) ... (depvarN varlistN) [if] [in] [weight]
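As a hedged illustration of reg3, the sketch below uses the klein example dataset, which webuse downloads from the Stata Press website (an internet connection is assumed). The two-equation model is an assumption for the example and does not appear in the original notes.

```stata
* Three-stage least squares on a two-equation system (illustrative sketch)
webuse klein, clear

* consump and wagepriv are each other's regressors, so both are endogenous;
* reg3 treats every left-hand-side variable as endogenous by default
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1)
```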
X. Fixed effects and random effects for panel data
xtset (declares the panel structure)
Fixed-effects command format: xtreg depvar indepvars [if] [in], fe [FE_options]
Random-effects command format: xtreg depvar indepvars [if] [in], re [RE_options]
Hausman test: fixed effects or random effects?
[Example]
xtreg y var1 var2 var3, fe
est store fe
xtreg y var1 var2 var3, re
est store re
hausman fe re, sigmamore
hausman fe re, sigmaless
*sigmamore uses the variance from the efficient estimator, i.e. re
*sigmaless uses the variance from the consistent estimator, i.e. fe
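The example above uses placeholder variable names. A runnable version might look like the sketch below, assuming the nlswork example panel that webuse downloads from the Stata Press website (an internet connection is assumed); the choice of ln_wage, age, tenure and hours is illustrative.

```stata
* Fixed vs. random effects with a Hausman test (illustrative sketch)
webuse nlswork, clear

* Declare the panel: idcode identifies the unit, year the time period
xtset idcode year

* Fixed effects (consistent under both hypotheses)
xtreg ln_wage age tenure hours, fe
estimates store fe

* Random effects (efficient if the effects are uncorrelated with the regressors)
xtreg ln_wage age tenure hours, re
estimates store re

* Consistent estimator first, efficient estimator second
hausman fe re, sigmamore
```

A small p-value from hausman rejects the random-effects assumption and points toward the fixed-effects model.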
XI. Exporting Stata regression results
1. Type ssc install estout in the Command window to install the esttab command (esttab is part of the estout package).
2. Run the regression with reg.
3. esttab using filename.rtf writes the regression results in Word format; changing the extension to .xls or .csv writes them in a format for Excel. The output contains the variable names with the corresponding regression coefficients, t statistics and significance stars. The default significance levels are 0.001, 0.01 and 0.05; to change them to 0.01, 0.05 and 0.1, type:
esttab m1 m2 using aaa.rtf, star(* 0.10 ** 0.05 *** 0.01)
4. Exporting several regressions at once: after each regression, store the results with est store m1. The name m1 is yours to choose; the first model is called m1 here, the second is stored with est store m2, and so on. Finally run: esttab m1 m2 ... using test.rtf
esttab m11111 using aaaaa.rtf, star(* 0.10 ** 0.05 *** 0.01) b(%6.4f)
5. outreg2 can export regression results to Word, Excel, LaTeX and other formats, and the layout can be adjusted as needed:
ssc install outreg2
sysuse auto, clear
reg [varlist]
est store m1
outreg2 [m1] using test.doc, replace
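Putting the store-then-export steps together, a sketch might look like this. It assumes the estout package is already installed from SSC and uses the auto dataset shipped with Stata; the models and the file name test.rtf are illustrative.

```stata
* Store two models and export them side by side (illustrative sketch)
* Assumes: ssc install estout has been run once
sysuse auto, clear

regress price mpg
estimates store m1

regress price mpg weight foreign
estimates store m2

* Preview in the Results window, with standard errors instead of t statistics
esttab m1 m2, se star(* 0.10 ** 0.05 *** 0.01) b(%6.4f)

* Write the same table to an RTF file that Word can open
esttab m1 m2 using test.rtf, replace star(* 0.10 ** 0.05 *** 0.01) b(%6.4f)
```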
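XII. Merging samples (collapsing several observations that share the same key into one)
Command format: duplicates drop varlist, force
For example, to treat several mergers carried out by the same firm on the same day as one event, merge them on the security code and announcement date keys. Only the first observation of each duplicate group is kept. Command: duplicates drop company_id event_date, force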
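The effect of duplicates drop is easiest to see on a tiny dataset. In the sketch below the four observations and the variable deal_value are made up for illustration; only the key names company_id and event_date come from the text.

```stata
* Collapse same-firm, same-day events into one observation (illustrative sketch)
clear
input str6 company_id str10 event_date deal_value
"000001" "2015-03-02" 10
"000001" "2015-03-02" 25
"000002" "2015-03-02" 7
"000002" "2016-07-15" 12
end

* Report how many observations share the same key
duplicates report company_id event_date

* Keep the first observation of each key, drop the rest
duplicates drop company_id event_date, force

list, clean
```

Because the dropped rows can differ in other variables (here deal_value), force is required, and the information in those rows is lost; aggregate first with collapse if the values need to be combined.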
XIII. t test of means
Command format: ttest CAR1 == CAR2, unpaired
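A small sketch of the two-sample form follows. The six values per variable are invented purely to make the command runnable and carry no empirical meaning; only the names CAR1 and CAR2 come from the text.

```stata
* Unpaired two-sample t test on two variables (illustrative sketch)
clear
input CAR1 CAR2
 0.021  0.005
 0.013 -0.002
-0.004  0.011
 0.030  0.001
 0.017 -0.008
 0.009  0.004
end

* Treat the two columns as independent samples with equal variances
ttest CAR1 == CAR2, unpaired

* The same test without assuming equal variances
ttest CAR1 == CAR2, unpaired unequal
```

When the two groups are stored in one variable with a separate group indicator, the equivalent form is ttest var, by(groupvar).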
XIV. Z test of medians (nonparametric Wilcoxon rank-sum test)
Command format: ranksum var, by(groupvar)
groupvar is the grouping variable.
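A minimal sketch of the rank-sum test, assuming the auto dataset shipped with Stata, with mpg as the tested variable and foreign as the two-level grouping variable (both chosen only for illustration).

```stata
* Wilcoxon rank-sum (Mann-Whitney) test between two groups (illustrative sketch)
sysuse auto, clear

* foreign takes exactly two values, as ranksum requires
ranksum mpg, by(foreign)

* The z statistic is left behind in r(z)
display "z = " r(z)
```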