Summary of Commonly Used Stata Commands for Statistical Analysis

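I. Winsorizing extreme values

Scope: extreme values are usually winsorized at the 1st and 99th percentiles. Values below the 1st percentile are set to the 1st-percentile value, and values above the 99th percentile are set to the 99th-percentile value.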
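The rule above can be sketched with built-in commands only. This is a minimal illustration, assuming the nlsw88 example dataset that ships with Stata; the names p01, p99 and wage_w are chosen here for illustration and do not come from the original text.

```stata
* Manual winsorizing at the 1st and 99th percentiles (illustrative sketch)
sysuse nlsw88, clear

* summarize, detail stores the 1st and 99th percentiles in r(p1) and r(p99)
summarize wage, detail
scalar p01 = r(p1)
scalar p99 = r(p99)

* Clamp wage to the interval [p01, p99]; missing values stay missing
generate double wage_w = min(max(wage, scalar(p01)), scalar(p99)) if !missing(wage)

* Compare the original and the winsorized variable
summarize wage wage_w
```

The observation count is unchanged; only the tails are pulled in to the cut points, which is what distinguishes winsorizing from trimming.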

1. Winsorizing a single variable in Stata:

In Stata 11.0, type "findit winsor" in the Command window; a window opens from which the winsor module can be installed. Alternatively, type "ssc install winsor" in the Command window. Once the module is installed, the winsor command can be called. Syntax:

    winsor var1, gen(newvar) p(0.01)

The winsor command cannot process several variables in one call.

2. Winsorizing several variables in batch:

Open http://personal.anderson.ucla.edu/judson.caskey/data.html, find winsorizeJ, right-click it, and save it to Stata's ado/plus/ directory. Syntax:

    winsorizeJ var1 var2 var3, suffix(w)

This generates three new variables, var1w, var2w and var3w. By default the variables are winsorized at the top and bottom 1%. To change the cut points, write:

    winsorizeJ var1 var2 var3, suffix(w) cuts(5 95)

3. Winsorizing in Excel: (omitted)

Using the winsor2 command

Introduction: winsor2 winsorizes or trims (if the trim option is specified) the variables in varlist at the percentiles specified by the option cuts(# #). By default, new variables are generated with the suffix "_w" or "_tr", which can be changed with the suffix() option. The replace option replaces the variables with their winsorized or trimmed versions.

Improvements over the winsor command:

(1) It can process several variables in one call;

(2) It can trim as well as winsorize;

(3) It adds a by() option, so winsorizing or trimming can be done within groups;

(4) It adds a replace option, so the original variables can be overwritten directly instead of generating new ones.

Examples:

    *- winsorize at (p1 p99), get new variable "wage_w"
    sysuse nlsw88, clear
    winsor2 wage

    *- left-trimming at the 2nd percentile
    winsor2 wage, cuts(2 100) trim

    *- winsorize variables by (industry south), overwriting the old variables
    winsor2 wage hours, replace by(industry south)

Installation and help:

1. Place winsor2.ado and winsor2.sthlp in the stata12\ado\base\w folder;
2. Type help winsor2 to view the help file.

II. Descriptive statistics

1. summarize

Syntax: su, sum, or summarize [varlist] [if] [in] [weight] [, options]

If no variable follows summarize (or sum), all variables in the dataset are summarized by default.

Options:
detail produces more detailed statistics.
separator(n) draws a separator line after every n variables; n=0 suppresses the separator lines.

The summarize output table contains the number of observations, the mean, the standard deviation, the minimum and the maximum.

2. tabstat

Syntax: tabstat [varlist] [if] [in] [weight] [, options]

Options:
stat(statname) sets the statistics to report.
col(stat) or c(s) transposes the results table so that the statistics appear in columns.

Statistics:
mean: mean
count/n: number of observations
sum: sum
max/min: maximum/minimum
range: range
sd: standard deviation
cv: coefficient of variation
semean: standard error of the mean
skewness: skewness
var: variance
kurtosis: kurtosis
median/p50: median
p#: #th percentile

For example: tabstat [varlist], stat(count mean sd median min max range) col(stat)

3. Exporting descriptive statistics to Word or Excel

Descriptive statistics produced with sum: logout, save(miaoshutongji) word replace: sum

Descriptive statistics produced with tabstat: logout, save(miaoshutongji) word replace: tabstat [varlist], stat(count mean sd median min max range) col(stat)

Descriptive statistics by group: prefix the command with bysort var:
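The summarize, tabstat and bysort forms described in this section can be tried on the auto dataset that ships with Stata. This is only a sketch; the choice of dataset and variables (price, mpg, weight, foreign) is an assumption made for illustration.

```stata
* Descriptive statistics on the auto example dataset
sysuse auto, clear

* Basic table: observations, mean, standard deviation, min, max
summarize price mpg weight

* Detailed statistics (percentiles, skewness, kurtosis) for one variable
summarize price, detail
display "median price = " r(p50)

* Chosen statistics, one variable per row and one statistic per column
tabstat price mpg weight, stat(count mean sd median min max range) col(stat)

* The same kind of summary by group
bysort foreign: summarize price mpg
tabstat price mpg, by(foreign) stat(mean sd)
```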

III. Correlation analysis

(A) Correlation analysis

1. Pearson correlation coefficient. Syntax: correlate (abbreviated cor or corr) [varlist] [if] [in] [weight] [, options]

2. Spearman correlation coefficient. Syntax: spearman [varlist], stats(rho p)

3. In Stata, corr computes the covariance matrix or the correlation matrix for a set of variables.

4. pwcorr computes the pairwise correlation coefficients among a set of variables and can also test their significance. Adding the sig option displays the significance levels: pwcorr [varlist], sig

5. pcorr computes partial correlation coefficients among a set of variables, together with significance tests.

6. Command that reports Spearman and Pearson correlations in a single table: corrtbl [varlist], corrvars([varlist])

In the output, the upper triangle holds the Spearman coefficients and their significance levels, and the lower triangle holds the Pearson coefficients and their significance levels.

(B) Exporting the correlation table to Word or Excel

For example: logout, save(mytable) word replace: pwcorr_a price mpg rep78 headroom trunk, star1(0.01) star5(0.05) star10(0.1)
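The built-in correlation commands listed in this section can be compared side by side. The sketch below assumes the auto example dataset; corrtbl, pwcorr_a and logout are community-contributed commands and are left out so that the block runs on a plain installation.

```stata
sysuse auto, clear

* Pearson correlation matrix (observations with any missing value are dropped)
correlate price mpg weight

* Covariance matrix instead of correlations
correlate price mpg weight, covariance

* Pairwise correlations with significance levels
pwcorr price mpg weight rep78, sig

* Spearman rank correlations with p-values
spearman price mpg weight, stats(rho p)

* Partial correlations of price with each of the other variables
pcorr price mpg weight
```

Note that correlate uses only observations complete on all listed variables, while pwcorr uses every available pair, so the two can differ when a variable such as rep78 has missing values.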

IV. Single-equation linear regression on cross-sectional data in Stata

Syntax: regress (abbreviated reg) depvar indepvars [if] [in] [weight] [, options]
(depvar is the dependent variable; indepvars are the independent variables)
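As a minimal illustration of this syntax, the sketch below regresses price on mpg and weight in the auto example dataset; the model itself is an arbitrary choice made for demonstration.

```stata
sysuse auto, clear

* Regress price on mileage and weight
regress price mpg weight

* Coefficients and fit statistics are available after estimation
display "coefficient on mpg = " _b[mpg]
display "R-squared          = " e(r2)

* Restrict the estimation sample with an if condition
regress price mpg weight if foreign == 0
```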

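V. Testing for and handling heteroskedasticity

1. Command for testing heteroskedasticity: hettest, run after the regression (estat hettest in current Stata versions).

2. Criterion for judging heteroskedasticity:

Look at the p-value. If it is below 0.05, the null hypothesis of constant variance is rejected and heteroskedasticity cannot be ruled out. In the example output (figure not included), the p-value is 0.4584 > 0.05, so heteroskedasticity can be ruled out.

3. Handling heteroskedasticity: add ", r" or ", robust" after the reg command. A regression with robust standard errors does not display the adjusted R2 (adj-R2); to see it, enter the following command afterwards: di e(r2_a)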
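The test-then-correct workflow described above might look as follows. This is a sketch on the auto example dataset, with an arbitrary model chosen for illustration.

```stata
sysuse auto, clear
regress price mpg weight

* Breusch-Pagan / Cook-Weisberg test; H0: constant variance
estat hettest
display "p-value = " r(p)

* Re-estimate with heteroskedasticity-robust standard errors
regress price mpg weight, robust

* Adjusted R-squared is not shown in the robust output but is still stored
display "adjusted R2 = " e(r2_a)
```

The robust option changes only the standard errors; the coefficient estimates are the same as in the first regression.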

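VI. Multicollinearity (high correlation among the independent variables). Command: vif, run after the regression (estat vif in current Stata versions)

(A) Criteria for judging multicollinearity (both must hold):
1. The largest VIF is greater than 10;
2. The mean VIF is greater than 1.

(B) Correcting multicollinearity

1. Use stepwise regression. Syntax: sw reg depvar indepvar, pr(0.05) (in current Stata versions: stepwise, pr(0.05): regress depvar indepvars)

2. For models with a quadratic term, use "centering", which keeps the quadratic term while mitigating multicollinearity to some extent. First define two variables, the variable minus its mean and the square of that centered variable:

    sum var
    gen var1 = var - r(mean)
    gen var2 = var1^2

Then run the regression with the new variables in place of the original ones.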
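The effect of centering can be checked by comparing VIFs before and after. The sketch below assumes the auto example dataset and uses weight as the variable with a quadratic term; the names weight2, weight_c and weight_c2 are illustrative.

```stata
sysuse auto, clear

* Raw quadratic: weight and its square are very highly correlated
generate double weight2 = weight^2
regress price weight weight2
estat vif

* Centered quadratic: subtract the mean, then square the centered variable
summarize weight
generate double weight_c  = weight - r(mean)
generate double weight_c2 = weight_c^2
regress price weight_c weight_c2
estat vif
```

Centering leaves the fitted values and the coefficient on the squared term unchanged; only the linear coefficient and the collinearity diagnostics change.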

VII. Testing for and handling endogeneity (endogeneity means that a regressor is correlated with the error term)

1. Testing for endogeneity: ovtest (the Ramsey RESET test, which checks for omitted variables)

Look at the p-value. If it is below 0.05, endogeneity cannot be ruled out. In the example output (figure not included), the p-value is 0.4717 > 0.05, so endogeneity can be ruled out.

2. Handling endogeneity: use instrumental variables with ivreg (superseded by ivregress, which the example below uses).

The three sources of endogeneity are measurement error, omitted variables, and two-way causality.

1. Endogeneity of the regressor.

This cannot be tested on its own. When suitable instruments are available it can be tested with the Hausman test.

2. Exogeneity of the instruments.

This cannot be tested directly either. When there are more instruments than endogenous regressors, one can test whether some of them are not exogenous; this is the "overidentification" test.

3. Relevance of the instruments.

This is the "weak instruments" problem. It can be checked with the first-stage F statistic, and also with the partial R2.

4. Estimation methods

Stata offers several estimators: 2sls, 2sls with the small option, liml and gmm. small suits small samples, liml suits weak instruments, and gmm suits heteroskedasticity.

[Example]

    webuse hsng2

    * Fit a regression via 2SLS, requesting small-sample statistics
    ivregress 2sls rent pcturban (hsngval = faminc i.region), small

    * Fit a regression using the LIML estimator
    ivregress liml rent pcturban (hsngval = faminc i.region)

    * Fit a regression via GMM using the default heteroskedasticity-robust weight matrix
    ivregress gmm rent pcturban (hsngval = faminc i.region)

    * Fit a regression via GMM using a heteroskedasticity-robust weight matrix, requesting nonrobust standard errors
    ivregress gmm rent pcturban (hsngval = faminc i.region), vce(unadjusted)

    * Tests
    estat firststage, all forcenonrobust   // first-stage F statistic and partial R2
    estat overid                           // test of overidentifying restrictions
    estat endogenous                       // test whether hsngval is endogenous

    regress rent pcturban hsngval
    est store m1

    ivregress 2sls rent pcturban (hsngval = faminc i.region)
    est store m2

    hausman m2 m1                          // endogeneity test: consistent (IV) estimates first, efficient (OLS) second

VIII. Regression analysis of systems of linear equations

Syntax: sureg (depvar1 varlist1) (depvar2 varlist2) … (depvarN varlistN) [if] [in] [weight]
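A small instance of this syntax, sketched on the auto example dataset. The three equations and their regressors are chosen only to show the form of the command, not as a meaningful model.

```stata
sysuse auto, clear

* Three equations estimated jointly by seemingly unrelated regression.
* The corr option also reports the cross-equation residual correlations
* and the Breusch-Pagan test of independence.
sureg (price foreign weight length) ///
      (mpg foreign weight)          ///
      (displacement foreign weight), corr
```

The /// line continuations work in a do-file; typed interactively, the command goes on a single line.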

IX. Simultaneous equations

Syntax: reg3 (depvar1 varlist1) (depvar2 varlist2) … (depvarN varlistN) [if] [in] [weight]
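The sketch below shows the form of a two-equation reg3 call on the auto example dataset. The system is hypothetical and has no economic interpretation: price and mpg are treated as jointly determined, and weight, length and foreign as exogenous.

```stata
sysuse auto, clear

* Two-equation system estimated by three-stage least squares.
* By default reg3 treats every dependent variable (price, mpg) as endogenous
* and all other variables as exogenous instruments.
reg3 (price mpg weight) (mpg price length foreign)
```

Each equation excludes at least one exogenous variable that appears in the other, which is what allows the endogenous right-hand-side variable in each equation to be instrumented.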

X. Fixed effects and random effects for panel data (declare the panel structure with xtset first)

Fixed-effects syntax: xtreg depvar indepvars [if] [in], fe [FE_options]
Random-effects syntax: xtreg depvar indepvars [if] [in], re [RE_options]

Fixed effects or random effects? Decide with the Hausman test.

[Example]

    xtreg y var1 var2 var3, fe
    est store fe

    xtreg y var1 var2 var3, re
    est store re

    hausman fe re, sigmamore
    hausman fe re, sigmaless

    * sigmamore uses the variance from the efficient estimator, i.e. re
    * sigmaless uses the variance from the consistent estimator, i.e. fe
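To run the whole sequence end to end without an external file, the sketch below simulates a small panel. Everything about the data (50 units, 6 periods, the variable names id, t, x, y, u, and the data-generating process) is an assumption made for illustration.

```stata
* Simulated panel: 50 units observed for 6 periods each
clear
set seed 12345
set obs 50
generate id = _n
* u is the unit-specific effect
generate u = rnormal()
expand 6
bysort id: generate t = _n
* x is correlated with the unit effect, so RE is inconsistent by construction
generate x = rnormal() + 0.5*u
generate y = 1 + 2*x + u + rnormal()

* Declare the panel structure, then estimate and compare
xtset id t
xtreg y x, fe
estimates store fe
xtreg y x, re
estimates store re
hausman fe re, sigmamore
```

In hausman, the consistent estimator (fe) is listed first and the efficient one (re) second. A small p-value favors fixed effects.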

XI. Exporting Stata regression results

1. In the Command window, type ssc install estout to install esttab (esttab is part of the estout package).

2. Run the regression with reg.

3. esttab using filename.rtf writes the regression results in a format Word can open; changing the extension to .xls or .csv writes them for Excel. The output contains the variable names, the corresponding regression coefficients, the t statistics and the significance stars. The default significance levels are 0.001, 0.01 and 0.05. To change them to 0.01, 0.05 and 0.1, enter: esttab m1 m2 using aaa.rtf, star(* 0.10 ** 0.05 *** 0.01)

4. Exporting several regressions at once: after running each regression, store it with est store m1. The name m1 is yours to choose; here the first model is called m1, the second is stored with est store m2, and so on. Finally run: esttab m1 m2 ... using test.rtf

    esttab m11111 using aaaaa.rtf, star(* 0.10 ** 0.05 *** 0.01) b(%6.4f)

5. outreg2 can export regression results to Word, Excel, LaTeX and other formats, and the layout can be adjusted as needed:

    ssc install outreg2
    sysuse auto, clear
    reg [varlist]
    est store m1
    outreg2 [m1] using test.doc, replace

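XII. Merging observations (collapsing several observations that share the same key into one)

Syntax: duplicates drop varlist, force

For example, to treat several M&A deals by the same firm on the same day as a single event, use the security code and the announcement date as the key: duplicates drop company_id event_date, force. This keeps the first observation for each key and drops the others.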
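A tiny worked case makes the behavior concrete. The data below are made up for illustration, and deal_value is a hypothetical extra variable that is not in the original text.

```stata
* Four deals; firm 1 has two deals on the same day
clear
input company_id str10 event_date deal_value
1 "2020-01-05" 10
1 "2020-01-05" 20
1 "2020-03-01" 5
2 "2020-01-05" 7
end

* Report how many observations share the same key
duplicates report company_id event_date

* Keep one observation per (company_id, event_date)
duplicates drop company_id event_date, force
list, clean
```

Three observations remain. The dropped deal's value is discarded rather than added to the surviving row; if the values should be aggregated instead, collapse (for example collapse (sum) deal_value, by(company_id event_date)) is the usual tool.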

XIII. t test for means

Syntax: ttest CAR1 == CAR2, unpaired
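The two-variable unpaired form can be tried on the bpwide example dataset that ships with Stata. This is purely a syntax illustration: bp_before and bp_after are measured on the same patients, so a paired test would normally be the appropriate choice for these data.

```stata
sysuse bpwide, clear

* Two variables treated as independent samples
ttest bp_before == bp_after, unpaired

* Same comparison without assuming equal variances
ttest bp_before == bp_after, unpaired unequal

* Two-sided p-value of the last test
display "p = " r(p)
```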

XIV. Z test for medians (nonparametric Wilcoxon rank-sum test)

Syntax: ranksum var, by(groupvar)
groupvar is the grouping variable.
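A minimal sketch of the rank-sum test, assuming the auto example dataset with foreign as the two-level grouping variable:

```stata
sysuse auto, clear

* Compare the distribution of mpg between domestic and foreign cars
ranksum mpg, by(foreign)
display "z = " r(z)
```

The grouping variable must take exactly two values in the estimation sample.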