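I'm pretty new to Stata.
I have a set of observations of the form "country, GDP, year". I want to create a new variable, GDP1960, that holds each country's 1960 GDP on every row for that country:
country  GDP    year         country  GDP    year  GDP1960
USA      $100m  1960         USA      $100m  1960  $100m
USA      $200m  1965   -->   USA      $200m  1965  $100m
Canada   $60m   1960         Canada   $60m   1960  $60m
What is the right syntax for this? (I assume egen is involved in some mysterious way.)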
Best answer
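You already have a solution that uses cond(), but here are a few suggestions that should make your data easier to model and help you avoid the problems that can arise from sorting on a "rank" variable you create yourself (the egen solution you asked about is included below):
Paste the code below into your do-file editor and run it: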
*---------------------------------BEGIN EXAMPLE
clear
inp str20 country str10 gdp year
"USA" "$100m" 1960
"USA" "$200m" 1965
"Canada" "$60m" 1960
"Canada" "$120m" 1965
"USA" "$250m" 1970
"Mexico" "$90m" 1970
"Canada" "$800m" 1970
"Mexico" "$160m" 1960
"Mexico" "$220m" 1965
"Mexico" "$350m" 1975
end
//1. destring gdp so that we can work with it
destring gdp, ignore("$ m") replace
//2. Create GDP for 1960 var:
bys country: g x = gdp if year==1960
bys country: egen gdp60 = max(x)
drop x
**you could also create balanced panels to see gaps in your data**
preserve
ssc install panels
panels country year
fillin country year
li //take a look at the results win. to see how filled panel data would look
restore
//3. create a gdp variable for each year (reshape the dataset)
drop gdp60
reshape wide gdp, i(country) j(year)
**much easier to use this format for modeling
su gdp1970
**here's a fake "outcome" or response variable to work with**
g outcome = 500+int((1000-500+1)*runiform())
anova outcome gdp1960-gdp1970 //or whatever makes sense for your situation
*---------------------------------END EXAMPLE
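The two-step approach in step 2 (a temporary variable x, then egen max()) can also be collapsed into a single egen call by putting cond() inside max(). The following is only a minimal sketch, not the code from the answer: it assumes gdp is already numeric, uses a small made-up subset of the example data, and the variable name gdp60 is simply carried over from the example above.

```stata
* Minimal sketch: per-country 1960 GDP in one egen call
* (assumes gdp is already numeric; toy data for illustration only)
clear
input str20 country gdp year
"USA"    100 1960
"USA"    200 1965
"Canada"  60 1960
"Canada" 120 1965
end

* cond() returns gdp on 1960 rows and missing on all other rows;
* egen's max() skips missing values, so every row of a country
* receives that country's 1960 value
egen gdp60 = max(cond(year == 1960, gdp, .)), by(country)

* Sanity check: on 1960 rows the new variable must equal gdp itself
assert gdp60 == gdp if year == 1960

list country year gdp gdp60, sepby(country)
```

If a country has no 1960 observation, gdp60 is missing for all of its rows, which is usually what you want to see rather than a silently wrong value.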
Source: the Stack Overflow question on creating a "GDP in 1960" variable in Stata from GDP values for different years: https://stackoverflow.com/questions/2745202/