python - statsmodels: simulate data and run simple linear regression

Tags: python statsmodels

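I am new to the Python statsmodels package. I am trying to simulate some data that is linearly related to log(x) and run a simple linear regression using the statsmodels formula interface. Here is the code: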

import pandas as pd
import numpy as np
import statsmodels.formula.api as smf

B0 = 3
B1 = 0.5
x = np.linspace(10, 1e4, num = 1000)
epsilon = np.random.normal(0,3, size=1000)

y=B0 + B1*np.log(x)+epsilon
df1 = pd.DataFrame({'Y':y, 'X':x})

model = smf.OLS ('Y~np.log(X)', data=df1).fit()

I get the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-34-c0ab32ca2acf> in <module>()
      7 y=B0 + B1*np.log(X)+epsilon
      8 df1 = pd.DataFrame({'Y':y, 'X':X})
----> 9 smf.OLS ('Y~np.log(X)', data=df1)

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/regression/linear_model.py in __init__(self, endog, exog, missing, hasconst, **kwargs)
    689                  **kwargs):
    690         super(OLS, self).__init__(endog, exog, missing=missing,
--> 691                                   hasconst=hasconst, **kwargs)
    692         if "weights" in self._init_keys:
    693             self._init_keys.remove("weights")

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/regression/linear_model.py in __init__(self, endog, exog, weights, missing, hasconst, **kwargs)
    584             weights = weights.squeeze()
    585         super(WLS, self).__init__(endog, exog, missing=missing,
--> 586                                   weights=weights, hasconst=hasconst, **kwargs)
    587         nobs = self.exog.shape[0]
    588         weights = self.weights

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/regression/linear_model.py in __init__(self, endog, exog, **kwargs)
     89     """
     90     def __init__(self, endog, exog, **kwargs):
---> 91         super(RegressionModel, self).__init__(endog, exog, **kwargs)
     92         self._data_attr.extend(['pinv_wexog', 'wendog', 'wexog', 'weights'])
     93 

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/base/model.py in __init__(self, endog, exog, **kwargs)
    184 
    185     def __init__(self, endog, exog=None, **kwargs):
--> 186         super(LikelihoodModel, self).__init__(endog, exog, **kwargs)
    187         self.initialize()
    188 

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/base/model.py in __init__(self, endog, exog, **kwargs)
     58         hasconst = kwargs.pop('hasconst', None)
     59         self.data = self._handle_data(endog, exog, missing, hasconst,
---> 60                                       **kwargs)
     61         self.k_constant = self.data.k_constant
     62         self.exog = self.data.exog

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/base/model.py in _handle_data(self, endog, exog, missing, hasconst, **kwargs)
     82 
     83     def _handle_data(self, endog, exog, missing, hasconst, **kwargs):
---> 84         data = handle_data(endog, exog, missing, hasconst, **kwargs)
     85         # kwargs arrays could have changed, easier to just attach here
     86         for key in kwargs:

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/base/data.py in handle_data(endog, exog, missing, hasconst, **kwargs)
    562         exog = np.asarray(exog)
    563 
--> 564     klass = handle_data_class_factory(endog, exog)
    565     return klass(endog, exog=exog, missing=missing, hasconst=hasconst,
    566                  **kwargs)

/Users/tiger/anaconda/lib/python3.5/site-packages/statsmodels/base/data.py in handle_data_class_factory(endog, exog)
    551     else:
    552         raise ValueError('unrecognized data structures: %s / %s' %
--> 553                          (type(endog), type(exog)))
    554     return klass
    555 

ValueError: unrecognized data structures: <class 'str'> / <class 'NoneType'>

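I checked the documentation and everything looks correct. I have spent a long time trying to understand why I get this error but cannot figure it out. Any help is much appreciated.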

Best answer

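In statsmodels.formula.api, the ols function is lowercase. In statsmodels.api, OLS is all uppercase. The uppercase OLS is the model class, which takes endog and exog arrays rather than a formula, so your formula string was passed as endog and exog was left as None. That is exactly what the ValueError reports: <class 'str'> / <class 'NoneType'>. In your case, you need: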

model = smf.ols('Y~np.log(X)', data=df1).fit()

This question originally appeared on Stack Overflow: https://stackoverflow.com/questions/42005041/
