python - Python连接MYSQL,获取SP500符号

标签 python mysql web-scraping beautifulsoup

代码如下:

#!/usr/bin/python
# -*- coding: utf-8 -*-

# insert_symbols.py
#table name should be 's_master', password should be '*******', user name should be 's_user'

from __future__ import print_function

import datetime
from math import ceil

import bs4
import MySQLdb as mdb
import requests


def obtain_parse_wiki_snp500():
    """
    Download and parse the Wikipedia list of S&P500 
    constituents using requests and BeautifulSoup.

    Returns a list of tuples for to add to MySQL.
    """
    # Stores the current time, for the created_at record
    now = datetime.datetime.utcnow()

    # Use requests and BeautifulSoup to download the 
    # list of S&P500 companies and obtain the symbol table
    response = requests.get(
        "http://en.wikipedia.org/wiki/List_of_S%26P_500_companies"
    )
   # soup = bs4.BeautifulSoup(response.text) ///this is the original version, i replaced with the code below
    soup = bs4.BeautifulSoup(response.text, "html.parser")

    # This selects the first table, using CSS Selector syntax
    # and then ignores the header row ([1:])
    symbolslist = soup.select('table')[0].select('tr')[1:]

    # Obtain the symbol information for each 
    # row in the S&P500 constituent table
    symbols = []
    for i, symbol in enumerate(symbolslist):
        tds = symbol.select('td')
        symbols.append(
            (
                tds[0].select('a')[0].text,  # Ticker
                'stock', 
                tds[1].select('a')[0].text,  # Name
                tds[3].text,  # Sector
                'USD', now, now
            ) 
        )
    return symbols


def insert_snp500_symbols(symbols):
    """
    Insert the S&P500 symbols into the MySQL database.
    """
    # Connect to the MySQL instance
    db_host = 'localhost'
    db_user = 's_user'
    db_pass = '*******'
    db_name = 's_master'
    con = mdb.connect(
        host=db_host, user=db_user, passwd=db_pass, db=db_name
    )

    # Create the insert strings
    column_str = """ticker, instrument, name, sector, 
                 currency, created_date, last_updated_date
                 """
    insert_str = ("%s, " * 7)[:-2]
    final_str = "INSERT INTO symbol (%s) VALUES (%s)" % \
        (column_str, insert_str)

    # Using the MySQL connection, carry out 
    # an INSERT INTO for every symbol
    with con: 
        cur = con.cursor()
        cur.executemany(final_str, symbols)


if __name__ == "__main__":
    symbols = obtain_parse_wiki_snp500()
    insert_snp500_symbols(symbols)
    print("%s symbols were successfully added." % len(symbols))

我收到的错误消息如下:

Traceback (most recent call last):
  File "insert_symbols.py", line 86, in <module>
    insert_snp500_symbols(symbols)
  File "insert_symbols.py", line 81, in insert_snp500_symbols
    cur.executemany(final_str, symbols)
  File "/home/haolun/.local/lib/python2.7/site-packages/MySQLdb/cursors.py", line 285, in executemany
    r = self._query(qs)
  File "/home/haolun/.local/lib/python2.7/site-packages/MySQLdb/cursors.py", line 378, in _query
    rowcount = self._do_query(q)
  File "/home/haolun/.local/lib/python2.7/site-packages/MySQLdb/cursors.py", line 341, in _do_query
    db.query(q)
  File "/home/haolun/.local/lib/python2.7/site-packages/MySQLdb/connections.py", line 280, in query
    _mysql.connection.query(self, query)
_mysql_exceptions.ProgrammingError: (1146, "Table 's_master.symbol' doesn't exist")

我不知道为什么我收到这个 _"mysql_exceptions.ProgrammingError: (1146, "Table 's_master.symbol' does not Existence") 消息。我确实创建了 s_master 表。beautifulsoup 有什么问题吗赞扬还是其他?

最佳答案

我认为你没有正确创建表。我编写了一个sql文件init.sql,并使用以下命令将其导入到mysql数据库:

mysql -u root -p < init.sql

这是init.sql的内容

create database if not exists s_master default charset utf8 collate utf8_unicode_ci;
-- chage the password to 123456
grant all on s_master.* to s_user@'localhost' identified by '123456';
use s_master;
create table if not exists symbol(
    ticker varchar(40),
    instrument varchar(40),
    name varchar(40),
    sector varchar(40),
    currency varchar(40),
    created_date datetime,
    last_updated_date datetime
);

之后我能够正确运行你的程序,程序的结果是:

➜ /tmp [sof]$ python yourprogram.py
(u'MMM', 'stock', u'3M Company', u'Industrials', 'USD', datetime.datetime(2016, 9, 26, 5, 24, 11, 62336), datetime.datetime(2016, 9, 26, 5, 24, 11, 62336))
505 symbols were successfully added.
➜ /tmp [sof]$ 

关于python - Python连接MYSQL,获取SP500符号,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/39694339/

相关文章:

perl - 使用Perl抓取网站

php - CasperJS 将数据传回 PHP

python - 使用 pandas read_csv 函数将 'true' 保留为字符串,将 'True' 保留为 bool 值

python - 如何知道QTableWidget的单元格是否正在被编辑?

MySQL 查询更新除最新行之外的所有 "duplicate"行

mysql - key '...' 的重复条目 'PRIMARY'

javascript - 使用 Node.js MySQL 库的正确方法是什么?

python - 我想使用 scrapy python 单击网站链接

python - 根据下拉列表和日期选择器为某个唯一值在破折号图中创建图表

Python、ProjectEuler、优化使代码变慢