mysql - Instagram API: storing images from a tag in MySQL

Tags: mysql json api instagram

I am trying to store all the images for a specific Instagram tag in my database. My code calls the API and loads the first batch of images:

$json =  file_get_contents('https://api.instagram.com/v1/tags/hahanotfunny/media/recent?client_id='. $client_id .'&max_tag_id='. $max_tag_id)  ;

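Once it has gone through the first page of the response, it checks whether there is another page by grabbing the link in the pagination section of the response. It also checks whether "max_tag_id" has a value and updates the "max" value in my database. The reason I keep a max value is that when I receive a "real-time" notification saying there are new images, I start downloading them from the last max time. However, my code has a problem. If we are on the last page of the response (no more pagination links), there is no "max_tag_id" variable, so the database is not updated. As a result, the next time my crawler runs, it starts from the last known "max_tag_id", which causes the images on the last page to be recorded in my database again as duplicates.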
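The last-page case described above can be illustrated with a minimal sketch. It assumes the decoded response looks like the one my code reads: the only field name taken from the real code is pagination->next_max_tag_id, while the helper name nextMaxTagId and the sample payloads are made up for illustration.

```php
<?php
// Sample payloads below are invented for illustration; only the field
// name pagination->next_max_tag_id comes from the code in this question.

/**
 * Return the cursor for the next page, or null on the last page.
 */
function nextMaxTagId(stdClass $response): ?string
{
    // isset() is false when pagination or next_max_tag_id is missing,
    // so the last page yields null instead of an undefined-property notice.
    if (isset($response->pagination->next_max_tag_id)) {
        return (string) $response->pagination->next_max_tag_id;
    }
    return null;
}

$middlePage = json_decode('{"pagination": {"next_max_tag_id": "1359000000"}, "data": []}');
$lastPage   = json_decode('{"pagination": {}, "data": []}');

var_dump(nextMaxTagId($middlePage)); // cursor present: keep paging
var_dump(nextMaxTagId($lastPage));   // no cursor: last page reached
```

When the helper returns null there is nothing new to write to the "max" column, which is exactly the situation in which the stored value goes stale.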

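So my question is: when I receive another "real-time" alert saying new images are available for a specific tag, how do I find them starting from the last record stored in my database?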

$dbConnection = new PDO('mysql:dbname=XXXXXXX;host=127.0.0.1;charset=utf8', 'XXXXXXXXX', 'XXXXX');
$dbConnection->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$dbConnection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

function getMax() {
global $dbConnection;

$tag = 'hahanotfunny';
$selectTotals = $dbConnection->prepare("SELECT * FROM instagram_time WHERE tag = :tag LIMIT 1");
$selectTotals->execute(array(':tag' => $tag));

$max = null; // stays null when no row exists for this tag
foreach ($selectTotals as $time) {
    $max = $time['max'];
}

return $max;
}

function updateMax($data) {
global $dbConnection;

$tag = 'hahanotfunny';
$selectTotals = $dbConnection->prepare("UPDATE instagram_time SET max = :maxid WHERE tag = :tag LIMIT 1");
$selectTotals->execute(array(':maxid' => $data, ':tag' => $tag));
}

function fetchData() {
global $dbConnection, $client_id;

$max_tag_id = getMax();
$json =  file_get_contents('https://api.instagram.com/v1/tags/hahanotfunny/media/recent?client_id='. $client_id .'&max_tag_id='. $max_tag_id)  ;
$data = json_decode($json);

$next_max = $data->pagination->next_max_tag_id;

foreach ($data->data as $insta) {
    echo '<br/><img  src="'.$insta->images->low_resolution->url.'"/>';

}

foreach ($data as $object) {
    if ( is_array( $object ) ) {
        foreach ( $object as $media ) {
            $url = $media->images->standard_resolution->url;
            $m_id = $media->id;
            $c_time = $media->created_time;
            $user = $media->user->username;
            $filter = $media->filter;
            $comments = $media->comments->count;
            $caption = $media->caption->text;
            $link = $media->link;
            $low_res=$media->images->low_resolution->url;
            $thumb=$media->images->thumbnail->url;
            $lat = $media->location->latitude;
            $long = $media->location->longitude;
            $loc_id = $media->location->id;

            $data = array(
                'media_id' => $m_id,
                'min_id' => $next_min_id, // $next_min_id is never assigned in this script, so it evaluates to null
                'url' => $url,
                'c_time' => $c_time,
                'user' => $user,
                'filter' => $filter,
                'comment_count' => $comments,
                'caption' => $caption,
                'link' => $link,
                'low_res' => $low_res,
                'thumb' => $thumb,
                'lat' => $lat,
                'long' => $long,
                'loc_id' => $loc_id,
            );

            $selectTotals = $dbConnection->prepare("INSERT INTO instagram_mg (media_id, min_id, url, c_time, user, filter, comment_count, caption, link, low_res_link, thumb, latitude, longitude, loc_id) VALUES (:mediaid, :minid, :url, :ctime, :user, :filter, :commentcount, :caption, :link, :lowreslink, :thumb, :latitude, :longitude, :locid)");

            $selectTotals->execute(array(':mediaid' => $data['media_id'], ':minid' => $data['min_id'], ':url' => $data['url'], ':ctime' => $data['c_time'], ':user' => $data['user'], ':filter' => $data['filter'], ':commentcount' => $data['comment_count'], ':caption' => $data['caption'], ':link' => $data['link'], ':lowreslink' => $data['low_res'], ':thumb' => $data['thumb'], ':latitude' => $data['lat'], ':longitude' => $data['long'], ':locid' => $data['loc_id']));


        }
    }
}


if (isset($next_max)) {
    echo $next_max . "<br/>";
    updateMax($next_max);
    fetchData();
} else {
    //$current_time = time();
    //updateMax($current_time); // I tried making the current time the "max_tag_id", but it wouldn't work.

}


} //fetchData()


fetchData();

Best answer

Personally, I would use a database trigger with an accompanying function to check for duplicates. From mysql.com:

A trigger is defined to activate when an INSERT, DELETE, or UPDATE statement executes for the associated table. A trigger can be set to activate either before or after the triggering statement. For example, you can have a trigger activate before each row that is inserted into a table or after each row that is updated.

For example:

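-- Sketch in MySQL 5.5+ syntax. A trigger cannot silently skip a row, so
-- instead of "returning false" this raises an error that aborts the INSERT.
-- DELIMITER is a mysql command-line client directive, not SQL.
DELIMITER //
CREATE TRIGGER insert_check BEFORE INSERT ON instagram_mg
FOR EACH ROW
BEGIN
    IF EXISTS (SELECT 1 FROM instagram_mg WHERE media_id = NEW.media_id) THEN
        SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Duplicate media_id';
    END IF;
END//
DELIMITER ;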

Sorry I can't be more specific. I mostly work with PostgreSQL functions, and I know the syntax differs slightly. Here is a link to the MySQL trigger syntax: http://dev.mysql.com/doc/refman/5.0/en/trigger-syntax.html
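The same duplicate check could also be sketched on the PHP side, before any row reaches the database. This is not the trigger approach itself but an application-level alternative: the helper name filterNewMedia is hypothetical, $items mimics the decoded "data" array from the question, and $knownIds stands in for media_id values that are already stored (for example, fetched with a SELECT on instagram_mg).

```php
<?php
// Hypothetical helper; sample ids are made up for illustration.

/**
 * Keep only media objects whose id is neither already stored
 * nor repeated earlier in the same batch.
 */
function filterNewMedia(array $items, array $knownIds): array
{
    $seen = array_flip($knownIds); // id => index, for constant-time lookups
    $fresh = [];
    foreach ($items as $media) {
        if (isset($seen[$media->id])) {
            continue; // already stored, or a repeat within this batch
        }
        $seen[$media->id] = true;
        $fresh[] = $media;
    }
    return $fresh;
}

$items = json_decode('[{"id": "a1"}, {"id": "b2"}, {"id": "a1"}, {"id": "c3"}]');
$new = filterNewMedia($items, ['b2']);

foreach ($new as $media) {
    echo $media->id, "\n";
}
```

A UNIQUE index on media_id would let MySQL itself reject duplicates as well, so a filter like this is best treated as a way to avoid pointless INSERT attempts, not as the only safeguard.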

Hope this helps. Cheers!

Regarding "mysql - Instagram API: storing images from a tag in MySQL", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/14545009/
