python - How can I optimize performance when serializing a large number of GeoDjango geometry fields?

Tags: python django gis geojson geodjango

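I am developing a GeoDjango application that uses the WorldBorder model provided in the tutorial. I have also created my own Region model that is tied to WorldBorder, so one WorldBorder/Country can have multiple Regions, each of which also has borders (a MultiPolygon field).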

I built an API for it with DRF, but it is very slow: loading all WorldBorders and Regions as GeoJSON takes 16 seconds. The returned JSON is 10 MB, though. Is that reasonable?

I even switched the serializer to serpy, which is much faster than the DRF GIS serializer, but it only gives a 10% performance improvement.

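Profiling shows that most of the time is spent in GIS functions that convert the data type from the database into a list of coordinates rather than WKT. If I use WKT, serialization is much faster (1.7 s versus 11.7 s; WKT is used only for the WorldBorder MultiPolygon, everything else is still GeoJSON).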
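One way to keep GeoJSON output while skipping the Python-side coordinate conversion is to have the database emit each geometry as GeoJSON text (PostGIS has ST_AsGeoJSON, which GeoDjango exposes as the `AsGeoJSON` database function) and splice that text into the response without parsing it. The following is only a sketch under that assumption: `build_feature_collection` and the shape of `rows` are hypothetical, and the geometry string stands in for what such a query would return.

```python
import json


def build_feature_collection(rows):
    """Assemble a GeoJSON FeatureCollection from pre-encoded geometries.

    `rows` is an iterable of (properties_dict, geometry_json_text) pairs.
    The geometry text is inserted verbatim, so Python never builds the
    nested coordinate lists.
    """
    features = []
    for properties, geometry_json in rows:
        features.append(
            '{"type": "Feature", "geometry": %s, "properties": %s}'
            % (geometry_json, json.dumps(properties))
        )
    return '{"type": "FeatureCollection", "features": [%s]}' % ", ".join(features)


# Stand-in for rows fetched from the database (hypothetical data shape).
rows = [
    (
        {"name": "Afghanistan", "iso2": "AF"},
        '{"type": "Point", "coordinates": [65.216, 33.677]},'.rstrip(","),
    ),
]
payload = build_feature_collection(rows)
```

The result is a plain string, so it would have to be returned through a raw HTTP response rather than through a serializer that expects Python objects.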

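I also tried compressing the MultiPolygon using ST_SimplifyVW with a low tolerance (0.005) to preserve accuracy, which brought the JSON size down to 1.7 MB. That makes the total load time 3.5 s. Of course, I can still look for the tolerance that best balances accuracy and speed.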
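For intuition about what the tolerance controls: ST_SimplifyVW uses the Visvalingam-Whyatt algorithm, which drops the vertex whose triangle with its two neighbours has the smallest area, and keeps doing so while that area is below the tolerance. Below is a naive pure-Python sketch of the idea. It is not how PostGIS implements it; `simplify_vw` is a made-up name, and it handles an open line rather than a polygon ring.

```python
def triangle_area(a, b, c):
    """Area of the triangle spanned by three (x, y) points."""
    return abs(
        (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    ) / 2.0


def simplify_vw(points, tolerance):
    """Naive O(n^2) Visvalingam-Whyatt simplification of an open line.

    Repeatedly removes the interior vertex with the smallest triangle
    area until that area is at least `tolerance`. Endpoints are kept.
    """
    pts = list(points)
    while len(pts) > 2:
        areas = [
            triangle_area(pts[i - 1], pts[i], pts[i + 1])
            for i in range(1, len(pts) - 1)
        ]
        smallest = min(areas)
        if smallest >= tolerance:
            break
        # areas[k] belongs to the interior vertex pts[k + 1].
        del pts[areas.index(smallest) + 1]
    return pts


line = [(0.0, 0.0), (1.0, 0.01), (2.0, 0.0), (3.0, 2.0), (4.0, 0.0)]
simplified = simplify_vw(line, tolerance=0.05)
```

Here the nearly collinear vertex at (1.0, 0.01) is removed while the pronounced peak at (3.0, 2.0) survives, which is why a larger tolerance shrinks the payload at the cost of border detail.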

Below is the profiling data (the sudden increase in queries for the simplified MultiPolygon is due to my misuse of the Django QuerySet API to get ST_SimplifyVW):

[image: profiling data]

Edit: I fixed the database queries, so the number of queries now stays constant at 75. As expected, this did not improve performance significantly.

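Edit: I kept improving my database queries and have now reduced them to just 8. As expected, this did not improve performance much either.

[image: profiling data after reducing the queries]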

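Below is the profile of the function calls. I have highlighted the parts that take most of the time. This is with the vanilla DRF GIS implementation.

[image: function call profile, vanilla DRF GIS]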

Below is the profile when I use WKT for one of the MultiPolygon fields, without ST_SimplifyVW.

[image: function call profile, WKT for one MultiPolygon field]

Here are the models, as requested by @Udi:

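from django.contrib.gis.db import models
from django.contrib.gis.geos import Point
from django.core.exceptions import ValidationError

# TimeStampedModel, RegionManager and DefaultSelectOrPrefetchManager are
# defined elsewhere in the project and are not shown here.


class WorldBorderQueryset(models.query.QuerySet):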
    def simplified(self, tolerance):
        sql = "SELECT ST_SimplifyVW(mpoly, %s) AS mpoly"
        return self.extra(
            select={'mpoly': sql},
            select_params=(tolerance,)
        )


class WorldBorderManager(models.Manager):
    def get_by_natural_key(self, name, iso2):
        return self.get(name=name, iso2=iso2)

    def get_queryset(self, *args, **kwargs):
        qs = WorldBorderQueryset(self.model, using=self._db)
        qs = qs.prefetch_related('regions',)
        return qs

    def simplified(self, level):
        return self.get_queryset().simplified(level)


class WorldBorder(TimeStampedModel):
    name = models.CharField(max_length=50)
    area = models.IntegerField(null=True, blank=True)
    pop2005 = models.IntegerField('Population 2005', default=0)
    fips = models.CharField('FIPS Code', max_length=2, null=True, blank=True)
    iso2 = models.CharField('2 Digit ISO', max_length=2, null=True, blank=True)
    iso3 = models.CharField('3 Digit ISO', max_length=3, null=True, blank=True)
    un = models.IntegerField('United Nations Code', null=True, blank=True)
    region = models.IntegerField('Region Code', null=True, blank=True)
    subregion = models.IntegerField('Sub-Region Code', null=True, blank=True)
    lon = models.FloatField(null=True, blank=True)
    lat = models.FloatField(null=True, blank=True)

    # generated from lon lat to be one field so that it can be easily
    # edited in admin
    center_coordinates = models.PointField(blank=True, null=True)

    mpoly = models.MultiPolygonField(help_text='Borders')

    objects = WorldBorderManager()

    def save(self, *args, **kwargs):
        if not self.center_coordinates:
            self.center_coordinates = Point(x=self.lon, y=self.lat)
        super().save(*args, **kwargs)

    def natural_key(self):
        return self.name, self.iso2

    def __str__(self):
        return self.name

    class Meta:
        verbose_name = 'Country'
        verbose_name_plural = 'Countries'
        ordering = ('name',)


class Region(TimeStampedModel):
    name = models.CharField(max_length=100, unique=True)
    country = models.ForeignKey(WorldBorder, related_name='regions')
    mpoly = models.MultiPolygonField(help_text='Areas')
    center_coordinates = models.PointField()

    moment_category = models.ForeignKey('moment.MomentCategory',
                                        blank=True, null=True)

    objects = RegionManager()
    no_joins = models.Manager()

    def natural_key(self):
        return (self.name,)

    def __str__(self):
        return self.name


# TODO might want to have separate table for ActiveCity for performance
# improvement since we have like 50k cities
class City(TimeStampedModel):
    country = models.ForeignKey(WorldBorder, on_delete=models.PROTECT,
                                related_name='cities')
    region = models.ForeignKey(Region, blank=True, null=True,
                               related_name='cities',
                               on_delete=models.SET_NULL)

    name = models.CharField(max_length=255)
    accent_city = models.CharField(max_length=255)
    population = models.IntegerField(blank=True, null=True)
    is_capital = models.BooleanField(default=False)

    center_coordinates = models.PointField()

    # is active marks that this city is a destination
    # only cities with is_active True will be put up to the frontend
    is_active = models.BooleanField(default=False)

    objects = DefaultSelectOrPrefetchManager(
        prefetch_related=(
            'yes_moment_beacons__activity__verb',
            'social_beacons',
            'video_beacons'
        ),
        select_related=('region', 'country')
    )
    no_joins = models.Manager()

    def natural_key(self):
        return (self.name,)

    def __str__(self):
        return self.name

    class Meta:
        verbose_name_plural = 'Cities'

class Beacon(TimeStampedModel):
    # if null defaults to city center coordinates
    coordinates = models.PointField(blank=True, null=True)
    is_fake = models.BooleanField(default=False)

    # can use city here, but the %(class)s gives no space between words
    # and it looks ugly

    def validate_activity(self):
        # activities in the region
        activities = self.city.region.moment_category.activities.all()
        if self.activity not in activities:
            raise ValidationError('Activity is not in the Region')

    def clean(self):
        self.validate_activity()

    def save(self, *args, **kwargs):
        # doing a full clean is needed here is to ensure code correctness
        # (not user),
        # because if someone use objects.create, clean() will never get called,
        # cons is validation will be done twice if the object is
        # created e.g. from admin
        self.full_clean()

        if not self.coordinates:
            self.coordinates = self.city.center_coordinates
        super().save(*args, **kwargs)

    class Meta:
        abstract = True


class YesMomentBeacon(Beacon):
    activity = models.ForeignKey('moment.Activity',
                                 on_delete=models.CASCADE,
                                 related_name='yes_moment_beacons')
    # ..........
    # other fields

    city = models.ForeignKey('world.City', related_name='yes_moment_beacons')

    objects = DefaultSelectOrPrefetchManager(
        select_related=('activity__verb',)
    )

    def __str__(self):
        return '{} - {}'.format(self.activity, self.coordinates)

# other beacon types.......

Here are my serializers, as requested by @Udi:

class RegionInWorldSerializer(GeoFeatureModelSerializer):
    yes_moment_beacons = serializers.SerializerMethodField()
    social_beacons = serializers.SerializerMethodField()
    video_beacons = serializers.SerializerMethodField()

    center_coordinates = GeometrySerializerMethodField()

    def get_center_coordinates(self, obj):
        return obj.center_coordinates

    def get_yes_moment_beacons(self, obj):
        count = 0

        # don't worry, it's already prefetched in the manager
        # (including the below methods) so len is used instead of count
        cities = obj.cities.all()

        for city in cities:
            beacons = city.yes_moment_beacons.all()
            count += len(beacons)
        return count

    def get_social_beacons(self, obj):
        count = 0

        cities = obj.cities.all()

        for city in cities:
            beacons = city.social_beacons.all()
            count += len(beacons)
        return count

    def get_video_beacons(self, obj):
        count = 0

        cities = obj.cities.all()

        for city in cities:
            beacons = city.video_beacons.all()
            count += len(beacons)
        return count

    class Meta:
        model = Region
        geo_field = 'center_coordinates'
        fields = ('name', 'yes_moment_beacons', 'video_beacons',
                  'social_beacons')


class WorldSerializer(GeoFeatureModelSerializer):
    center_coordinates = GeometrySerializerMethodField()

    regions = RegionInWorldSerializer(many=True, read_only=True)

    def get_center_coordinates(self, obj):
        return obj.center_coordinates

    class Meta:
        model = WorldBorder
        geo_field = 'mpoly'

        fields = ('name', 'iso2', 'center_coordinates', 'regions')

Here is the main query:

def get_queryset(self):
    tolerance = self.request.GET.get('tolerance', None)
    if tolerance is not None:
        tolerance = float(tolerance)
        return WorldBorder.objects.simplified(tolerance)
    else:
        return WorldBorder.objects.all()

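Here is part of the API response (1 of 236 objects) when using ST_SimplifyVW with a high tolerance. If I don't use it, Firefox hangs, I think because it cannot handle 10 MB of JSON. This particular country's border data is small compared with other countries. With ST_SimplifyVW, the JSON returned here shrinks from 10 MB to 750 KB. Even with only 750 KB of JSON, it takes 4.5 seconds on my local machine.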

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "coordinates": [
          [
            [
              [
                74.915741,
                37.237328
              ],
              [
                74.400543,
                37.138962
              ],
              [
                74.038315,
                36.814682
              ],
              [
                73.668304,
                36.909637
              ],
              [
                72.556641,
                36.821266
              ],
              [
                71.581131,
                36.346443
              ],
              [
                71.18779,
                36.039444
              ],
              [
                71.647766,
                35.419991
              ],
              [
                71.496094,
                34.959435
              ],
              [
                70.978592,
                34.504997
              ],
              [
                71.077209,
                34.052216
              ],
              [
                70.472214,
                33.944153
              ],
              [
                70.002777,
                34.052773
              ],
              [
                70.323318,
                33.327774
              ],
              [
                69.561096,
                33.08194
              ],
              [
                69.287491,
                32.526382
              ],
              [
                69.328247,
                31.940365
              ],
              [
                69.013885,
                31.648884
              ],
              [
                68.161102,
                31.830276
              ],
              [
                67.575546,
                31.53194
              ],
              [
                67.778046,
                31.332218
              ],
              [
                66.727768,
                31.214996
              ],
              [
                66.395538,
                30.94083
              ],
              [
                66.256653,
                29.85194
              ],
              [
                65.034149,
                29.541107
              ],
              [
                64.059143,
                29.41444
              ],
              [
                63.587212,
                29.503887
              ],
              [
                62.484436,
                29.406105
              ],
              [
                60.868599,
                29.863884
              ],
              [
                61.758331,
                30.790276
              ],
              [
                61.713608,
                31.383331
              ],
              [
                60.85305,
                31.494995
              ],
              [
                60.858887,
                32.217209
              ],
              [
                60.582497,
                33.066101
              ],
              [
                60.886383,
                33.557213
              ],
              [
                60.533882,
                33.635826
              ],
              [
                60.508331,
                34.140274
              ],
              [
                60.878876,
                34.319717
              ],
              [
                61.289162,
                35.626381
              ],
              [
                62.029716,
                35.448601
              ],
              [
                62.309158,
                35.141663
              ],
              [
                63.091934,
                35.432495
              ],
              [
                63.131378,
                35.865273
              ],
              [
                63.986107,
                36.038048
              ],
              [
                64.473877,
                36.255554
              ],
              [
                64.823044,
                37.138603
              ],
              [
                65.517487,
                37.247215
              ],
              [
                65.771927,
                37.537498
              ],
              [
                66.302765,
                37.323608
              ],
              [
                67.004166,
                37.38221
              ],
              [
                67.229431,
                37.191933
              ],
              [
                67.765823,
                37.215546
              ],
              [
                68.001389,
                36.936104
              ],
              [
                68.664154,
                37.274994
              ],
              [
                69.246643,
                37.094154
              ],
              [
                69.515823,
                37.580826
              ],
              [
                70.134995,
                37.529045
              ],
              [
                70.165543,
                37.871719
              ],
              [
                70.71138,
                38.409866
              ],
              [
                70.97998,
                38.470459
              ],
              [
                71.591934,
                37.902618
              ],
              [
                71.429428,
                37.075829
              ],
              [
                71.842758,
                36.692101
              ],
              [
                72.658508,
                37.021202
              ],
              [
                73.307205,
                37.462753
              ],
              [
                73.819717,
                37.228058
              ],
              [
                74.247208,
                37.409546
              ],
              [
                74.915741,
                37.237328
              ]
            ]
          ]
        ],
        "type": "MultiPolygon"
      },
      "properties": {
        "name": "Afghanistan",
        "iso2": "AF",
        "center_coordinates": {
          "coordinates": [
            65.216,
            33.677
          ],
          "type": "Point"
        },
        "regions": {
          "type": "FeatureCollection",
          "features": [
            {
              "type": "Feature",
              "geometry": {
                "coordinates": [
                  66.75292967820785,
                  34.52466146754814
                ],
                "type": "Point"
              },
              "properties": {
                "name": "Central Afghanistan",
                "yes_moment_beacons": 0,
                "video_beacons": 0,
                "social_beacons": 0
              }
            },
            {
              "type": "Feature",
              "geometry": {
                "coordinates": [
                  69.69726561529792,
                  35.96022296494905
                ],
                "type": "Point"
              },
              "properties": {
                "name": "Northern Highlands",
                "yes_moment_beacons": 0,
                "video_beacons": 0,
                "social_beacons": 0
              }
            },
            {
              "type": "Feature",
              "geometry": {
                "coordinates": [
                  63.89541422401191,
                  32.27442932956255
                ],
                "type": "Point"
              },
              "properties": {
                "name": "Southwestern Afghanistan",
                "yes_moment_beacons": 0,
                "video_beacons": 0,
                "social_beacons": 0
              }
            }
          ]
        }
      }
    },
    ........
}

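So the question is: is GeoDjango not as fast as I expected, or are these performance numbers to be expected? What can I do to improve performance while still outputting GeoJSON, i.e. not WKT? Is fine-tuning the tolerance the only way? I might also split off a separate endpoint for fetching the Regions.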

Best answer

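Since your geographic data does not change often, try caching all region/country polygons as precomputed GeoJSON. That is, create a /country/123.geojson API call or static file containing the geographic data for all regions of that country, possibly simplified ahead of time.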
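A minimal sketch of the static-file variant, assuming one file per country named after its ID. The helper names, the directory layout and the tiny stand-in geometry are all hypothetical; in practice the FeatureCollection would come from the existing serializers, run once whenever the borders change.

```python
import json
import tempfile
from pathlib import Path


def write_country_geojson(cache_dir, country_id, feature_collection):
    """Serialize one country's FeatureCollection to <cache_dir>/<id>.geojson."""
    path = Path(cache_dir) / f"{country_id}.geojson"
    path.write_text(json.dumps(feature_collection), encoding="utf-8")
    return path


def read_country_geojson(cache_dir, country_id):
    """Return the cached GeoJSON text, or None if it was never generated."""
    path = Path(cache_dir) / f"{country_id}.geojson"
    if not path.exists():
        return None
    return path.read_text(encoding="utf-8")


# Demo with a throwaway directory and a tiny stand-in geometry.
cache_dir = tempfile.mkdtemp()
write_country_geojson(
    cache_dir,
    123,
    {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "geometry": {"type": "Point", "coordinates": [65.216, 33.677]},
                "properties": {"name": "Afghanistan"},
            }
        ],
    },
)
cached = read_country_geojson(cache_dir, 123)
missing = read_country_geojson(cache_dir, 999)
```

Because the cached text is already valid GeoJSON, it can be served as-is by the web server or returned from a view without touching the geometry code again.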

Your other API calls should return only numeric data, without the geographic polygons, leaving the task of combining them to the client.
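The combining step on the client could look roughly like this. It is written in Python for consistency with the rest of the post, although a browser client would do the same in JavaScript; `merge_counts` and the name-keyed `counts` mapping are assumptions about how the lightweight endpoint might be shaped.

```python
def merge_counts(feature_collection, counts_by_name):
    """Return a new FeatureCollection with numeric properties merged in.

    `counts_by_name` maps a feature's "name" property to a dict of
    numeric fields, as a geometry-free endpoint might return them.
    """
    merged = []
    for feature in feature_collection["features"]:
        name = feature["properties"]["name"]
        properties = {**feature["properties"], **counts_by_name.get(name, {})}
        merged.append({**feature, "properties": properties})
    return {"type": "FeatureCollection", "features": merged}


# Cached geometry (fetched once) plus fresh numbers (fetched often).
cached_regions = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [66.75, 34.52]},
            "properties": {"name": "Central Afghanistan"},
        }
    ],
}
counts = {"Central Afghanistan": {"yes_moment_beacons": 3, "video_beacons": 1}}
combined = merge_counts(cached_regions, counts)
```

The cached features are left unmodified, so the same geometry can be reused each time new counts arrive.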

Regarding "python - How can I optimize performance when serializing a large number of GeoDjango geometry fields?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48040545/
