python - Is there a standard PTS reader or parser for Python?

I have the following file:

version: 1
n_points:  68
{
55.866278 286.258077
54.784191 315.123248
62.148364 348.908294
83.264019 377.625584
102.690421 403.808995
125.495327 438.438668
140.698598 471.379089
158.435748 501.785631
184.471278 511.002579
225.857960 504.171628
264.555990 477.159805
298.168768 447.523374
332.502678 411.220089
350.641672 372.839985
355.004106 324.781552
349.265206 270.707703
338.314674 224.205227
33.431075 238.262266
42.204378 227.503948
53.939564 227.904931
68.298209 232.202002
82.271511 239.951519
129.480996 229.905585
157.960824 211.545631
189.465597 204.068108
220.288164 208.206246
249.905282 218.863196
110.089281 266.422557
108.368067 298.896910
105.018473 331.956957
102.889410 363.542719
101.713553 379.256535
114.636047 383.331785
129.543556 384.250352
140.033133 375.640569
152.523364 366.956846
60.326871 270.980865
67.198221 257.376350
92.335775 259.211865
102.394658 274.137548
86.227917 277.162353
68.397650 277.343621
165.340638 263.379230
173.385917 246.412765
198.024842 240.895985
223.488685 247.333206
207.218336 260.967007
184.619159 265.379884
122.903148 418.405102
114.539655 407.643816
123.642553 404.120397
136.821841 407.806210
149.926926 403.069590
196.680098 399.302500
221.946232 394.444167
203.262878 417.808844
164.318232 440.472370
145.915650 444.015386
136.436942 442.897031
125.273506 429.073840
124.666341 420.331816
130.710965 421.709666
141.438004 423.161457
155.870784 418.844649
213.410389 396.978046
155.870784 418.844649
141.438004 423.161457
130.710965 421.709666
}

The file extension is .pts.

Is there a standard reader for this file?

The code I tried to read it with (downloaded from some GitHub repository) is

landmark = np.loadtxt(image_landmarks_path)

It fails with

{ValueError}could not convert string to float: 'version:'

which makes sense.

I can't change the file, so I wonder: do I have to write my own parser, or is this some kind of standard?

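Best answer

It appears to be a 2D point cloud file; I think it is called the Landmark PTS format. The closest Python reference I could find is a 3D-morphable face model-fitting library issue, which references a sample file that matches yours. Most .pts point cloud tools expect a 3D file, so they may not work out of the box with this one.

So no, there doesn't appear to be a standard reader. The closest thing I found to a library that reads the format is this GitHub repository, but it has a drawback: it reads all the data into memory before manually parsing it into Python float values.

However, the format is very simple (as the referenced issue notes), so you can read the data with numpy.loadtxt() alone. The simple approach is to name all those non-data lines as comments: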

import numpy as np

def read_pts(filename):
    return np.loadtxt(filename, comments=("version:", "n_points:", "{", "}"))
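To see the comment-marker trick in action without touching the disk, here is a minimal sketch that feeds a shortened, three-point sample (the first three points from the question) to np.loadtxt() through io.StringIO. The SAMPLE string is an illustration only, not a real file.

```python
import io
import numpy as np

# A shortened sample in the same layout as the file in the question.
SAMPLE = """version: 1
n_points:  3
{
55.866278 286.258077
54.784191 315.123248
62.148364 348.908294
}
"""

# loadtxt accepts any file-like object, so StringIO stands in for a real file.
# Everything from a comment marker to the end of the line is dropped, which
# leaves the header and brace lines empty, and empty lines are skipped.
points = np.loadtxt(
    io.StringIO(SAMPLE),
    comments=("version:", "n_points:", "{", "}"),
)
print(points.shape)  # (3, 2)
print(points.dtype)  # float64
```

Note that this approach does no validation at all: the n_points value is never compared against the number of rows actually read.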

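Or, if you are unsure about the validity of a batch of such files and want to make sure you only read valid ones, you can pre-process the file to read the header (including point count and version validation, allowing for comments and image size info):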

from pathlib import Path
from typing import Union
import numpy as np

def read_pts(filename: Union[str, bytes, Path]) -> np.ndarray:
    """Read a .PTS landmarks file into a numpy array"""
    with open(filename, 'rb') as f:
        # process the PTS header for n_rows and version information
        rows = version = None
        for line in f:
            if line.startswith(b"//"):  # comment line, skip
                continue
            header, _, value = line.strip().partition(b':')
            if not value:
                if header != b'{':
                    raise ValueError("Not a valid pts file")
                if version != 1:
                    raise ValueError(f"Not a supported PTS version: {version}")
                break
            try:
                if header == b"n_points":
                    rows = int(value)
                elif header == b"version":
                    version = float(value)  # version: 1 or version: 1.0
                elif not header.startswith(b"image_size_"):
                    # returning the image_size_* data is left as an exercise
                    # for the reader.
                    raise ValueError
            except ValueError:
                raise ValueError("Not a valid pts file")

        # if there was no n_points line, make sure the closing } line
        # is not going to trip up the numpy reader by marking it as a comment
        points = np.loadtxt(f, max_rows=rows, comments="}")

    if rows is not None and len(points) < rows:
        raise ValueError(f"Failed to load all {rows} points")
    return points

Short of providing a full test suite, this function is as production-ready as I can make it.

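This uses the n_points: line to tell np.loadtxt() how many rows to read, and it advances the file position to just past the { opener. It also exits with a ValueError if there is no version: 1 line, or if the header contains anything other than version: 1 and n_points: <int> (apart from // comment lines and image_size_* lines, which are skipped).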
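The file-position mechanism can be shown in isolation. The following is a stripped-down sketch, not the full validating function: it uses an in-memory io.BytesIO with two made-up points in place of a real file, consumes header lines until the opening brace, and then hands the same handle to np.loadtxt(), which continues reading from where the loop stopped.

```python
import io
import numpy as np

# In-memory stand-in for a file opened with open(filename, "rb").
data = io.BytesIO(b"version: 1\nn_points: 2\n{\n1.0 2.0\n3.0 4.0\n}\n")

rows = None
for line in data:
    stripped = line.strip()
    if stripped == b"{":
        # The handle is now positioned at the first data line.
        break
    key, _, value = stripped.partition(b":")
    if key == b"n_points":
        rows = int(value)  # int() accepts bytes such as b" 2"

# max_rows stops loadtxt before the closing brace; comments="}" is a
# safety net for files that lack an n_points line.
points = np.loadtxt(data, max_rows=rows, comments="}")
print(rows)    # 2
print(points)
```

Because the loop breaks as soon as it sees the brace, the header lines are never passed to np.loadtxt(), so no comment markers are needed for them.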

Both produce a 68x2 matrix of float64 values, but should be able to handle points of any dimension.
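As a quick check of the "any dimension" claim, here is a small sketch with a made-up sample that has three values per line. It also shows one edge case to be aware of: by default np.loadtxt() squeezes a single-row result down to a 1-D array, and passing ndmin=2 keeps it two-dimensional.

```python
import io
import numpy as np

markers = ("version:", "n_points:", "{", "}")

# Made-up sample with three values per line instead of two.
sample_3d = "version: 1\nn_points: 2\n{\n1.0 2.0 3.0\n4.0 5.0 6.0\n}\n"
pts_3d = np.loadtxt(io.StringIO(sample_3d), comments=markers)
print(pts_3d.shape)  # (2, 3)

# A file holding a single point collapses to 1-D unless ndmin=2 is given.
single = "version: 1\nn_points: 1\n{\n1.0 2.0\n}\n"
flat = np.loadtxt(io.StringIO(single), comments=markers)
kept_2d = np.loadtxt(io.StringIO(single), comments=markers, ndmin=2)
print(flat.shape)     # (2,)
print(kept_2d.shape)  # (1, 2)
```

If downstream code always expects an N x D array, passing ndmin=2 is the safer choice.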

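Coming back to that EOS library reference: their demo code to read the data parses the lines manually, also by first reading all lines into memory. I also found this Facebook Research PTS dataset loading code (for .pts files with 3 values per line), which is just as manual.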
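For comparison, manual parsing does not have to load the whole file first. The following is a minimal sketch of a line-by-line parser without NumPy. The name iter_pts is hypothetical and is not taken from either of the projects mentioned above, and the sketch does not validate the header at all.

```python
import io
from typing import Iterator, TextIO, Tuple

def iter_pts(lines: TextIO) -> Iterator[Tuple[float, ...]]:
    """Yield one tuple of floats per point, reading one line at a time."""
    in_body = False
    for line in lines:
        line = line.strip()
        if not in_body:
            # Skip header lines until the opening brace is seen.
            in_body = line == "{"
            continue
        if line == "}":
            return  # closing brace ends the point list
        if line:
            yield tuple(float(v) for v in line.split())

sample = io.StringIO("version: 1\nn_points: 2\n{\n1.0 2.0\n3.0 4.0\n}\n")
for point in iter_pts(sample):
    print(point)
```

With a real file, the same generator could be used as `with open(path) as f: points = list(iter_pts(f))`, where path is whatever .pts file you are reading.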

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/59591181/
