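What data type should I use in a MySQL database to store the code from two text files, given that I plan to compare them for similarity later?
The MySQL database runs on my Windows computer.
Could you also recommend an API that can compare the code for me?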
Best answer
Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 65,535. The effective maximum length of a VARCHAR is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used.
...
Values in CHAR and VARCHAR columns are sorted and compared according to the character set collation assigned to the column.
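So VARCHAR is stored inline with the table, while BLOB and TEXT types are generally stored off the table, with the row holding only the location of the data (the exact behavior depends on the storage engine and row format). Depending on how long your text is, the column can be defined as TINYTEXT, TEXT, MEDIUMTEXT, or LONGTEXT; the only difference between them is the maximum amount of data each can hold: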
- TINYTEXT 255 bytes
- TEXT 65,535 bytes
- MEDIUMTEXT 16,777,215 bytes
- LONGTEXT 4,294,967,295 bytes
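As a rough illustration of how those limits translate into a column choice, the following Python sketch picks the smallest TEXT variant that can hold a given file's contents. The helper name `smallest_text_type` is hypothetical, and the sketch assumes the text is stored as UTF-8; the limits are in bytes, not characters, so multi-byte characters count more than once. Note that the MySQL manual gives 255 bytes as the TINYTEXT limit.

```python
# Maximum payload in bytes for each MySQL TEXT variant.
TEXT_LIMITS = [
    ("TINYTEXT", 255),
    ("TEXT", 65_535),
    ("MEDIUMTEXT", 16_777_215),
    ("LONGTEXT", 4_294_967_295),
]


def smallest_text_type(content: str, encoding: str = "utf-8") -> str:
    """Return the smallest TEXT variant whose byte limit fits `content`.

    Assumption: the column's character set matches `encoding`.
    """
    size = len(content.encode(encoding))  # limits apply to bytes, not characters
    for name, limit in TEXT_LIMITS:
        if size <= limit:
            return name
    raise ValueError(f"{size} bytes exceeds the LONGTEXT limit")


if __name__ == "__main__":
    print(smallest_text_type("SELECT 1;"))      # TINYTEXT
    print(smallest_text_type("x" * 70_000))     # MEDIUMTEXT
```

For source files of unknown size, MEDIUMTEXT is usually a comfortable default, since a single source file rarely approaches 16 MB.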
To compare the two strings stored in TEXT (or any other string column), you may want to use STRCMP(expr1,expr2):
STRCMP() returns 0 if the strings are the same, -1 if the first argument is smaller than the second according to the current sort order, and 1 otherwise.
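To make that return convention concrete, here is a minimal Python sketch of the same three-way result. It is an illustration only: the function name `strcmp_like` is hypothetical, and it compares by Unicode code point, whereas MySQL's STRCMP() follows the collation in effect, so results can differ (for example, under a case-insensitive collation two strings that differ only in case compare as equal).

```python
def strcmp_like(a: str, b: str) -> int:
    """Return 0 if equal, -1 if `a` sorts before `b`, 1 otherwise.

    Assumption: plain code point ordering, not a MySQL collation.
    """
    return (a > b) - (a < b)


if __name__ == "__main__":
    print(strcmp_like("text", "text"))    # 0
    print(strcmp_like("text", "text2"))   # -1
    print(strcmp_like("text2", "text"))   # 1
```

The result only says whether the strings are identical or how they sort relative to each other; it carries no information about how similar two different strings are.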
If you specify the comparison output you need, I can edit the answer.
Edit
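To compare two strings and calculate how much they differ as a percentage, you may want to use PHP's similar_text function. As the official documentation states: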
This calculates the similarity between two strings as described in Programming Classics: Implementing the World's Best Algorithms by Oliver (ISBN 0-131-00413-1). Note that this implementation does not use a stack as in Oliver's pseudo code, but recursive calls which may or may not speed up the whole process. Note also that the complexity of this algorithm is O(N**3) where N is the length of the longest string.
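The recursive approach described in that passage can be sketched in Python as follows. This is a port written for illustration under my reading of the algorithm, not PHP's actual source: find the first longest common substring, count its length, then recurse on the parts to the left and to the right of it. The function names `similar_chars` and `similarity_percent` are hypothetical.

```python
def _longest_common_substring(a: str, b: str) -> tuple[int, int, int]:
    """Return (start_a, start_b, length) of the first longest common substring."""
    best = (0, 0, 0)
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k > best[2]:  # strictly longer, so the first match found wins ties
                best = (i, j, k)
    return best


def similar_chars(a: str, b: str) -> int:
    """Count matching characters: the common substring plus both sides, recursively."""
    i, j, n = _longest_common_substring(a, b)
    if n == 0:
        return 0
    left = similar_chars(a[:i], b[:j])
    right = similar_chars(a[i + n:], b[j + n:])
    return n + left + right


def similarity_percent(a: str, b: str) -> float:
    """Similarity as a percentage: matches * 2 / total length * 100."""
    total = len(a) + len(b)
    if total == 0:
        return 0.0
    return similar_chars(a, b) * 2 * 100 / total


if __name__ == "__main__":
    print(similar_chars("World", "Word"))                  # 4
    print(round(similarity_percent("World", "Word"), 2))   # 88.89
```

Because ties go to the first match found, swapping the arguments can change the result. The nested loops also make this slow for whole source files, consistent with the cubic complexity noted above. If the comparison does not have to run in PHP, Python's standard library offers `difflib.SequenceMatcher`, whose `ratio()` method returns a comparable similarity score between 0 and 1.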
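This answer is based on a similar question found on Stack Overflow, "php - What data type should a MySQL database use to store the code of 2 text files, if I plan to compare them for similarity later": https://stackoverflow.com/questions/34964736/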