My goal is to read an XML text file and split every word and tag onto its own line in an array.
For example, if I feed this text into my program:
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
I would get this:
<note>
<to>
Tove
</to>
<from>
...
Right now my code does this successfully, but only for the words, so instead of the list above I get:
note
to
Tove
...
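I want to keep the tags, otherwise I can't do what I want with the result. So I have been trying to get it to add the tags as well, but I keep failing.
OK, here is my code: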
//While the file is not empty
while(fgets(buffer, sizeof(buffer), stdin) != NULL){
    int first = 0;
    int last = 0;
    //While words are left in line
    while(last < INITIAL_SIZE && buffer[last] != '\0'){
        int bool = 0;
        //Tag detected
        if(buffer[last] == '<'){
            while(buffer[last] != '>'){
                last++;
            }
            bool = 1;
        }else{
            //While more chars are in the word
            while(last < INITIAL_SIZE && isalpha(buffer[last])){
                last++;
            }
        }
        //Word detected
        if(first < last){
            //Words array is full, add more space
            if(numOfWords == sizeOfWords){
                sizeOfWords = sizeOfWords + 10;
                words = (char **) realloc(words, sizeOfWords*sizeof(char *));
            }
            //Allocate memory for array
            words[numOfWords] = (char *) calloc(last-first+1, sizeof(char));
            for(i = 0; i < (last-first); i++){
                words[numOfWords][i] = buffer[first + i];
            }
            //Add terminator to "new word"
            words[numOfWords][i] = '\0';
            numOfWords++;
        }
        //Move "Array Pointers" accordingly
        last++;
        first = last;
    }
}
Does anyone have any ideas? The code above prints:
<note
<to
Tove
to
<from
Jani
from
<heading
...
Don
t
forget
me
this
weekend
</body
</note
So, after this wall of text, does anyone know how to modify my current code to make it work? Or does anyone have an alternative?
Best answer
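My basic way of thinking is this: first is the index of the first character that belongs to the current word, and last is the index of the first character that does not.
When your program detects a tag, it stops on the > without including it. The unconditional last++ at the end also has to go: the word loop already leaves last just past the word, and once the > is included in the tag, that increment would skip the first character of whatever follows. Finally, you check for \0 as the end of the string but forget to check for \n as the end of the line.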
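To make the first/last convention concrete, here is a minimal sketch in C. copy_range is a hypothetical helper that does not appear in the original code; it only illustrates that, with last pointing one past the word, the length is simply last - first and no off-by-one adjustment is needed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy buffer[first .. last-1] into a freshly allocated string.
 * copy_range is a hypothetical helper used for illustration only.
 * Returns NULL if allocation fails; the caller frees the result. */
char *copy_range(const char *buffer, size_t first, size_t last)
{
    assert(first <= last);

    /* Half-open range: the length is a plain difference. */
    size_t len = last - first;

    char *out = malloc(len + 1);
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, buffer + first, len);
    out[len] = '\0';
    return out;
}
```

With the line <to>Tove</to>, the range first = 0, last = 4 covers the whole tag <to> including the >, and first = 4, last = 8 covers the word Tove.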
Here is my solution:
while (fgets(buffer, sizeof(buffer), stdin) != NULL) {
    int first = 0;
    int last = 0;
    //While words are left in line
    while (last < INITIAL_SIZE && buffer[last] != '\0'
            && buffer[last] != '\n') { // <--------- Add this
        int Bool = 0;
        //Tag detected
        if (buffer[last] == '<') {
            //Stop at '>' or at the end of the string, whichever comes first
            while (buffer[last] != '>' && buffer[last] != '\0') {
                last++;
            }
            if (buffer[last] == '>') {
                last++; // <--------- This
            }
            Bool = 1;
        } else {
            //While more chars are in the word
            while (last < INITIAL_SIZE
                    && isalpha((unsigned char) buffer[last])) {
                last++;
            }
            //Not a letter (space, punctuation): skip it so the loop advances
            if (first == last) {
                last++;
                first = last;
                continue;
            }
        }
        //Word detected
        if (first < last) {
            //Words array is full, add more space
            if (numOfWords == sizeOfWords) {
                sizeOfWords = sizeOfWords + 10;
                words = (char **) realloc(words,
                        sizeOfWords * sizeof(char *));
            }
            //Allocate memory for array
            words[numOfWords] = (char *) calloc(last - first + 1,
                    sizeof(char));
            for (i = 0; i < (last - first); i++) {
                words[numOfWords][i] = buffer[first + i];
            }
            //Add terminator to "new word"
            words[numOfWords][i] = '\0';
            numOfWords++;
        }
        //Move "Array Pointers" accordingly
        first = last; // <--------- And change this
    }
}
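For comparison, the same splitting logic can be sketched as a standalone function that is easier to test than a loop reading from stdin. This is only an illustration under stated assumptions: struct token_list, token_list_add, tokenize_line and token_list_free are names invented for this sketch, not part of the question or the answer. The sketch skips spaces and punctuation explicitly, stops scanning a tag at the end of the line if no > is found, and checks the results of realloc and malloc.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of tokens; all names here are illustrative. */
struct token_list {
    char **items;
    size_t count;
    size_t capacity;
};

/* Append a copy of buffer[first .. last-1].
 * Returns 0 on success, -1 if an allocation fails. */
int token_list_add(struct token_list *list, const char *buffer,
                   size_t first, size_t last)
{
    if (list->count == list->capacity) {
        size_t new_capacity = list->capacity + 10;
        /* Keep the old pointer until realloc is known to have succeeded. */
        char **grown = realloc(list->items, new_capacity * sizeof(char *));
        if (grown == NULL) {
            return -1;
        }
        list->items = grown;
        list->capacity = new_capacity;
    }

    size_t len = last - first;
    char *copy = malloc(len + 1);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, buffer + first, len);
    copy[len] = '\0';
    list->items[list->count++] = copy;
    return 0;
}

/* Split one line into tags ("<...>") and alphabetic words.
 * Returns 0 on success, -1 if an allocation fails. */
int tokenize_line(struct token_list *list, const char *line)
{
    assert(list != NULL && line != NULL);

    size_t first = 0;
    while (line[first] != '\0' && line[first] != '\n') {
        size_t last = first;

        if (line[last] == '<') {
            /* Scan to the closing '>' but never past the end of the line. */
            while (line[last] != '\0' && line[last] != '\n'
                   && line[last] != '>') {
                last++;
            }
            if (line[last] == '>') {
                last++;             /* include '>' in the tag */
            }
        } else if (isalpha((unsigned char) line[last])) {
            while (isalpha((unsigned char) line[last])) {
                last++;
            }
        } else {
            first++;                /* separator: skip one character */
            continue;
        }

        if (token_list_add(list, line, first, last) != 0) {
            return -1;
        }
        first = last;               /* next token starts where this one ended */
    }
    return 0;
}

/* Release every token and the array itself. */
void token_list_free(struct token_list *list)
{
    for (size_t i = 0; i < list->count; i++) {
        free(list->items[i]);
    }
    free(list->items);
    list->items = NULL;
    list->count = 0;
    list->capacity = 0;
}
```

Starting from a zero-initialized struct token_list, calling tokenize_line on the line <body>Don't forget me this weekend!</body> leaves eight entries in the list: <body>, Don, t, forget, me, this, weekend and </body>. The apostrophe still splits Don't in two, as in the original code, because only alphabetic characters count as part of a word.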
This question about breaking up XML text in C comes from Stack Overflow: https://stackoverflow.com/questions/6785760/