c# - Splitting/looping over a row that contains parentheses in C#

Tags: c# algorithm recursion

I am working with the following row:

(SENT (VBP (HPP (HP Vem))(VB kan)(VBP (VB få)(PMP (PM ATP)))(MADP (MAD ?))))

I would like to produce the following output:

SENT -> VBP -> HPP -> HP
SENT -> VBP -> VB
SENT -> VBP -> VBP -> VB
SENT -> VBP -> VBP -> PMP -> PM
SENT -> VBP -> MADP -> MAD

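To achieve this, my first thought was to loop through each pair of parentheses, starting with the outermost one and going deeper and deeper (if there is anything deeper). (Perhaps a recursive function?)

But since there is no built-in function that splits on matching parentheses, I tried splitting on ( and then looking for ) while looping, like this: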

    var row = "(SENT (VBP (HPP (HP Vem))(VB kan)(VBP (VB få)(PMP (PM ATP)))(MADP (MAD ?))))";

    string[] splitP = row.Split('(');

    for (int i = 0; i < splitP.Length; i++ )
    {
        string data = splitP[i];

        // string[] dataSplit = data.Split(')');

        Console.WriteLine(data);
    }

    Console.ReadLine();

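But as you can see, I am stuck, and the code above does not even represent what I am trying to achieve, because I realized my approach was wrong and it cannot be done that way.

How can I achieve this?

Update.

A larger test row: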

(SENT (VBP (PPP (PP På)(NNP (NN grundval))(PPP (PP av))(NNP (DTP (DT en))(NN intervju)(PPP (PP efter)(NNP (NN experimentet)))(PPP (PP med)(PCP (DTP (DT de))(PC oinvigda)(VBP (HPP (HP som))(VB gjort)(NNP (JJP (JJ felaktiga))(NN bedömningar)))))))(VB kunde)(PNP (PN man))(VBP (VB dela)(PLP (PL in))(PNP (PN dem))(PPP (PP i)(NNP (RGP (RG tre))(NN grupper)(MIDP (MID :))(KNP (NNP (NN (a)))(PNP (PN de)(VBP (HPP (HP som))(ABP (AB faktiskt))(VB trodde)(SNP (SN att)(VBP (PNP (PN de))(VB bedömt)(ABP (AB riktigt))))))(MIDP (MID ,))(PNP (NNP (NN (b)))(PN de)(VBP (HPP (HP som))(VB trodde)(SNP (SN att)(VBP (DTP (DT de)(JJP (JJ själva)))(VB måste)(VBP (VB ha)(VBP (VB misstagit)(PNP (PN sig))(SNP (SN eftersom)(VBP (ABP (AB inte))(PNP (ABP (AB så))(PN många))(VB kan)(VBP (VB ha)(ABP (AB fel))(PPP (PP mot)(NNP (DTP (DT en))(JJP (JJ enda))(NN person))))))))))))(KN och)(PNP (NNP (NN (c)))(PN de)(KNP (VBP (HPP (HP som))(ABP (AB faktiskt))(VB var)(JJP (JJ medvetna))(PPP (PP om)(SNP (SN att)(VBP (PNP (PN de))(VB angav)(NNP (JJP (JJ felaktiga))(NN bedömningar))))))(KN men)(VBP (HPP (HP som))(ABP (AB inte))(VB ville)(VBP (VB avvika)(PPP (PP från)(NNP (NN gruppen)))))))))))(MADP (MAD .))))

Best answer

Here is another answer; I tried to keep it as easy to understand as possible:

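using System;
using System.Collections.Generic;

public class Class1
{
    public static void Main()
    {
        new Class1().myRec("(SENT (VBP (HPP (HP Vem))(VB kan)(VBP (VB få)(PMP (PM ATP)))(MADP (MAD ?))))", null);
    }

    // Prints one "A -> B -> C" line per leaf of the bracketed tree in input.
    // start holds the labels collected so far (null at the root).
    public void myRec(string input, string start)
    {
        if (input == null)
            return;
        // Not a bracketed node: input is leaf text, so the path in start is complete.
        // The length check also covers an empty leaf such as "(X )".
        if (input.Length < 2 || input[0] != '(' || input[input.Length - 1] != ')')
        {
            Console.WriteLine(start);
            return;
        }
        int count = 0;
        List<string> subStrs = new List<string>();

        // Strip the outer pair of parentheses.
        input = input.Remove(0, 1);
        input = input.Remove(input.Length - 1, 1);
        int i = input.IndexOf(' ');

        // The label is everything up to the first space, or the whole string if there is none.
        string nextInput = i > 0 ? input.Substring(0, i) : input;

        if (start != null)
            start = start + " -> " + nextInput;
        else
            start = nextInput;

        input = input.Remove(0, i + 1);

        // Cut the remainder into top-level "(...)" groups by tracking the nesting depth.
        string tempStr = "";
        for (int j = 0; j < input.Length; j++)
        {
            tempStr += input[j];
            if (input[j] == '(')
                count++;
            else if (input[j] == ')')
            {
                count--;
                if (count == 0)
                {
                    // Trim so that siblings separated by a space are still seen as nodes.
                    subStrs.Add(tempStr.Trim());
                    tempStr = "";
                }
            }
        }
        // No nested group found: treat what is left as leaf text.
        if (subStrs.Count == 0)
            subStrs.Add(tempStr);

        foreach (string it in subStrs)
            myRec(it, start);
    }
}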

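It uses recursion, and it only works when your input is well-formed, by which I mean that every ( has a matching ). Also, I am not a C# programmer, so I know this code could be improved a lot.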
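Since the recursive method relies on balanced parentheses, it may help to check the input before calling it. The following is only a minimal sketch of such a check; the names `ParenCheck` and `IsBalanced` are made up for illustration and are not part of the answer's code.

```csharp
using System;

public static class ParenCheck
{
    // Hypothetical helper: true when every '(' has a matching ')'
    // and no ')' appears before the '(' it would close.
    public static bool IsBalanced(string input)
    {
        if (input == null)
            return false;

        int depth = 0;
        foreach (char c in input)
        {
            if (c == '(')
            {
                depth++;
            }
            else if (c == ')')
            {
                depth--;
                if (depth < 0)
                    return false; // closing parenthesis with nothing open
            }
        }
        return depth == 0; // anything still open means a ')' is missing
    }

    public static void Main()
    {
        Console.WriteLine(IsBalanced("(SENT (VBP (VB kan)))")); // True
        Console.WriteLine(IsBalanced("(SENT (VBP (VB kan))"));  // False
    }
}
```

Calling `IsBalanced` first and skipping (or reporting) rows that fail is one way to avoid feeding malformed rows to `myRec`.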

Edit: replaced the arrays with lists to make the code more accurate.

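Edit 2: To make it work for input in which a node contains no space, such as (a) in the OP's new, larger test case, I made a small change. For such a node, IndexOf(' ') returns -1 and Substring(0, i) throws.

Replace this in my code:

    if (start != null)
        start = start + " -> " + input.Substring(0, i);
    else
        start = input.Substring(0, i);

with this:

    string nextInput = i > 0 ? input.Substring(0, i) : input;

    if (start != null)
        start = start + " -> " + nextInput;
    else
        start = nextInput;

(I have already done this in the code above.)

This is the output: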

SENT -> VBP -> PPP -> PP
SENT -> VBP -> PPP -> NNP -> NN
SENT -> VBP -> PPP -> PPP -> PP
SENT -> VBP -> PPP -> NNP -> DTP -> DT
SENT -> VBP -> PPP -> NNP -> NN
SENT -> VBP -> PPP -> NNP -> PPP -> PP
SENT -> VBP -> PPP -> NNP -> PPP -> NNP -> NN
SENT -> VBP -> PPP -> NNP -> PPP -> PP
SENT -> VBP -> PPP -> NNP -> PPP -> PCP -> DTP -> DT
SENT -> VBP -> PPP -> NNP -> PPP -> PCP -> PC
SENT -> VBP -> PPP -> NNP -> PPP -> PCP -> VBP -> HPP -> HP
SENT -> VBP -> PPP -> NNP -> PPP -> PCP -> VBP -> VB
SENT -> VBP -> PPP -> NNP -> PPP -> PCP -> VBP -> NNP -> JJP -> JJ
SENT -> VBP -> PPP -> NNP -> PPP -> PCP -> VBP -> NNP -> NN
SENT -> VBP -> VB
SENT -> VBP -> PNP -> PN
SENT -> VBP -> VBP -> VB
SENT -> VBP -> VBP -> PLP -> PL
SENT -> VBP -> VBP -> PNP -> PN
SENT -> VBP -> VBP -> PPP -> PP
SENT -> VBP -> VBP -> PPP -> NNP -> RGP -> RG
SENT -> VBP -> VBP -> PPP -> NNP -> NN
SENT -> VBP -> VBP -> PPP -> NNP -> MIDP -> MID
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> NNP -> NN -> a
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> PN
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> HPP -> HP
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> ABP -> AB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> VB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> SN
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> PNP -> PN
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> ABP -> AB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> MIDP -> MID
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> NNP -> NN -> b
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> PN
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> HPP -> HP
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> VB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> SN
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> DTP -> DT
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> DTP -> JJP -> JJ
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VBP -> VB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VBP -> VBP -> VB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VBP -> VBP -> PNP -> PN
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VBP -> VBP -> SNP -> SN
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VBP -> VBP -> SNP -> VBP -> ABP -> AB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VBP -> VBP -> SNP -> VBP -> PNP -> ABP -> AB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VBP -> VBP -> SNP -> VBP -> PNP -> PN
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VBP -> VBP -> SNP -> VBP -> VB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VBP -> VBP -> SNP -> VBP -> VBP -> VB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VBP -> VBP -> SNP -> VBP -> VBP -> ABP -> AB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VBP -> VBP -> SNP -> VBP -> VBP -> PPP -> PP
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VBP -> VBP -> SNP -> VBP -> VBP -> PPP -> NNP -> DTP -> DT
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VBP -> VBP -> SNP -> VBP -> VBP -> PPP -> NNP -> JJP -> JJ
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> VBP -> SNP -> VBP -> VBP -> VBP -> SNP -> VBP -> VBP -> PPP -> NNP -> NN
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> KN
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> NNP -> NN -> c
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> PN
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> HPP -> HP
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> ABP -> AB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> VB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> JJP -> JJ
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> PPP -> PP
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> PPP -> SNP -> SN
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> PPP -> SNP -> VBP -> PNP -> PN
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> PPP -> SNP -> VBP -> VB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> PPP -> SNP -> VBP -> NNP -> JJP -> JJ
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> PPP -> SNP -> VBP -> NNP -> NN
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> KN
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> HPP -> HP
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> ABP -> AB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> VB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> VBP -> VB
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> VBP -> PPP -> PP
SENT -> VBP -> VBP -> PPP -> NNP -> KNP -> PNP -> KNP -> VBP -> VBP -> PPP -> NNP -> NN
SENT -> VBP -> MADP -> MAD
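
As a possible alternative to the recursive substring approach, the same kind of listing can be built in a single pass over the characters, keeping the currently open labels on a stack. This is only a sketch under assumptions: `TreePaths` and `GetPaths` are hypothetical names, a label is taken to end at the first space or parenthesis, and a node counts as a leaf when it contains no nested `(`. For the short row from the question it should give the five lines the OP asked for.

```csharp
using System;
using System.Collections.Generic;

public static class TreePaths
{
    // Hypothetical helper: returns one "A -> B -> C" string per leaf node.
    public static List<string> GetPaths(string input)
    {
        var paths = new List<string>();
        var labels = new List<string>(); // labels from the root down to the open node
        var hasChild = new List<bool>(); // whether each open node contains a nested '('

        int i = 0;
        while (i < input.Length)
        {
            char c = input[i];
            if (c == '(')
            {
                // The enclosing node, if any, is no longer a leaf.
                if (hasChild.Count > 0)
                    hasChild[hasChild.Count - 1] = true;

                // Read the label: everything up to a space or a parenthesis.
                int start = ++i;
                while (i < input.Length && input[i] != ' ' && input[i] != '(' && input[i] != ')')
                    i++;
                labels.Add(input.Substring(start, i - start));
                hasChild.Add(false);
                continue; // input[i] has not been examined yet
            }
            if (c == ')')
            {
                if (labels.Count == 0)
                    throw new FormatException("Unmatched ')' at position " + i);
                // Closing a node without children: its path is complete.
                if (!hasChild[hasChild.Count - 1])
                    paths.Add(string.Join(" -> ", labels));
                labels.RemoveAt(labels.Count - 1);
                hasChild.RemoveAt(hasChild.Count - 1);
            }
            i++; // leaf text and spaces are skipped
        }

        if (labels.Count != 0)
            throw new FormatException("Missing ')' for node " + labels[labels.Count - 1]);
        return paths;
    }

    public static void Main()
    {
        string row = "(SENT (VBP (HPP (HP Vem))(VB kan)(VBP (VB få)(PMP (PM ATP)))(MADP (MAD ?))))";
        foreach (string path in GetPaths(row))
            Console.WriteLine(path);
    }
}
```

Unlike the recursive version, this sketch throws a `FormatException` on unbalanced input instead of assuming the row is well-formed, and it avoids building intermediate substrings for every nesting level.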

Regarding "c# - Splitting/looping over a row that contains parentheses in C#", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/27686014/
