perl - 使用 Perl 确定非重叠位置

我有一个位置集合 - 这是数据结构的示例。

my $locations =
{
  loc_1 =>
  {
    start => 1,
    end   => 193,
  },
  loc_2 =>
  {
    start => 180,
    end   => 407,
  },
  loc_3 =>
  {
    start => 329,
    end   => 684,
  },
  loc_4 =>
  {
    start => 651,
    end   => 720,
  },
};

确定非重叠位置的每种可能组合的最佳方法是什么？这个例子的答案看起来像这样。请记住，可能有一个或多个位置，并且这些位置可能重叠也可能不重叠。

my $non_overlapping_locations =
[
  {
    loc_1 =>
    {
      start => 1,
      end   => 193,
    },
    loc_3 =>
    {
      start => 329,
      end   => 684,
    },
  },
  {
    loc_1 =>
    {
      start => 1,
      end   => 193,
    },
    loc_4 =>
    {
      start => 651,
      end   => 720,
    },
  },
  {
    loc_2 =>
    {
      start => 180,
      end   => 407,
    },
    loc_4 =>
    {
      start => 651,
      end   => 720,
    },
  }
];

更新:ysth的回复帮助我发现了措辞中的缺陷。我想我对//所有可能的//非重叠位置的组合不感兴趣，我只对不属于其他解决方案子集的解决方案感兴趣。

最佳答案

use strict;
use warnings;

my $locations =
{
  loc_1 =>
  {
    start => 1,
    end   => 193,
  },
  loc_2 =>
  {
    start => 180,
    end   => 407,
  },
  loc_3 =>
  {
    start => 329,
    end   => 684,
  },
  loc_4 =>
  {
    start => 651,
    end   => 720,
  },
};
my $non_overlapping_locations = [];
my @locations = sort keys %$locations;

get_location_combinations( $locations, $non_overlapping_locations, [], @locations );

use Data::Dumper;
print Data::Dumper::Dumper($non_overlapping_locations);

sub get_location_combinations {
    my ($locations, $results, $current, @remaining) = @_;

    if ( ! @remaining ) {
        if ( not_a_subset_combination( $results, $current ) ) {
            push @$results, $current;
        }
    }
    else {
        my $next = shift @remaining;
        if (can_add_location( $locations, $current, $next )) {
            get_location_combinations( $locations, $results, [ @$current, $next ], @remaining );
        }
        get_location_combinations( $locations, $results, [ @$current ], @remaining );
    }
}

sub can_add_location {
    my ($locations, $current, $candidate) = @_;

    # not clear if == is an overlap; modify to use >=  and <= if so.
    0 == grep $locations->{$candidate}{end} > $locations->{$_}{start} && $locations->{$candidate}{start} < $locations->{$_}{end}, @$current;
}

sub not_a_subset_combination {
    my ($combinations, $candidate) = @_;

    for my $existing (@$combinations) {
        my %candidate;
        @candidate{@$candidate} = ();
        delete @candidate{@$existing};
        if ( 0 == keys %candidate ) {
            return 0;
        }
    }
    return 1;
}

一个相对简单的优化是按开始和结束对 @locations 进行排序，并预先计算并存储在每个位置的哈希值中(或仅存储在 $locations->{foo} 中)，以下位置中有多少个与以下位置发生冲突那个位置。然后在 can_add... 的情况下，在递归之前将该数字与 @remaining 拼接起来。

或者为每个位置预先计算所有后续冲突位置的哈希值，并在递归之前使用 grep 将它们全部删除。 (尽管采用这种方法，保留哈希值开始变得更有意义。)

更新:解决方案的另一种方法是构建要排除的位置树，其中叶子代表解决方案，内部节点代表仍然存在冲突的组合；顶部节点是所有位置，每个节点都有子节点，这些子节点表示删除剩余冲突位置之一，该位置(在某种任意排序方案中)大于父节点(如果有)删除的位置。

关于perl - 使用 Perl 确定非重叠位置，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/4627618/

perl - 使用 Perl 确定非重叠位置

上一篇：hibernate - 我如何使用 JPA/Hibernate 检索实体列表 + 附加值？

下一篇：.net - 如何在 Hudson-CI 上运行 MsTest 测试？