regex - 设计 DFA 接受可被数字 'n' 整除的二进制字符串

标签 regex automata dfa

我需要学习如何设计一个 DFA,以便给定任何数字“n”,它接受二进制字符串 {0, 1},其十进制等效数可被“n”整除。

不同的“n”会有不同的 DFA,但有人可以给出一个基本方法,我应该遵循该方法来处理任何数字 0 < n < 10 。

最佳答案

下面,我写了 n 等于 5 的答案,但您可以应用相同的方法为 n 的任何值和“任何位置数字系统”(例如二进制、三元)绘制 DFA...

首先了解一下“完全DFA”这个词,δ:Q × Σ→Q 中的DFA defined on complete domain 称为“完全DFA”。换句话说,我们可以说;在完整 DFA 的转换图中,不存在缺失边(例如,从 Q 中的每个状态来看,Σ 中的每个语言符号都存在一个传出边)。注意:有时我们将 partial DFA 定义为 δ ⊆ Q × Σ→Q (阅读:How does “δ:Q × Σ→Q” read in the definition of a DFA )。

设计 DFA 接受可被数字“n”整除的二进制数:

第 1 步:当您将数字 ω 除以 n 时,提醒可以是 0、1、...、(n - 2) 或 (n - 1)。如果余数是 0 则意味着 ω 可以被 n 整除,否则不能。因此,在我的 DFA 中,将有一个状态 qr ,它对应于余数值 r ,其中 0 <= r <= (n - 1) ,DFA 中的状态总数为 n
在 Σ 上处理数字串 ω 后,最终状态为 qr 意味着 ω % n => r(% 提醒运算符)。

在任何自动机中,状态的目的就像内存元素。 Atomata 中的状态存储一些信息,例如风扇开关,可以判断风扇是处于“关闭”状态还是“打开”状态。对于n=5,DFA中的5种状态对应5条提醒信息如下:

  1. 如果提醒为 0,则达到状态 q0。状态 q0 是最终状态(接受状态)。这也是一个初始状态。
  2. 如果提醒为 1(非最终状态),则达到状态 q1
  3. 如果提醒为 2(非最终状态),则状态 q2
  4. 如果提醒为 3(非最终状态),则状态 q3
  5. 如果提醒为 4(非最终状态),则状态 q4

利用以上信息,我们可以开始绘制五种状态的转移图TD,如下:

fig-1
Figure-1

因此,5 个状态代表 5 个余值。处理字符串 ω 后,如果最终状态变为 q0,则表示输入字符串的十进制等值可被 5 整除。在上图中,q0 被标记为两个同心圆的最终状态圆圈。
此外,我还定义了一个转换规则 δ:(q0, 0)→q0 作为符号 '0' 在状态 q0 的自循环>,这是因为任何仅由 '0' 组成的字符串的十进制等效值都是 0,并且 0 可以被 n 整除。

第2步:上面的TD不完整;并且只能处理 '0' 的字符串。现在添加更多边,以便它可以处理后续数字的字符串。检查下表,显示下一步可以添加的新转换规则:

┌──────┬──────┬─────────────┬─────────┐
│NumberBinaryRemainder(%5)End-state│
├──────┼──────┼─────────────┼─────────┤
│One   │1     │1            │q1       │
├──────┼──────┼─────────────┼─────────┤
│Two   │10    │2            │q2       │
├──────┼──────┼─────────────┼─────────┤
│Three │11    │3            │q3       │
├──────┼──────┼─────────────┼─────────┤
│Four  │100   │4            │q4       │
└──────┴──────┴─────────────┴─────────┘
  1. To process binary string '1' there should be a transition rule δ:(q0, 1)→q1
  2. Two:- binary representation is '10', end-state should be q2, and to process '10', we just need to add one more transition rule δ:(q1, 0)→q2
    Path: →(q0)─1→(q1)─0→(q2)
  3. Three:- in binary it is '11', end-state is q3, and we need to add a transition rule δ:(q1, 1)→q3
    Path: →(q0)─1→(q1)─1→(q3)
  4. Four:- in binary '100', end-state is q4. TD already processes prefix string '10' and we just need to add a new transition rule δ:(q2, 0)→q4
    Path: →(q0)─1→(q1)─0→(q2)─0→(q4)

fig-2 Figure-2

Step-3: Five = 101
Above transition diagram in figure-2 is still incomplete and there are many missing edges, for an example no transition is defined for δ:(q2, 1)-?. And the rule should be present to process strings like '101'.
Because '101' = 5 is divisible by 5, and to accept '101' I will add δ:(q2, 1)→q0 in above figure-2.
Path: →(q0)─1→(q1)─0→(q2)─1→(q0)
with this new rule, transition diagram becomes as follows:

fig-3 Figure-3

Below in each step I pick next subsequent binary number to add a missing edge until I get TD as a 'complete DFA'.

Step-4: Six = 110.

We can process '11' in present TD in figure-3 as: →(q0)─11→(q3) ─0→(?). Because 6 % 5 = 1 this means to add one rule δ:(q3, 0)→q1.

fig-4 Figure-4

Step-5: Seven = 111

┌──────┬──────┬─────────────┬─────────┬────────────┬───────────┐
│NumberBinaryRemainder(%5)End-state PathAdd       │
├──────┼──────┼─────────────┼─────────┼────────────┼───────────┤
│Seven │111   │7 % 5 = 2    │q2       │ q0─11→q3    q3─1→q2    │
└──────┴──────┴─────────────┴─────────┴────────────┴───────────┘

fig-5 Figure-5

Step-6: Eight = 1000

┌──────┬──────┬─────────────┬─────────┬──────────┬─────────┐
│NumberBinaryRemainder(%5)End-state PathAdd     │
├──────┼──────┼─────────────┼─────────┼──────────┼─────────┤
│Eight │1000  │8 % 5 = 3    │q3       │q0─100→q4 │ q4─0→q3  │
└──────┴──────┴─────────────┴─────────┴──────────┴─────────┘

fig-6 Figure-6

Step-7: Nine = 1001

┌──────┬──────┬─────────────┬─────────┬──────────┬─────────┐
│NumberBinaryRemainder(%5)End-state PathAdd     │
├──────┼──────┼─────────────┼─────────┼──────────┼─────────┤
│Nine  │1001  │9 % 5 = 4    │q4       │q0─100→q4 │ q4─1→q4  │
└──────┴──────┴─────────────┴─────────┴──────────┴─────────┘

fig-7 Figure-7

In TD-7, total number of edges are 10 == Q × Σ = 5 × 2. And it is a complete DFA that can accept all possible binary strings those decimal equivalent is divisible by 5.

Design DFA accepting Ternary numbers divisible by number n:

Step-1 Exactly same as for binary, use figure-1.

Step-2 Add Zero, One, Two

┌───────┬───────┬─────────────┬─────────┬──────────────┐
│DecimalTernaryRemainder(%5)End-state   Add        │
├───────┼───────┼─────────────┼─────────┼──────────────┤
│Zero   │0      │0            │q0       │ δ:(q0,0)→q0  │
├───────┼───────┼─────────────┼─────────┼──────────────┤
│One    │1      │1            │q1       │ δ:(q0,1)→q1  │
├───────┼───────┼─────────────┼─────────┼──────────────┤
│Two    │2      │2            │q2       │ δ:(q0,2)→q3  │
└───────┴───────┴─────────────┴─────────┴──────────────┘

fig-8
Figure-8

Step-3 Add Three, Four, Five

┌───────┬───────┬─────────────┬─────────┬─────────────┐
│DecimalTernaryRemainder(%5)End-stateAdd        │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Three  │10     │3            │q3       │ δ:(q1,0)→q3 │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Four   │11     │4            │q4       │ δ:(q1,1)→q4 │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Five   │12     │0            │q0       │ δ:(q1,2)→q0 │
└───────┴───────┴─────────────┴─────────┴─────────────┘

fig-9
Figure-9

Step-4 Add Six, Seven, Eight

┌───────┬───────┬─────────────┬─────────┬─────────────┐
│DecimalTernaryRemainder(%5)End-stateAdd        │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Six    │20     │1            │q1       │ δ:(q2,0)→q1 │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Seven  │21     │2            │q2       │ δ:(q2,1)→q2 │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Eight  │22     │3            │q3       │ δ:(q2,2)→q3 │
└───────┴───────┴─────────────┴─────────┴─────────────┘

fig-10
Figure-10

Step-5 Add Nine, Ten, Eleven

┌───────┬───────┬─────────────┬─────────┬─────────────┐
│DecimalTernaryRemainder(%5)End-stateAdd        │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Nine   │100    │4            │q4       │ δ:(q3,0)→q4 │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Ten    │101    │0            │q0       │ δ:(q3,1)→q0 │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Eleven │102    │1            │q1       │ δ:(q3,2)→q1 │
└───────┴───────┴─────────────┴─────────┴─────────────┘

fig-11
Figure-11

Step-6 Add Twelve, Thirteen, Fourteen

┌────────┬───────┬─────────────┬─────────┬─────────────┐
│DecimalTernaryRemainder(%5)End-stateAdd        │
├────────┼───────┼─────────────┼─────────┼─────────────┤
│Twelve  │110    │2            │q2       │ δ:(q4,0)→q2 │
├────────┼───────┼─────────────┼─────────┼─────────────┤
│Thirteen│111    │3            │q3       │ δ:(q4,1)→q3 │
├────────┼───────┼─────────────┼─────────┼─────────────┤
│Fourteen│112    │4            │q4       │ δ:(q4,2)→q4 │
└────────┴───────┴─────────────┴─────────┴─────────────┘

fig-12
Figure-12

Total number of edges in transition diagram figure-12 are 15 = Q × Σ = 5 * 3 (a complete DFA). And this DFA can accept all strings consist over {0, 1, 2} those decimal equivalent is divisible by 5.
If you notice at each step, in table there are three entries because at each step I add all possible outgoing edge from a state to make a complete DFA (and I add an edge so that qr state gets for remainder is r)!

To add further, remember union of two regular languages are also a regular. If you need to design a DFA that accepts binary strings those decimal equivalent is either divisible by 3 or 5, then draw two separate DFAs for divisible by 3 and 5 then union both DFAs to construct target DFA (for 1 <= n <= 10 your have to union 10 DFAs).

If you are asked to draw DFA that accepts binary strings such that decimal equivalent is divisible by 5 and 3 both then you are looking for DFA of divisible by 15 ( but what about 6 and 8?).

Note: DFAs drawn with this technique will be minimized DFA only when there is no common factor between number n and base e.g. there is no between 5 and 2 in first example, or between 5 and 3 in second example, hence both DFAs constructed above are minimized DFAs. If you are interested to read further about possible mini states for number n and base b read paper: Divisibility and State Complexity.

below I have added a Python script, I written it for fun while learning Python library pygraphviz. I am adding it I hope it can be helpful for someone in someway.

Design DFA for base 'b' number strings divisible by number 'n':

So we can apply above trick to draw DFA to recognize number strings in any base 'b' those are divisible a given number 'n'. In that DFA total number of states will be n (for n remainders) and number of edges should be equal to 'b' * 'n' — that is complete DFA: 'b' = number of symbols in language of DFA and 'n' = number of states.

Using above trick, below I have written a Python Script to Draw DFA for input base and number. In script, function divided_by_N populates DFA's transition rules in base * number steps. In each step-num, I convert num into number string num_s using function baseN(). To avoid processing each number string, I have used a temporary data-structure lookup_table. In each step, end-state for number string num_s is evaluated and stored in lookup_table to use in next step.

For transition graph of DFA, I have written a function draw_transition_graph using Pygraphviz library (very easy to use). To use this script you need to install graphviz. To add colorful edges in transition diagram, I randomly generates color codes for each symbol get_color_dict function.

#!/usr/bin/env python
import pygraphviz as pgv
from pprint import pprint
from random import choice as rchoice

def baseN(n, b, syms="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
    """ converts a number `n` into base `b` string """
    return ((n == 0) and syms[0]) or (
        baseN(n//b, b, syms).lstrip(syms[0]) + syms[n % b])

def divided_by_N(number, base):
    """
    constructs DFA that accepts given `base` number strings
    those are divisible by a given `number`
    """
    ACCEPTING_STATE = START_STATE = '0'
    SYMBOL_0 = '0'
    dfa = {
        str(from_state): {
            str(symbol): 'to_state' for symbol in range(base)
        }
        for from_state in range(number)
    }
    dfa[START_STATE][SYMBOL_0] = ACCEPTING_STATE
    # `lookup_table` keeps track: 'number string' -->[dfa]--> 'end_state'
    lookup_table = { SYMBOL_0: ACCEPTING_STATE }.setdefault
    for num in range(number * base):
        end_state = str(num % number)
        num_s = baseN(num, base)
        before_end_state = lookup_table(num_s[:-1], START_STATE)
        dfa[before_end_state][num_s[-1]] = end_state
        lookup_table(num_s, end_state)
    return dfa

def symcolrhexcodes(symbols):
    """
    returns dict of color codes mapped with alphabets symbol in symbols
    """
    return {
        symbol: '#'+''.join([
            rchoice("8A6C2B590D1F4E37") for _ in "FFFFFF"
        ])
        for symbol in symbols
    }

def draw_transition_graph(dfa, filename="filename"):
    ACCEPTING_STATE = START_STATE = '0'
    colors = symcolrhexcodes(dfa[START_STATE].keys())
    # draw transition graph
    tg = pgv.AGraph(strict=False, directed=True, decorate=True)
    for from_state in dfa:
        for symbol, to_state in dfa[from_state].iteritems():
            tg.add_edge("Q%s"%from_state, "Q%s"%to_state,
                        label=symbol, color=colors[symbol],
                        fontcolor=colors[symbol])

    # add intial edge from an invisible node!
    tg.add_node('null', shape='plaintext', label='start')
    tg.add_edge('null', "Q%s"%START_STATE,)

    # make end acception state as 'doublecircle'
    tg.get_node("Q%s"%ACCEPTING_STATE).attr['shape'] = 'doublecircle'
    tg.draw(filename, prog='circo')
    tg.close()

def print_transition_table(dfa):
    print("DFA accepting number string in base '%(base)s' "
            "those are divisible by '%(number)s':" % {
                'base': len(dfa['0']),
                'number': len(dfa),})
    pprint(dfa)

if __name__ == "__main__":
    number = input ("Enter NUMBER: ")
    base = input ("Enter BASE of number system: ")
    dfa = divided_by_N(number, base)

    print_transition_table(dfa)
    draw_transition_graph(dfa)

执行:

~/study/divide-5/script$ python script.py 
Enter NUMBER: 5
Enter BASE of number system: 4
DFA accepting number string in base '4' those are divisible by '5':
{'0': {'0': '0', '1': '1', '2': '2', '3': '3'},
 '1': {'0': '4', '1': '0', '2': '1', '3': '2'},
 '2': {'0': '3', '1': '4', '2': '0', '3': '1'},
 '3': {'0': '2', '1': '3', '2': '4', '3': '0'},
 '4': {'0': '1', '1': '2', '2': '3', '3': '4'}}
~/study/divide-5/script$ ls
script.py filename.png
~/study/divide-5/script$ display filename

输出:

base_4_divided_5_best
DFA accepting number strings in base 4 those are divisible by 5

同样,输入base=4和number=7生成-dfa accepting number string in base '4' those are divisible by '7'
顺便说一句,尝试将 filename 更改为 .png.jpeg

<子> 引用我用来编写此脚本的内容:
"convert integer to a string in a given numeric base in python" 中的函数 baseN
➋ 安装“pygraphviz”:"Python does not see pygraphviz"
➌ 学习 Pygraphviz 的使用:"Python-FSM"
➍ 为每个语言符号生成随机的十六进制颜色代码:"How would I make a random hexdigit code generator using .join and for loops?"

关于regex - 设计 DFA 接受可被数字 'n' 整除的二进制字符串,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/21897554/

相关文章:

state - 将 Epsilon-NFA 转换为 NFA

automata - 设计接受可被 7 整除的十进制字符串的 DFA

compiler-construction - nfa 与 dfa 的时间复杂度权衡

regex - 将调节表达式转换为 DFA

python - 正则表达式 GUI?

c - C 中的正则表达式不起作用(我找不到原因)

javascript - 带字符转义的字符范围

regex - 常规语言(是或否)

regex - 用 Notepad++ 替换前导空格

regular-language - 为给定的正则表达式绘制最小 DFA