regex - 设计 DFA 接受可被数字 'n' 整除的二进制字符串

我需要学习如何设计一个 DFA，以便给定任何数字“n”，它接受二进制字符串 {0, 1}，其十进制等效数可被“n”整除。

不同的“n”会有不同的 DFA，但有人可以给出一个基本方法，我应该遵循该方法来处理任何数字 0 < n < 10 。

最佳答案

下面，我写了 n 等于 5 的答案，但您可以应用相同的方法为 n 的任何值和“任何位置数字系统”(例如二进制、三元)绘制 DFA...

首先了解一下“完全DFA”这个词，δ:Q × Σ→Q 中的DFA defined on complete domain 称为“完全DFA”。换句话说，我们可以说；在完整 DFA 的转换图中，不存在缺失边(例如，从 Q 中的每个状态来看，Σ 中的每个语言符号都存在一个传出边)。注意:有时我们将 partial DFA 定义为 δ ⊆ Q × Σ→Q (阅读:How does “δ:Q × Σ→Q” read in the definition of a DFA )。

设计 DFA 接受可被数字“n”整除的二进制数:

第 1 步:当您将数字 ω 除以 n 时，提醒可以是 0、1、...、(n - 2) 或 (n - 1)。如果余数是 0 则意味着 ω 可以被 n 整除，否则不能。因此，在我的 DFA 中，将有一个状态 q_r ，它对应于余数值 r ，其中 0 <= r <= (n - 1) ，DFA 中的状态总数为 n 。
在 Σ 上处理数字串 ω 后，最终状态为 q_r 意味着 ω % n => r(% 提醒运算符)。

在任何自动机中，状态的目的就像内存元素。 Atomata 中的状态存储一些信息，例如风扇开关，可以判断风扇是处于“关闭”状态还是“打开”状态。对于n=5，DFA中的5种状态对应5条提醒信息如下:

如果提醒为 0，则达到状态 q₀。状态 q₀ 是最终状态(接受状态)。这也是一个初始状态。
如果提醒为 1(非最终状态)，则达到状态 q₁。
如果提醒为 2(非最终状态)，则状态 q₂。
如果提醒为 3(非最终状态)，则状态 q₃。
如果提醒为 4(非最终状态)，则状态 q₄。

利用以上信息，我们可以开始绘制五种状态的转移图TD，如下:

Figure-1

因此，5 个状态代表 5 个余值。处理字符串 ω 后，如果最终状态变为 q₀，则表示输入字符串的十进制等值可被 5 整除。在上图中，q₀ 被标记为两个同心圆的最终状态圆圈。
此外，我还定义了一个转换规则 δ:(q₀, 0)→q₀ 作为符号 '0' 在状态 q₀ 的自循环>，这是因为任何仅由 '0' 组成的字符串的十进制等效值都是 0，并且 0 可以被 n 整除。

第2步:上面的TD不完整；并且只能处理 '0' 的字符串。现在添加更多边，以便它可以处理后续数字的字符串。检查下表，显示下一步可以添加的新转换规则:
┌──────┬──────┬─────────────┬─────────┐
│Number│Binary│Remainder(%5)│End-state│
├──────┼──────┼─────────────┼─────────┤
│One   │1     │1            │q₁       │
├──────┼──────┼─────────────┼─────────┤
│Two   │10    │2            │q₂       │
├──────┼──────┼─────────────┼─────────┤
│Three │11    │3            │q₃       │
├──────┼──────┼─────────────┼─────────┤
│Four  │100   │4            │q₄       │
└──────┴──────┴─────────────┴─────────┘
To process binary string '1' there should be a transition rule δ:(q₀, 1)→q₁

Two:- binary representation is '10', end-state should be q₂, and to process '10', we just need to add one more transition rule δ:(q₁, 0)→q₂
Path: →(q₀)─1→(q₁)─0→(q₂)

Three:- in binary it is '11', end-state is q₃, and we need to add a transition rule δ:(q₁, 1)→q₃
Path: →(q₀)─1→(q₁)─1→(q₃)

Four:- in binary '100', end-state is q₄. TD already processes prefix string '10' and we just need to add a new transition rule δ:(q₂, 0)→q₄
Path: →(q₀)─1→(q₁)─0→(q₂)─0→(q₄)

Figure-2

Step-3: Five = 101
Above transition diagram in figure-2 is still incomplete and there are many missing edges, for an example no transition is defined for δ:(q₂, 1)-?. And the rule should be present to process strings like '101'.
Because '101' = 5 is divisible by 5, and to accept '101' I will add δ:(q₂, 1)→q₀ in above figure-2.
Path: →(q₀)─1→(q₁)─0→(q₂)─1→(q₀)
with this new rule, transition diagram becomes as follows:

Figure-3

Below in each step I pick next subsequent binary number to add a missing edge until I get TD as a 'complete DFA'.

Step-4: Six = 110.

We can process '11' in present TD in figure-3 as: →(q₀)─11→(q₃) ─0→(?). Because 6 % 5 = 1 this means to add one rule δ:(q₃, 0)→q₁.

Figure-4

Step-5: Seven = 111
┌──────┬──────┬─────────────┬─────────┬────────────┬───────────┐
│Number│Binary│Remainder(%5)│End-state│ Path       │ Add       │
├──────┼──────┼─────────────┼─────────┼────────────┼───────────┤
│Seven │111   │7 % 5 = 2    │q₂       │ q₀─11→q₃   │ q₃─1→q₂    │
└──────┴──────┴─────────────┴─────────┴────────────┴───────────┘
Figure-5

Step-6: Eight = 1000
┌──────┬──────┬─────────────┬─────────┬──────────┬─────────┐
│Number│Binary│Remainder(%5)│End-state│ Path     │ Add     │
├──────┼──────┼─────────────┼─────────┼──────────┼─────────┤
│Eight │1000  │8 % 5 = 3    │q₃       │q₀─100→q₄ │ q₄─0→q₃  │
└──────┴──────┴─────────────┴─────────┴──────────┴─────────┘
Figure-6

Step-7: Nine = 1001
┌──────┬──────┬─────────────┬─────────┬──────────┬─────────┐
│Number│Binary│Remainder(%5)│End-state│ Path     │ Add     │
├──────┼──────┼─────────────┼─────────┼──────────┼─────────┤
│Nine  │1001  │9 % 5 = 4    │q₄       │q₀─100→q₄ │ q₄─1→q₄  │
└──────┴──────┴─────────────┴─────────┴──────────┴─────────┘
Figure-7

In TD-7, total number of edges are 10 == Q × Σ = 5 × 2. And it is a complete DFA that can accept all possible binary strings those decimal equivalent is divisible by 5.

Design DFA accepting Ternary numbers divisible by number n:

Step-1 Exactly same as for binary, use figure-1.

Step-2 Add Zero, One, Two
┌───────┬───────┬─────────────┬─────────┬──────────────┐
│Decimal│Ternary│Remainder(%5)│End-state│   Add        │
├───────┼───────┼─────────────┼─────────┼──────────────┤
│Zero   │0      │0            │q0       │ δ:(q0,0)→q0  │
├───────┼───────┼─────────────┼─────────┼──────────────┤
│One    │1      │1            │q1       │ δ:(q0,1)→q1  │
├───────┼───────┼─────────────┼─────────┼──────────────┤
│Two    │2      │2            │q2       │ δ:(q0,2)→q3  │
└───────┴───────┴─────────────┴─────────┴──────────────┘
Figure-8

Step-3 Add Three, Four, Five
┌───────┬───────┬─────────────┬─────────┬─────────────┐
│Decimal│Ternary│Remainder(%5)│End-state│  Add        │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Three  │10     │3            │q3       │ δ:(q1,0)→q3 │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Four   │11     │4            │q4       │ δ:(q1,1)→q4 │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Five   │12     │0            │q0       │ δ:(q1,2)→q0 │
└───────┴───────┴─────────────┴─────────┴─────────────┘
Figure-9

Step-4 Add Six, Seven, Eight
┌───────┬───────┬─────────────┬─────────┬─────────────┐
│Decimal│Ternary│Remainder(%5)│End-state│  Add        │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Six    │20     │1            │q1       │ δ:(q2,0)→q1 │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Seven  │21     │2            │q2       │ δ:(q2,1)→q2 │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Eight  │22     │3            │q3       │ δ:(q2,2)→q3 │
└───────┴───────┴─────────────┴─────────┴─────────────┘
Figure-10

Step-5 Add Nine, Ten, Eleven
┌───────┬───────┬─────────────┬─────────┬─────────────┐
│Decimal│Ternary│Remainder(%5)│End-state│  Add        │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Nine   │100    │4            │q4       │ δ:(q3,0)→q4 │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Ten    │101    │0            │q0       │ δ:(q3,1)→q0 │
├───────┼───────┼─────────────┼─────────┼─────────────┤
│Eleven │102    │1            │q1       │ δ:(q3,2)→q1 │
└───────┴───────┴─────────────┴─────────┴─────────────┘
Figure-11

Step-6 Add Twelve, Thirteen, Fourteen
┌────────┬───────┬─────────────┬─────────┬─────────────┐
│Decimal │Ternary│Remainder(%5)│End-state│  Add        │
├────────┼───────┼─────────────┼─────────┼─────────────┤
│Twelve  │110    │2            │q2       │ δ:(q4,0)→q2 │
├────────┼───────┼─────────────┼─────────┼─────────────┤
│Thirteen│111    │3            │q3       │ δ:(q4,1)→q3 │
├────────┼───────┼─────────────┼─────────┼─────────────┤
│Fourteen│112    │4            │q4       │ δ:(q4,2)→q4 │
└────────┴───────┴─────────────┴─────────┴─────────────┘
Figure-12

Total number of edges in transition diagram figure-12 are 15 = Q × Σ = 5 * 3 (a complete DFA). And this DFA can accept all strings consist over {0, 1, 2} those decimal equivalent is divisible by 5.
If you notice at each step, in table there are three entries because at each step I add all possible outgoing edge from a state to make a complete DFA (and I add an edge so that q_r state gets for remainder is r)!

To add further, remember union of two regular languages are also a regular. If you need to design a DFA that accepts binary strings those decimal equivalent is either divisible by 3 or 5, then draw two separate DFAs for divisible by 3 and 5 then union both DFAs to construct target DFA (for 1 <= n <= 10 your have to union 10 DFAs).

If you are asked to draw DFA that accepts binary strings such that decimal equivalent is divisible by 5 and 3 both then you are looking for DFA of divisible by 15 ( but what about 6 and 8?).

Note: DFAs drawn with this technique will be minimized DFA only when there is no common factor between number n and base e.g. there is no between 5 and 2 in first example, or between 5 and 3 in second example, hence both DFAs constructed above are minimized DFAs. If you are interested to read further about possible mini states for number n and base b read paper: Divisibility and State Complexity.

^{below I have added a Python script, I written it for fun while learning Python library pygraphviz. I am adding it I hope it can be helpful for someone in someway.}

Design DFA for base 'b' number strings divisible by number 'n':

So we can apply above trick to draw DFA to recognize number strings in any base 'b' those are divisible a given number 'n'. In that DFA total number of states will be n (for n remainders) and number of edges should be equal to 'b' * 'n' — that is complete DFA: 'b' = number of symbols in language of DFA and 'n' = number of states.

Using above trick, below I have written a Python Script to Draw DFA for input base and number. In script, function divided_by_N populates DFA's transition rules in base * number steps. In each step-num, I convert num into number string num_s using function baseN(). To avoid processing each number string, I have used a temporary data-structure lookup_table. In each step, end-state for number string num_s is evaluated and stored in lookup_table to use in next step.

For transition graph of DFA, I have written a function draw_transition_graph using Pygraphviz library (very easy to use). To use this script you need to install graphviz. To add colorful edges in transition diagram, I randomly generates color codes for each symbol get_color_dict function.
#!/usr/bin/env python
import pygraphviz as pgv
from pprint import pprint
from random import choice as rchoice

def baseN(n, b, syms="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
    """ converts a number `n` into base `b` string """
    return ((n == 0) and syms[0]) or (
        baseN(n//b, b, syms).lstrip(syms[0]) + syms[n % b])

def divided_by_N(number, base):
    """
    constructs DFA that accepts given `base` number strings
    those are divisible by a given `number`
    """
    ACCEPTING_STATE = START_STATE = '0'
    SYMBOL_0 = '0'
    dfa = {
        str(from_state): {
            str(symbol): 'to_state' for symbol in range(base)
        }
        for from_state in range(number)
    }
    dfa[START_STATE][SYMBOL_0] = ACCEPTING_STATE
    # `lookup_table` keeps track: 'number string' -->[dfa]--> 'end_state'
    lookup_table = { SYMBOL_0: ACCEPTING_STATE }.setdefault
    for num in range(number * base):
        end_state = str(num % number)
        num_s = baseN(num, base)
        before_end_state = lookup_table(num_s[:-1], START_STATE)
        dfa[before_end_state][num_s[-1]] = end_state
        lookup_table(num_s, end_state)
    return dfa

def symcolrhexcodes(symbols):
    """
    returns dict of color codes mapped with alphabets symbol in symbols
    """
    return {
        symbol: '#'+''.join([
            rchoice("8A6C2B590D1F4E37") for _ in "FFFFFF"
        ])
        for symbol in symbols
    }

def draw_transition_graph(dfa, filename="filename"):
    ACCEPTING_STATE = START_STATE = '0'
    colors = symcolrhexcodes(dfa[START_STATE].keys())
    # draw transition graph
    tg = pgv.AGraph(strict=False, directed=True, decorate=True)
    for from_state in dfa:
        for symbol, to_state in dfa[from_state].iteritems():
            tg.add_edge("Q%s"%from_state, "Q%s"%to_state,
                        label=symbol, color=colors[symbol],
                        fontcolor=colors[symbol])

    # add intial edge from an invisible node!
    tg.add_node('null', shape='plaintext', label='start')
    tg.add_edge('null', "Q%s"%START_STATE,)

    # make end acception state as 'doublecircle'
    tg.get_node("Q%s"%ACCEPTING_STATE).attr['shape'] = 'doublecircle'
    tg.draw(filename, prog='circo')
    tg.close()

def print_transition_table(dfa):
    print("DFA accepting number string in base '%(base)s' "
            "those are divisible by '%(number)s':" % {
                'base': len(dfa['0']),
                'number': len(dfa),})
    pprint(dfa)

if __name__ == "__main__":
    number = input ("Enter NUMBER: ")
    base = input ("Enter BASE of number system: ")
    dfa = divided_by_N(number, base)

    print_transition_table(dfa)
    draw_transition_graph(dfa)
执行:
~/study/divide-5/script$ python script.py 
Enter NUMBER: 5
Enter BASE of number system: 4
DFA accepting number string in base '4' those are divisible by '5':
{'0': {'0': '0', '1': '1', '2': '2', '3': '3'},
 '1': {'0': '4', '1': '0', '2': '1', '3': '2'},
 '2': {'0': '3', '1': '4', '2': '0', '3': '1'},
 '3': {'0': '2', '1': '3', '2': '4', '3': '0'},
 '4': {'0': '1', '1': '2', '2': '3', '3': '4'}}
~/study/divide-5/script$ ls
script.py filename.png
~/study/divide-5/script$ display filename
输出:

DFA accepting number strings in base 4 those are divisible by 5

同样，输入base=4和number=7生成-dfa accepting number string in base '4' those are divisible by '7'
顺便说一句，尝试将 filename 更改为 .png 或 .jpeg 。

<子> 引用我用来编写此脚本的内容:
➊ "convert integer to a string in a given numeric base in python" 中的函数 baseN
➋ 安装“pygraphviz”:"Python does not see pygraphviz"
➌ 学习 Pygraphviz 的使用:"Python-FSM"
➍ 为每个语言符号生成随机的十六进制颜色代码:"How would I make a random hexdigit code generator using .join and for loops?"