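scala - What is so infeasible about a statically typed actor model for inter-process communication?

Tags: scala erlang akka hindley-milner actor-model

I have only recently used akka beyond toy scale, and I could not help noticing that it shares dynamic typing with OTP, despite scala's general preference for static typing. I started digging around and found this Wadler paper, which describes an HM-based type system for erlang inter-process communication. That said, an answer to this question from SO notes that Wadler and Marlow failed to deliver on their sketch of types for process communication. For reference, this is the kind of code that disappoints me: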

def receive = {
  case "test" => log.info("received test")
  case _      => log.info("received unknown message")
}

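I understand that in practice dialyzer can provide many of the benefits of a real type system, but why is it so difficult to create a statically verified actor system? Is it just that we tend to write Future/Observable/Iteratee libraries, or Channel-modeled IO, instead of Actor systems when we have a type system at hand, or is there a technical difficulty that Wadler and Marlow missed?

Best answer

Bringing type safety to the world of Akka actors has been discussed and researched for years. The current manifestation of these ongoing efforts is the Akka Typed API, which may be subject to change.

Besides the linked documentation, a great discussion on the Akka User List from a few years ago (titled "how to reconcile untyped actors with typed programming?") provides further insight into typed actors. Read the entire discussion here for the full context; a few excerpts follow: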


From Derek Wyatt:

What you're experiencing is a trade-off. Actors provide a trade-off that you don't seem to be taking into account; endpoints (Actors) are untyped and the messages that they handle are strongly typed.

You can't have an Actor be able to process "anything" with a type-specific receive method. With Actor programming, I should be able to add as many intermediaries in the message flow as I like and not disturb the two endpoints. The intermediaries should equally be ignorant of what's happening (load balancers, routers, loggers, cachers, mediators, scatter-gather, and so forth). You should also be able to move them around a cluster without disturbing the endpoints. You can also set up dynamic delegation in an Actor without it having to really understand what's going on - for example, an Actor that speaks the "main dialect" but delegates to something else when it doesn't understand what's being said.

If you want to eliminate all of those features, then you will be able to get the type-safe determinism you're looking for (so long as you stay in the same JVM - crossing JVMs will incur a "what the hell am I really talking to?" question that eliminates a compile time assurance)....

In short, you're giving up type safety in order to open the door to a whole new set of facilities. Don't want to lose the type-safety? Close the door :)
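To make Derek's trade-off concrete, here is a minimal plain-Scala sketch (not from the thread, and not Akka code). It assumes a hypothetical model in which an actor reference is just a function; the names UntypedRef, TypedRef, typedWorker, untypedWorker and logging are invented for illustration. The intermediary can wrap any endpoint precisely because it accepts Any, and the price is that an unexpected message is only detected at runtime.

```scala
object UntypedVsTyped {
  sealed trait Command
  final case class Greet(name: String) extends Command
  case object Stop extends Command

  // Hypothetical model: a reference is a function from message to a result string.
  type UntypedRef = Any => String     // accepts anything, checked at runtime
  type TypedRef   = Command => String // only Command values type-check

  val typedWorker: TypedRef = {
    case Greet(name) => s"hello, $name"
    case Stop        => "stopping"
  }

  // Adapting the typed worker to the untyped world moves the check to runtime.
  val untypedWorker: UntypedRef = {
    case c: Command => typedWorker(c)
    case other      => s"unhandled: $other"
  }

  // A generic intermediary (logger, router, ...) that never inspects the payload.
  def logging(target: UntypedRef,
              log: scala.collection.mutable.Buffer[String]): UntypedRef =
    msg => { log += s"forwarding $msg"; target(msg) }
}
```

With these definitions, typedWorker(42) is rejected by the compiler, while logging(untypedWorker, log)(42) compiles and falls through to the "unhandled" branch when it runs.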


From Endre Varga:

The issue is that type systems are designed for local and not distributed computations. Let's look at an example.

Imagine an actor that has three states, A, B and C

  • In state A it accepts messages of type X, and when received one, it transitions to B
  • In state B it accepts messages of type X and Y. When X is received, transitions to C, if Y, then stays in B
  • In state C it accepts messages of type Z

Now you send to an actor starting from state A a message X. Two things can happen:

  • X is delivered, so the possible accepted types are {X, Y}
  • X is lost, so the accepted type is {X}

The intersection of those is {X}.

Now imagine that you send another message X. Three things can happen:

  • both X's were delivered, so the accepted type is {Z}
  • only one of the X's were delivered, the other is lost, so the accepted types are {X, Y}
  • both X's were lost, the accepted type is {X}

The intersection of the above cases is the empty set.

So what should be the local type representation of an actor that you have sent two messages of type X?

Let's modify the example, and assume that there was no message loss, but let's take the viewpoint of another sender. This sender knows that two X's were sent to our example actor by the other sender. What messages can we send? There are three scenarios:

  • both X's sent by the other sender has arrived already, so the accepted type is {Z}
  • only the first X sent by the other sender has arrived yet, so the accepted types are {X, Y}
  • no X's has arrived yet, accepted type is {X}

The intersection of the above cases is the empty set.

As you see, without receiving a reply from an actor, the provable type of an actor is usually Nothing, or something useless. Only replies can convey the possible type of an actor, and even that cannot be guaranteed if there are concurrent senders.
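Endre's example can be replayed mechanically. The following is a small sketch (not from the thread) that encodes the three states and their accepted messages, then computes which messages a sender can rely on when each sent message may or may not have taken effect yet. The object and function names (AcceptedTypes, accepts, step, possibleStates, provablyAccepted) are assumptions made for this illustration, and messages a state does not accept are assumed to leave the state unchanged.

```scala
object AcceptedTypes {
  sealed trait State
  case object A extends State
  case object B extends State
  case object C extends State

  sealed trait Msg
  case object X extends Msg
  case object Y extends Msg
  case object Z extends Msg

  // Messages each state accepts, as listed in the example.
  def accepts(s: State): Set[Msg] = s match {
    case A => Set(X)
    case B => Set(X, Y)
    case C => Set(Z)
  }

  // Transition for a delivered message; unaccepted messages leave the state as is.
  def step(s: State, m: Msg): State = (s, m) match {
    case (A, X) => B
    case (B, X) => C
    case (B, Y) => B
    case _      => s
  }

  // Every state the actor may be in if each sent message is either
  // delivered or lost (equivalently: not yet arrived).
  def possibleStates(start: State, sent: List[Msg]): Set[State] =
    sent.foldLeft(Set(start)) { (states, m) =>
      states ++ states.map(step(_, m))
    }

  // What a sender can rely on: messages accepted in every possible state.
  def provablyAccepted(start: State, sent: List[Msg]): Set[Msg] =
    possibleStates(start, sent).map(accepts).reduce(_ intersect _)
}
```

Calling provablyAccepted(A, List(X)) yields the set containing only X, and provablyAccepted(A, List(X, X)) yields the empty set, matching the two intersections worked out in the quoted message.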


From Dr. Roland Kuhn:

I'm glad that you bring up this discussion, my desire to add some level of static typing to Akka is as old as my involvement with the project. If you look into the 1.x past you’ll find akka.actor.Channel[T] which was conceived with that in mind, and in 2.1 and 2.2 there were Typed Channels as a macro-based experiment. The latter actually crossed the line from thought experiment into code, and you are welcome to try it out to get a feeling for how static types interact with a very dynamic system.

The main shortcoming of Typed Channels was its inappropriate complexity (too many type parameters and too complex types—with type-level lists and maps—in them). We are gradually converging on a design which may strike the right balance, but in essence it means removing sender from Akka actors (which has also other very welcome benefits concerning closing over things in Future transformations). The gist of it is to parameterize ActorRef[T] with the type of message it accepts (with the obvious knock-on effects on Props[T], Actor[T] and so on). Then an Actor can expose references to itself with the appropriate type and that it sends to other actors—in specific messages in order to get around type erasure. This would even allow the formulation of message protocols, a.k.a. session types or at least close to it.

Derek made an excellent point about how the actor model really benefits from being unconstrained by types: a message router does not necessarily need to know anything about the messages passing through it. How well it works to parameterize the router itself remains to be seen, but in general such routing stages will destroy the type information, there is just not much we can do there. Your point that having some type-checking is better than none at all is one which resonates well with me, as long as the difference is really obvious to the developer: we must avoid a false sense of security.

This gets me to Endre's valid interjection that concurrent behavior is not accessible to static verification. The problem is much broader than message loss in that any nondeterministic action would have to result in a type disjunction, killing our nice static types through exponential explosion of the type structure. This means that we can only practically express using types those parts which are deterministic: if you send a message of type A to an actor, then you may get back a message of type B (which translates into having to supply an ActorRef[B] within the A message), where A and B typically are sum types like “all commands accepted by this actor” and “all replies which can possibly be sent”. It is impossible to model qualitative state changes of an actor because the compiler cannot know whether they will actually occur or not.

There is some light, though: if you receive message B, which includes an ActorRef[C] from the target, then you have evidence that the effect of message A has occurred, so you can assume that the actor is now in a state where it accepts message C. But this is not a guarantee, the actor might have crashed in the meantime.

Note how none of this depends on remote messaging. Your desire to split actors into a concurrency and a distribution part are very comprehensible, I used to think the same. Then I came to realize that concurrency and distribution are in fact the same thing: processes can only run concurrently if their execution is separated in space or time, which means being distributed, and on the other hand the finite speed of light implies that distributed processes will by definition be concurrent. We want encapsulation and compartmentalization for our actors, only communicating using messages, and this model means that two actors are always separated from each other, they are distributed even if they run on the same JVM (queues can run full, failures can occur, communication is not fully reliable—although its reliability is definitely a lot higher than in the network case). If you think about modern processors, the different cores and especially sockets are separated by networks as well, they are just a lot faster than your grand-dad’s gigabit ethernet.

This is precisely why I believe that the Actor model is exactly the right abstraction for modeling independent pieces in your applications now and in the future, since the hardware itself is going more and more distributed and actors capture just the essence of that. And as I argued above, I do see room for improvement on the static typing side of things.
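The design Roland describes (a reference parameterized by the message type it accepts, with reply addresses carried inside messages instead of an implicit sender) can be sketched without Akka. The code below is not from the thread and is not the Akka Typed API; Ref, Door, Open, Opened and Write are hypothetical names, and the toy actor runs synchronously purely to keep the example self-contained.

```scala
object TypedRefSketch {
  // Hypothetical stand-in for a typed actor reference (not Akka's ActorRef).
  // Contravariance lets a Ref[Any] be used wherever a Ref[Command] is expected.
  trait Ref[-T] { def tell(msg: T): Unit }

  // The reply address travels inside the command, replacing an implicit sender.
  sealed trait Command
  final case class Open(replyTo: Ref[Opened]) extends Command

  // The reply carries a reference for the next step: holding a Ref[Write]
  // is evidence that Open was processed.
  final case class Opened(session: Ref[Write])
  final case class Write(line: String)

  // A synchronous toy "actor"; real actors process messages asynchronously
  // and may crash between the reply and the next message.
  final class Door extends Ref[Command] {
    val written = scala.collection.mutable.ListBuffer.empty[String]

    private val session: Ref[Write] = new Ref[Write] {
      def tell(msg: Write): Unit = { written += msg.line; () }
    }

    def tell(msg: Command): Unit = msg match {
      case Open(replyTo) => replyTo.tell(Opened(session))
    }
  }
}
```

In this sketch a caller cannot send Write to a Door directly, since door.tell(Write("x")) does not type-check; it must first send Open and receive the Ref[Write] in the Opened reply. As the quoted message points out, that reference is evidence rather than a guarantee.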

Regarding "scala - What is so infeasible about a statically typed actor model for inter-process communication?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47380721/
