
Dp 110 fd #199

Closed
wants to merge 1 commit into from

Conversation

@qnikst (Contributor) commented Jun 16, 2015

No description provided.

@qnikst (Contributor, Author) commented Jun 16, 2015

Please rebase on top of master.

@mboes (Contributor) commented Jun 17, 2015

I'd like to see a proper performance analysis of whether the thread pool improves or impedes performance in any significant way before this goes in.

@qnikst (Contributor, Author) commented Jun 17, 2015

This issue is not about performance; it is about non-blocking operations, and the thread pool here guarantees ordering of operations. So I'm not sure what we can measure here.
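For context, the approach under discussion amounts to something like the following sketch (illustrative only, not the code in this PR; the names SendPool and enqueue are made up): one queue and one worker thread per key (here, per NodeId), so operations destined for one node are performed in order, and an operation that blocks on an unreachable node does not stall the caller.

import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (MVar, newMVar, modifyMVar)
import Control.Monad (forever)
import qualified Data.Map.Strict as Map

-- One Chan of queued IO actions per key; a dedicated thread drains each Chan.
newtype SendPool k = SendPool (MVar (Map.Map k (Chan (IO ()))))

newSendPool :: IO (SendPool k)
newSendPool = fmap SendPool (newMVar Map.empty)

-- Enqueue an action for the worker owning the given key (e.g. a NodeId).
-- The worker thread is created lazily on first use; actions queued for the
-- same key run sequentially, so their order is preserved.
enqueue :: Ord k => SendPool k -> k -> IO () -> IO ()
enqueue (SendPool ref) key act = do
  chan <- modifyMVar ref $ \m ->
    case Map.lookup key m of
      Just c  -> return (m, c)
      Nothing -> do
        c <- newChan
        _ <- forkIO $ forever $ readChan c >>= id  -- run queued actions in order
        return (Map.insert key c m, c)
  writeChan chan act

Note that the Chan in this sketch is an unbounded buffer; that property is exactly the concern raised later in the thread.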

@facundominguez (Contributor) commented

We could measure a node with 2 processes each sending many messages to a process in one of two other nodes. Isolate one of the target nodes and observe how fast the messages arrive at the other.

This patch should make it orders of magnitude faster.

qnikst added this to the distributed-process-0.6 milestone Jun 18, 2015
@qnikst (Contributor, Author) commented Jun 18, 2015

Related to #204

@qnikst (Contributor, Author) commented Jun 18, 2015

@mboes I've implemented a benchmark in the setting suggested by @facundominguez:

a main program that pings 2 nodes (one is hardcoded), and a reply service (code below).
I started them in 2 different containers; communication was disabled by docker rules.
The difference was an order of magnitude:

30s without the patch
and 3s with the patch.

I'm not sure whether I need to include this benchmark, as it requires many manual steps to run.

UPD. With -threaded I don't observe this difference. Possibly better benchmarks are needed.

-- Main program: floods a hardcoded, unreachable process with usend while
-- measuring a ping-pong with a reply service on a second local node.
import System.Environment
import Control.Monad
import Control.Concurrent
import Control.Distributed.Process
import Control.Distributed.Process.Internal.Types
import Control.Distributed.Process.Node
import Network.Transport (EndPointAddress(..))
import Network.Transport.TCP (createTransport, defaultTCPParameters)
import qualified Data.ByteString.Char8 as BS

num :: Int
num = 10000

main :: IO ()
main = do
  [h1, _h2, h3] <- getArgs
  [Right t1, Right t3] <- mapM (\h -> createTransport h "0" defaultTCPParameters) [h1, h3]
  [n1, n3] <- mapM (\t -> newLocalNode t initRemoteTable) [t1, t3]

  -- Pid of the remote reply service; the address is hardcoded and becomes
  -- unreachable once the docker rules cut the connection.
  let p1 = ProcessId (NodeId $ EndPointAddress $ BS.pack "172.17.42.1:9999:0") (LocalProcessId 0 10)
  print p1

  addr3 <- newEmptyMVar
  -- Local reply service: publish its pid and echo () back to every sender.
  _ <- forkIO $ runProcess n3 $
    getSelfPid >>= (liftIO . putMVar addr3) >> forever (expect >>= flip send ())

  -- Flood the unreachable process with unreliable sends.
  _ <- forkIO $ runProcess n1 $ do
    x <- getSelfPid
    forever $ usend p1 x

  -- Measured part: ping-pong num times with the local reply service.
  runProcess n1 $ do
    p <- liftIO $ takeMVar addr3
    x <- getSelfPid
    replicateM_ num $ do
      send p x
      expect :: Process ()

-- Reply service: runs in the second container, prints every sender and
-- echoes () back.
import System.Environment
import Control.Monad
import Control.Distributed.Process
import Control.Distributed.Process.Node
import Network.Transport.TCP (createTransport, defaultTCPParameters)

main :: IO ()
main = do
  [h] <- getArgs
  Right t <- createTransport h "9999" defaultTCPParameters
  n <- newLocalNode t initRemoteTable
  runProcess n $ do
    say . show =<< getSelfPid
    forever $ do
      p <- expect
      say $ show p
      send p ()

@qnikst (Contributor, Author) commented Jun 18, 2015

Another update: without the patch, the sends between local processes never complete (I couldn't wait long enough); with the patch everything works reliably.

import Control.Monad
import Control.Distributed.Process
import Control.Distributed.Process.Node
import Control.Concurrent
import Network.Transport.TCP (createTransport, defaultTCPParameters)
import Network.Transport (EndPointAddress(..))
import qualified Data.ByteString.Char8 as BS

num :: Int
num = 10000

main :: IO ()
main = do
  Right t <- createTransport "127.0.0.1" "0" defaultTCPParameters
  n <- newLocalNode t initRemoteTable

  -- Keep the node controller busy: repeatedly monitor an unreachable node,
  -- forcing connection attempts that block without the patch.
  _ <- forkIO $ runProcess n $ do
    forever $ do
      mref <- monitorNode $ NodeId $ EndPointAddress (BS.pack "198.164.0.4:8923:0")
      unmonitor mref

  -- Meanwhile, ping-pong between two local processes, printing a dot per
  -- round trip.
  runProcess n $ do
    self <- getSelfPid
    localPid <- spawnLocal $ forever $ send self () >> (expect :: Process ())
    replicateM_ num $ do
      () <- expect
      say "."
      send localPid ()

@facundominguez (Contributor) commented

The last benchmark looks like what I was expecting.

@mboes we had to change the benchmark because in my initial proposal (and in @qnikst's first attempt) the NC was not involved.

A pool of threads is used for network operations. One thread is used per NodeId.
@qnikst (Contributor, Author) commented Jun 22, 2015

PTAL, I have rebased the patch on top of the current master, so it can be merged automatically.

@mboes (Contributor) commented Jun 22, 2015

This issue is not about performance; it is about non-blocking operations, and the thread pool here guarantees ordering of operations. So I'm not sure what we can measure here.

Except that the introduction of yet another layer of synchronized communication between threads is something you have to pay for each send, not just the problematic ones when there are network problems. The "benchmarks" you wrote earlier after @facundominguez proposed them are not, in my book, benchmarks (case in point, you don't have meaningful relative performance comparisons) - they are tests exhibiting the problem that this patch proposes one way to fix.

A benchmark here would demonstrate that communication latencies are not adversely affected in a high speed network, even using say n-t-inmemory, under a variety of stressful scenarios (many processes each sending 1 message simultaneously to the same target, processes sending to multiple target nodes simultaneously, one process sending many messages etc).

In any case we've discussed this patch in the past and I'm not convinced that this is the right approach. Before investing significant resources into this and going full speed ahead, can we get @dcoutts and @edsko's feedback? Guys, the problem is this: current implementations of the network-transport API are not fully asynchronous, nor does the API documentation mandate that. Therefore @facundominguez is proposing to have distributed-process maintain per-endpoint threads to manage the send queue, so that the node controller is never affected by the lack of asynchronicity. But one could also argue that the problem here is that network-transport backends should be fully asynchronous when connecting and sending messages, so that upstream libraries like distributed-process don't have to paper over things in this way, and effectively introduce yet another (potentially unbounded) buffer in the communication path. What do you think?
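As a rough illustration (not part of this PR) of one of the stress scenarios mentioned above, a minimal harness for "many processes each sending 1 message simultaneously to the same target" could look like the following; the sender count and the timing via Data.Time are assumptions:

import Control.Monad (forM_, replicateM_)
import Control.Distributed.Process
import Control.Distributed.Process.Node
import Network.Transport.TCP (createTransport, defaultTCPParameters)
import Data.Time.Clock (getCurrentTime, diffUTCTime)

senders :: Int
senders = 1000

main :: IO ()
main = do
  Right t <- createTransport "127.0.0.1" "0" defaultTCPParameters
  node <- newLocalNode t initRemoteTable
  start <- getCurrentTime
  runProcess node $ do
    self <- getSelfPid
    -- Many processes, each sending a single message to the same target.
    forM_ [1 .. senders] $ \i -> spawnLocal $ send self (i :: Int)
    -- The target drains all of them.
    replicateM_ senders (expect :: Process Int)
  end <- getCurrentTime
  print (diffUTCTime end start)

Running this with and without the patch (and likewise for the other scenarios: multiple target nodes, one process sending many messages) would give the relative figures being asked for.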

mboes added the discuss label Jun 22, 2015
@qnikst (Contributor, Author) commented Jun 22, 2015

@mboes we have generic microbenchmarks in d-p, so we could run that code and check the difference; in my first benchmark I didn't see any measurable difference in the general case. So if running benchmarks is worth it before other discussions, I can gather the data.

@edsko (Member) commented Jun 22, 2015

I haven't looked at the detail yet, but I'm not very keen on introducing an unbounded buffer into the pipeline. We tried hard to avoid unbounded buffers.

@mboes (Contributor) commented Oct 19, 2015

@edsko @facundominguez ok then, based on @edsko's latest feedback, do we agree that this issue is best solved at the network-transport level, and this PR should be closed?

@facundominguez (Contributor) commented

We agree that we should solve this at the network-transport level.
The alternatives are either to merge this in the meantime or to live with d-p as is. Either way is fine with me.

@facundominguez (Contributor) commented

Either way is fine with me.

On second thought, I think merging in the meantime might be more useful to users of CH than keeping the code as is.

@mboes (Contributor) commented Oct 26, 2015

This is a fair amount of imprecisely quantified technical debt we'd be committing to master. I say imprecisely quantified because the benchmarks discussion above is still unresolved. Could documenting this infelicity in the design of d-p/n-t for the user instead be a good enough stopgap? There are ways for the user to mitigate (not remove, but mitigate) this problem in the meantime, in particular by setting aggressive timeouts in n-t-tcp.
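For reference, the mitigation mentioned above might look roughly like this; the TCPParameters fields shown (transportConnectTimeout, tcpUserTimeout) come from later network-transport-tcp releases and may not exist, or may differ, in the versions current at the time of this discussion:

import Network.Transport.TCP (createTransport, defaultTCPParameters, TCPParameters(..))

main :: IO ()
main = do
  let params = defaultTCPParameters
        { transportConnectTimeout = Just 1000000  -- give up on connect attempts after ~1s (microseconds)
        , tcpUserTimeout          = Just 1000     -- TCP_USER_TIMEOUT in ms, where the OS supports it
        }
  Right _t <- createTransport "127.0.0.1" "0" params
  return ()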

@dcoutts (Member) commented Feb 29, 2016

Yes, without having gone into the details too much it does sound like something for the NT level API. We do not want to adversely affect the fast path of sending because of occasional expensive slow operations like establishing new heavyweight connections.
