Tuesday January 18, 2011

[Time] Name Message
[00:30] traviscline i'm investigating getting the pyzmq polling integrated into gevent (cython wrapping of libevent) and wanted to ask if anyone had any 'to-reads' other than the eventlet implementation
[00:31] mikko i think there was someone here who was integrating with gevent
[00:32] mikko the process is very simple
[00:32] mikko i can walk you through it if you like?
[00:34] traviscline mikko: yeah if you have any input that'd be great
[00:35] mikko so, first you need zeromq 2.1.x
[00:35] traviscline rgr
[00:35] mikko what you need to do roughly is:
[00:35] mikko you create a zmq socket and connect/bind it
[00:35] mikko you call getsockopt ZMQ_FD on the socket to get a filehandle
[00:36] mikko then you add gevent io watcher on that filehandle
[00:36] mikko and your callback should be signaled when the socket is readable/writable
[00:36] traviscline that list posting discouraged me from taking that path
[00:36] traviscline but thanks, I'll give that a go
[00:37] mikko zeromq doesnt support io completion ports
[00:37] mikko that is true
[00:38] traviscline mikko: shit sorry, meant this
[00:39] mikko he is probably missing a fact that zeromq is edge-triggered
[00:39] mikko so when gevent signals that socket is readable you need to read until you get EAGAIN
[00:40] mikko so do nonblocking recv in a loop (in the callback) and break out from loop when you get EAGAIN from recv
[00:40] mikko edge-triggered means that the callback gets called on state changes, not necessarily on every new message
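The recipe mikko describes can be sketched as follows with pyzmq, using plain select() where gevent would register an io watcher (a rough illustrative sketch, not gevent-specific code; the PAIR sockets and the `drain` helper are made up for the demo):

```python
# Sketch of the ZMQ_FD + drain-until-EAGAIN pattern described above,
# using select() in place of a gevent io watcher (assumes pyzmq).
import select
import zmq

ctx = zmq.Context()
a = ctx.socket(zmq.PAIR)
b = ctx.socket(zmq.PAIR)
a.bind("inproc://fd-demo")
b.connect("inproc://fd-demo")

for i in range(3):
    a.send_string("msg-%d" % i)

# getsockopt(ZMQ_FD) yields an OS-level descriptor any event loop can
# watch. It is edge-triggered: it signals state changes, not every message.
fd = b.getsockopt(zmq.FD)
select.select([fd], [], [], 1.0)  # wait for the edge notification

def drain(sock):
    """Read until EAGAIN -- mandatory with an edge-triggered FD."""
    messages = []
    while True:
        try:
            messages.append(sock.recv(zmq.NOBLOCK))
        except zmq.Again:  # EAGAIN: the 0mq queue is drained
            break
    return messages

received = drain(b)

a.close()
b.close()
ctx.term()
```

A gevent integration would register `fd` with a read watcher and call something like `drain()` from the callback, exactly because the FD will not fire again for messages already queued.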
[00:43] traviscline *nods*
[00:45] traviscline mikko: !! thanks
[00:45] traviscline works
[00:48] mikko cool
[00:51] traviscline mikko: straight port as of now of eventlet's stuff
[00:51] traviscline but works
[00:51] traviscline on initial testing
[11:02] mikko sustrik_: im having slightly odd behavior with HWM
[11:03] mikko i am trying to word it
[11:34] Steve-o mikko: made advances on PGM on Windows
[11:34] Steve-o I have CMake working again, and can even make a GUI installer pretty easily
[11:34] mikko Steve-o_: nice!
[11:34] Steve-o its like a few lines of CMake to build a NSIS installer package
[11:35] mikko i wonder how windows install is best handled
[11:35] mikko currently we got mingw and MSVC builds
[11:35] mikko with mingw you should be able to reuse the autoconf builds
[11:35] mikko but not sure about MSVC
[11:35] Steve-o shared libraries and exes wont build for me, maybe because of x86/x64 difference with the compiler, with CMake that is
[11:36] Steve-o well I've gone for autoconf for unix and cmake for windows, and scons for development
[11:36] Evet is zeromq right tool to build a production http server?
[11:36] Steve-o Evet: have a look at Mongrel
[11:38] Steve-o mikko: I have cygwin on Windows but not tried mingw32, I usually cross compile with that instead
[11:38] Steve-o due to Python in Cygwin crashing on fork constantly
[11:38] Evet Steve-o_: thanks. it looks promising. but, do you have a suggestion for c?
[11:39] mikko mingw32 wont build
[11:39] mikko currently it's missing group_source_req in ws2tcpip.h
[11:39] mikko i debugged this a bit over the weekend
[11:39] Steve-o I have a lot of patches
[11:39] Steve-o two sets, one for mingw32 and one for mingw-w64
[11:39] mikko it seems that mingw64 includes these upstream (based on #mingw)
[11:39] mikko but it has not been synced to mingw32
[11:40] Steve-o not all of them though, at last check
[11:40] mikko which sounds slightly strange
[11:40] mikko Evet: what would you use zeromq for in http server?
[11:40] mikko Steve-o_: yeah, i stumbled onto those over the weekend
[11:41] mikko if you google 'struct group_source_req mingw' they are about the only results
[11:41] Steve-o you also need to force the WSACMSG header thingy too
[11:42] Steve-o cmsghdr or _WSACMSGHDR or wsacmsghdr depending on the compiler
[11:44] Steve-o I think that's how the Wine developers have noted my project, I've seen it mentioned a few times in their bug tracker
[11:45] Steve-o Evet: are you after a bespoke zmq-http forwarder (router/gateway)?
[11:45] Steve-o Evet: its certainly fast enough, but it all depends what your requirements are beyond or aside from Mongrel
[11:46] Evet Steve-o_: i need to embed to my c application
[11:47] Steve-o mikko: the mingw-w64 team still haven't released a version yet though have they? that's why I picked one random version and stuck with it
[11:47] Evet currently using libevent's http module. but zeromq looks cleaner
[11:48] Steve-o ok
[11:48] Steve-o I've been looking at using libevent or libcurl to implement a basic integration of HTTP and 0MQ
[11:49] Steve-o but you also have Boost which has a lot of scalability features already in for HTTP multi-core usage
[11:49] Steve-o is this a high or low load HTTP server though?
[11:51] Evet Steve-o_: it needs to handle high loads
[11:51] Evet Boost's ASIO module looks great, but i dont really know about c++'s oop thing
[11:52] Steve-o otherwise integrating with Nginx might be more likely
[11:54] mikko Evet: you can't really communicate with clients over zeromq
[11:54] mikko Evet: i've created libevent http based webserver but it uses zeromq for inter-thread communication
[11:55] Evet hmm
[11:56] Steve-o I wrote my own HTTP admin interface for PGM using async-io, but its definitely not multi-core IO scalable
[11:56] Steve-o they are two very large different domains
[11:57] Steve-o for basic C you can also use libsoup
[11:58] Steve-o but application and HTTP server don't really go together, you should use a dedicated http-zmq gateway and a zmq-based application server for the core logic
[12:03] mikko Steve-o_: i guess in which case you could just use mongrel2
[12:04] Evet in fact, a request-reply tcp server is sufficient for me
[12:04] Steve-o correct, but the question comes back to what is high load
[12:04] Steve-o thousands of requests per second?
[12:05] Evet 8k requests/second per cpu core
[12:06] Steve-o then typical design you would have basic edge gateways managing the HTTP requests and core application servers processing ZMQ messages at high speed
[12:08] Evet i can implement RFC rules
[12:08] Evet i have written some core modules for nginx, but its overcomplicated for a single application
[12:11] mikko does it have to be http?
[12:11] Steve-o in comparison Wikipedia is up to 90k requests per second for hundreds of servers (300+)
[12:12] mikko they run tons of squid servers iirc
[12:13] Evet mikko: no, i can handle http parsing
[12:17] Steve-o is this with an additional load balancer in front?
[12:19] Steve-o the point being in regular HTTP traffic 8k/s is quite high for even one machine
[12:20] Evet a quad-core desktop pc can handle ~30k non-keepalive requests per second
[12:22] Steve-o it all depends what you are serving though
[12:22] Steve-o which is why it's rather difficult to help you out
[12:22] Evet dynamic content through embedded caching
[12:23] Evet i have reached 90k req/sec with keepalive with an in-memory hashtable library
[12:24] Steve-o ok, so basically a higher protocol memcached? closer to amazon s3?
[12:25] Evet not really
[12:25] Steve-o :D
[12:25] Evet an embedded database library without ACID overhead
[12:26] mikko in-memory hashtable that has ACID ?
[12:26] Evet ofcourse not :)
[12:26] mikko memcached really is nothing more than a distributed hashtable
[12:26] Steve-o amazon simpledb then?
[12:26] mikko distributed in the very loose definition of the term
[12:27] Evet im going to use zeromq for brokerless replication
[12:28] Evet tokyo cabinet as embedded database library, which also able to append in-memory hashtable to disk
[12:28] mikko you should look into kyoto cabinet as well
[12:29] Evet have been using nginx, but its overcomplicated
[12:29] mikko i'm testing kyoto cabinet in current project
[12:29] Evet mikko: really? im testing kyoto cabinet for months too. nice to meet another kyoto* user
[12:30] Steve-o didn't Oracle release their replication transport recently
[12:30] mikko i remember tokyo cabinet hash database is O(log n) for retrieval?
[12:30] Evet mikko: nope. o(1)
[12:31] mikko Evet: but with hash database you have collisions
[12:31] mikko i don't see how you handle collision in O(1)
[12:32] Evet mikko: im generating uuid. but, is it what you asked?
[12:32] mikko no
[12:33] Evet could you rephrase then, im not good at english
[12:33] mikko if you don't know all the keys beforehand it's impossible to create perfect hash function
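mikko's point can be illustrated with a toy separate-chaining hashtable (purely illustrative, stdlib Python; no relation to tokyo/kyoto cabinet internals): without a perfect hash, colliding keys share a bucket, so lookup is O(1) only on average, degrading to a short linear scan inside the bucket.

```python
# Minimal separate-chaining hashtable: colliding keys land in the same
# bucket and are resolved by a linear scan -- O(1) average, not worst case.
class ChainedTable:
    def __init__(self, nbuckets=8):
        self.buckets = [[] for _ in range(nbuckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:          # overwrite existing key
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)

t = ChainedTable(nbuckets=1)  # degenerate case: every key collides
t.put("a", 1)
t.put("b", 2)
first, second = t.get("a"), t.get("b")  # both retrievable despite colliding
```

Generating uuid keys (as Evet suggests) spreads keys uniformly but does not remove the need for a collision strategy like the one above.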
[12:35] mikko Steve-o_: have you had issues with zmq_poll ?
[12:36] mikko im seeing weird behavior that i have reached HWM but the socket has revents ZMQ_POLLOUT
[12:36] Steve-o haven't used it yet
[12:37] mikko and i seem to be losing messages somewhere
[12:37] mikko need to debug further to see whether it's actually my software causing this
[12:37] mikko anyway, lunch time. bbl
[12:39] Evet hmm
[12:39] Evet Steve-o_: so in sum: is zeromq suitable for writing an asynchronous request-response tcp server?
[13:47] zchrish I have a realtime server connected and am sending packets at least once per second. But my subscriber is receiving them only every few seconds. I assume this is due to the Nagle algorithm; could this be the case?
[13:47] sustrik zchrish: nagle is turned off
[13:48] sustrik 0mq should definitely not behave that way
[13:48] sustrik do you have a minimal test case?
[13:48] zchrish I see. I am sure it is somewhere else then. Thank you.
[13:49] sustrik np
[13:49] zchrish Actually I am converting my server code over to zeromq and probably didn't activate my heartbeat.
[14:29] ptrb there should be no problem dynamically connecting and disconnecting an active ZMQ_SUB socket to various ZMQ_PUB sockets, right?
[14:30] ptrb hmm, except there is no disconnect :)
[14:58] CIA-21 zeromq2: Martin Sustrik master * r56bdba5 / (8 files):
[14:58] CIA-21 zeromq2: Fix cppcheck warnings: Prefer prefix ++/-- operators for non-primitive types.
[14:58] CIA-21 zeromq2: Signed-off-by: Martin Sustrik <> -
[14:59] mikko sustrik_: i'm seeing something weird
[14:59] sustrik yes?
[14:59] mikko sustrik_: effectively what i am trying to do is a device that stores messages if HWM is reached
[14:59] mikko PULL/PUSH sockets over tcp
[14:59] sustrik right
[15:00] mikko my hwm on the PUSH socket is set to 5 for testing
[15:00] mikko i connect producer to pull socket
[15:00] mikko send 100 messages and i can see five being within the zeromq buffer and 95 go to the persistent storage
[15:00] mikko now i connect a consumer that consumes five messages
[15:01] mikko well, i bind a consumer and the device connects to it
[15:01] mikko the consumer process exits after consuming five messages
[15:01] mikko so my assumption was that 5 messages should now go to socket, it would hit hwm and return EAGAIN
[15:02] mikko but what i see:
[15:02] mikko the out_socket is constantly signaling ZMQ_POLLOUT
[15:02] mikko and it keeps accepting messages
[15:02] sustrik that's because the messages are stored in TCP buffers at that moment
[15:02] mikko until my persistent store is empty and i turn off polling on the out socket
[15:03] sustrik so 0mq's queue is empty
[15:03] mikko then, i connect the consumer again
[15:03] mikko which is blocked on recv and no messages coming
[15:08] mikko is that the expected behavior?
[15:22] sustrik i think so
[15:22] sustrik the messages are stored in TCP buffers
[15:22] sustrik thus 0MQ buffers are empty
[15:23] sustrik you can set the size of TCP buffers using ZMQ_SNDBUF/ZMQ_RCVBUF
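A minimal sketch of the two knobs being discussed, assuming pyzmq against a modern libzmq (0MQ 2.x had a single ZMQ_HWM option; newer versions split it into ZMQ_SNDHWM/ZMQ_RCVHWM):

```python
# Setting the 0mq queue limit and the kernel TCP buffer hint (pyzmq).
import zmq

ctx = zmq.Context()
push = ctx.socket(zmq.PUSH)

push.setsockopt(zmq.SNDHWM, 5)     # 0mq-level queue limit, in messages
push.setsockopt(zmq.SNDBUF, 4096)  # kernel TCP buffer *hint*, in bytes

# getsockopt returns the values 0mq stored; as noted above, the OS
# treats SO_SNDBUF only as a hint and may not honor it exactly.
hwm = push.getsockopt(zmq.SNDHWM)
sndbuf = push.getsockopt(zmq.SNDBUF)

push.close()
ctx.term()
```

This is why mikko's SNDBUF=10 experiment changed little: the effective in-flight capacity is the HWM plus whatever the kernel actually allocates for the TCP buffers, not the requested byte count.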
[15:25] mikko i'll give that a go
[15:38] mikko sustrik_: i set the SNDBUF to 10 on the PUSH socket and still seeing the same behavior
[15:38] mikko it could be just something silly im doing as well
[15:39] sustrik i think the OS is not guaranteed to limit the TCP buffer to the value you supply
[15:39] sustrik it's more of a hint
[15:39] mikko DataStore is the implementation of persistent storage
[15:39] mikko hmm
[15:39] sustrik 10-byte TCP buffer seems strange
[15:39] mikko ill test with larger messages in a min
[15:40] mikko hmm
[15:41] mikko with 10KB messages i still lose some messages
[15:41] mikko but not as many
[15:41] sustrik lose?
[15:41] mikko consume 5 messages and lose 10 - 20 messages in between
[15:41] mikko the consumer never receives them
[15:42] mikko i got sequence in each message
[15:42] sustrik that looks like a bug
[15:42] mikko when i consume the first five i get 0 - 4
[15:42] mikko next time i might get 23 onwards
[15:43] sustrik is there 1 connection involved?
[15:43] mikko yes
[15:43] sustrik or 2 of them?
[15:43] mikko 1
[15:43] sustrik hm
[15:43] mikko odd thing:
[15:43] sustrik do you restart either peer?
[15:43] mikko yes
[15:43] mikko the consumer is a script
[15:43] mikko it consumes 5 and exits
[15:44] sustrik then there's another connection created?
[15:45] mikko yes
[15:45] mikko consume 5 at a time
[15:45] sustrik i see
[15:45] sustrik the messages are presumably dispatched to the old connection
[15:45] sustrik and are dropped when the application exits
[15:46] sustrik thus you see gaps in the sequence
[15:47] mikko is there any merit in XPUSH/XPULL sockets where the communication is two way, a bit like XPUB/XSUB forwarding but rather for a small ACK that the message has been received
[15:48] mikko i think a script that consumes five messages or so would not be a unique use-case for scripting languages
[15:48] mikko as the processes are often short-lived
[15:53] mikko and with load-balancing it's very hard to rely on seq
[15:56] guido_g or retrieve and re-dispatch the messages when a new connection is established
[15:57] mikko are the messages currently dropped after zeromq?
[16:00] guido_g as far as i understood what sustrik_ said, they're dropped when the connection closes
[16:02] mikko i understood that they are already in the network buffer (out of reach for zeromq)
[16:03] guido_g yes, this is the main part of the problem
[16:04] guido_g at least for small messages
[16:06] guido_g it seems the meta-pattern for ømq is that you always need at least one other socket to manage the one you care about
[16:07] mikko hmm
[16:08] sustrik to get precise hwm, a duplicate ack mechanism can be implemented on 0mq level
[16:09] mikko i see some merit to that
[16:09] sustrik thus, TCP would ack packets, whereas 0mq would ack messages
[16:10] mikko sustrik_: im trying to create a device which would allow replaying streams
[16:11] sustrik what does that mean exactly?
[16:11] mikko well, guys i know are looking at kafka and it provides a mechanism to replay N minutes of stream
[16:11] mikko so a persistent storage is involved there
[16:12] mikko what i was planning is to store messages and push them out to normal push socket and have separate XREP socket where you can ask for "deltas"
[16:12] mikko so if a consumer needs N amount of data to be productive you could request last 1000 messages or so
[16:12] mikko and then start consuming the pull feed
[16:13] sustrik how does that work with PUSH socket?
[16:13] sustrik shouldn't it be PUB?
[16:13] guido_g something like ZMQ_RECOVERY_IVL?
[16:13] mikko it could be PUB as well but with PUB i have no information when HWM is reached
[16:14] mikko i started by creating a device which writes to store when hwm is reached
[16:14] guido_g hwm is per connected pull, right?
[16:14] mikko yes
[16:14] guido_g so there is no overall hwm on the push side
[16:14] mikko no
[16:15] guido_g then i cant figure out why you need hwm here
[16:15] mikko guido_g: the behavior for PUB is to drop messages when there are no consumers
[16:16] guido_g ack
[16:16] Evet mikko: are you going to use kyoto cabinet as cache server?
[16:16] mikko guido_g: by writing to a PUB socket i don't really know if consumer has got it
[16:16] guido_g mikko: i know
[16:16] guido_g but you need a per client sequence management anyway
[16:17] mikko guido_g: i dont need all clients receiving all messages
[16:17] sustrik but what you want to do is to distribute the messages to *all* consumers, not load balanace them among consumers, right?
[16:17] mikko guido_g: im not maybe explaining this well
[16:17] guido_g mikko: and don't follow you well :)
[16:17] sustrik maybe explain the use case
[16:18] mikko so, what i am mixing up here is the end-system and what i have now. in the end-system i will have two kinds of consumers
[16:18] mikko consumer A is consuming from PUSH socket and receives every Nth message based on load-balancing
[16:19] mikko and consumer B which might want messages from last 20 minutes
[16:19] mikko the B would be XREP/XREQ i guess
[16:19] guido_g right
[16:19] sustrik so B only wants a log of message
[16:20] sustrik some of those are already processed etc.
[16:20] guido_g and the A type is not allowed to lose messages if one client crashes
[16:20] mikko guido_g: yes
[16:20] mikko sustrik_: yes
[16:20] guido_g ah ok
[16:20] mikko i need to be sure that each message is processed at least once
[16:20] mikko and some _might_ be processed multiple times
[16:20] sustrik mikko: there's no way to solve that
[16:21] guido_g except w/ a control socket per client
[16:21] sustrik it's the classic "guaranteed delivery" problem
[16:21] sustrik when failure happens there are always some messages in "dubious" state
[16:22] mikko sustrik_: i am fine with that
[16:22] mikko sustrik_: but in the current situation i lose messages in "normal" operation
[16:22] sustrik i see
[16:22] mikko a script connecting, consuming 5 and exiting
[16:22] mikko another script connecting, consuming 10 and exiting
[16:22] mikko i lose large amount of messages there
[16:23] mikko by lose i mean i have no visibility where they have gone
[16:23] mikko from my device point of view i have sent them and from consumer point of view nothing has been sent
[16:24] sustrik right, the only solution is to implement acks at 0mq level
[16:24] mikko if the consumer sent back "got it in 0mq, thanks" it would be enough for normal operation
[16:24] mikko yes
[16:24] mikko i don't think this is required for all use-cases but certainly it seems useful for scripting languages
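The application-level ack idea could be sketched like this (a toy, assuming pyzmq; the second socket pair carrying acks and the sequence-number format are invented for illustration, not an existing 0mq feature):

```python
# Toy sketch of application-level acks: data flows over one PUSH/PULL
# pair, acks flow back over a second one, and the producer keeps a set
# of unacked sequence numbers it could replay after a consumer crash.
import zmq

ctx = zmq.Context()
data_out = ctx.socket(zmq.PUSH)
data_in = ctx.socket(zmq.PULL)
ack_out = ctx.socket(zmq.PUSH)
ack_in = ctx.socket(zmq.PULL)
data_out.bind("inproc://data")
data_in.connect("inproc://data")
ack_in.bind("inproc://acks")
ack_out.connect("inproc://acks")

# producer side: send, remembering what is not yet acknowledged
unacked = set()
for seq in range(5):
    data_out.send_string(str(seq))
    unacked.add(seq)

# consumer side: process each message, then ack its sequence number
for _ in range(5):
    seq = data_in.recv_string()
    ack_out.send_string(seq)

# producer side: trim the unacked set as acks arrive; anything still
# here after a failure would be re-sent (possibly delivering duplicates)
for _ in range(5):
    unacked.discard(int(ack_in.recv_string()))

for s in (data_out, data_in, ack_out, ack_in):
    s.close()
ctx.term()
```

As the discussion below notes, rescheduling whatever remains in `unacked` trades ordering (and possible duplicate delivery) for not silently losing messages.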
[16:25] stimpie I am running an experiment where several threads send messages to one other thread using tcp. All sending threads have setHWM(2). When I suspend the receiving thread the others keep sending.
[16:25] guido_g also for the classic butterfly pattern
[16:26] stimpie If I resume the receiving thread after more then 100 sent messages it receives them all.
[16:26] sustrik mikko, guido_g: yes, it's specific to push/pull pattern
[16:26] stimpie I was expecting only 2 messages would be queued
[16:26] sustrik stimpie: you have to use latest version of 0mq from github and set HWM on both sending and receiving side
[16:27] sustrik mikko, guido_g: i see two options here
[16:27] sustrik 1. standard acks
[16:27] guido_g i'm atm writing something for this in python, a thing that acts as start and endpoint of a butterfly-like system
[16:28] sustrik the obvious problem with acks is that if the peer exits without acking all the messages dispatched to it, those have to be rescheduled
[16:28] sustrik to another peer
[16:28] sustrik thus ordering is not preserved
[16:28] guido_g right, so one needs to keep a backlog
[16:29] guido_g right
[16:29] sustrik for example, peer may get messages 1,2,3,7,8,9,4,5,6
[16:29] sustrik another option would be implementing an explicit shutdown handshake
[16:29] guido_g but you can't preserve order on a push -> multiple pull system anyway
[16:30] sustrik client says "i am about to exit"
[16:30] guido_g ahh like shutdown(1)
[16:30] sustrik then it consumes all remaining messages
[16:30] sustrik then it exits
[16:30] sustrik there's no re-ordering problem there
[16:30] mikko so like linger for incoming
[16:30] guido_g nice, but wouldn't help much in case if failure
[16:30] guido_g *of failure
[16:30] sustrik but, otoh, the shutdown sequence may hang up
[16:31] sustrik yes, same as linger, just in opposite direction
[16:31] sustrik yes, it would work only for orderly shutdown
[16:31] sustrik however, reliable delivery in case of failure is a myth
[16:32] sustrik the problem can be mitigated, but never solved
[16:33] guido_g right
[16:34] guido_g but the mitigation would/does help a lot in most cases
[16:34] mikko i am inclined to say that ACK makes it a bit more resilient. maybe we can document that the trade-off is that delivery order will not be guaranteed
[16:35] sustrik yes, possibly
[16:35] mikko also there is a throughput trade-off as well
[16:35] mikko but there always is
[16:35] sustrik actually, with butterfly pattern there's no overall ordering guaranteed anyway
[16:36] guido_g as i said
[16:36] sustrik if you have at least 2 workers
[16:36] guido_g hehe
[16:36] sustrik they can process messages at different speeds
[16:36] sustrik and thus mix the stream
[16:36] sustrik when it gets joined in the next step
[16:37] sustrik mikko: i don't think there's much of throughput impact
[16:37] sustrik the acks can be sent opportunistically
[16:37] sustrik just once in a while
[16:37] sustrik thus having close to zero performance impact
[16:40] mikko hmm, that sounds pretty good
[16:41] mikko is the same infrastructure that is used in subscription forwarding suitable for this?
[16:41] sustrik the nice thing is that at most 1 message would be lost even in the case of failure
[16:42] sustrik the one that was being processed at the moment
[16:42] sustrik mikko: not really
[16:42] sustrik different functionality is needed
[16:43] sustrik the socket would have to keep list of sent but unacked messages
[16:43] sustrik trim it when ack is received
[16:43] sustrik and resend the messages in case of connection failure
[16:44] sustrik there are some strange corner cases involved
[16:44] sustrik say, what if connection fails and there's no other connection to resend the messages?
[16:45] mikko buffer the messages and honor HWM?
[16:45] sustrik dunno
[16:45] sustrik i'm just thinking aloud
[16:46] mikko there also might be duplicate delivery
[16:46] mikko i guess
[16:46] sustrik yes
[16:46] mikko unless you ACK the ACK
[16:46] mikko then you need to ACK the ACK ACK with ACK
[16:47] sustrik as i said
[16:47] sustrik it's an unsolvable problem
[16:47] sustrik you can mitigate it by adding more acks
[16:47] mikko but if we optimize for normal operation
[16:47] sustrik and ackacks etc.
[16:48] mikko i think ACK is good enough for majority of the cases
[16:48] sustrik yes
[16:50] traviscline if there are any geventers:
[16:51] traviscline mikko: thanks again for the input, going to get a little perf bench set up and cythonify it
[18:30] lechon hello, is anyone having problems with the lua binding?
[18:31] lechon i just installed a fresh zeromq, lua, and lua-zmq bindings:
[18:32] lechon zmq.DOWNSTREAM is nil?
[18:35] guido_g ouch
[18:35] guido_g DOWNSTREAM is old
[18:35] guido_g very old
[18:35] guido_g so i guess the bindings are not up to date
[18:36] lechon ohh
[18:36] lechon i see PUSH and PULL work
[18:37] lechon whoops
[18:37] guido_g case closed :)
[19:55] ngerakines hey folks
[19:55] ngerakines I've got a question about connection timeouts
[19:55] ngerakines anyone around?
[19:59] traviscline ngerakines: general irc etiquette is to just ask, don't ask to ask
[19:59] ngerakines fair enough
[20:23] cremes ngerakines: so what's your question?
[20:28] lechon is any kind of unreliable transport like udp supported?
[20:29] cremes lechon: not right now but new transports can be added
[20:29] cremes search the mailing list for earlier discussions
[20:29] cremes i think the main devs want to clean up that api a bit to make this easier; having someone actually work on adding UDP
[20:30] cremes would be a great exercise for doing that cleanup
[20:30] mikko udp is slightly problematic for certain semantics
[20:31] lechon is there some kind of state overhead that needs to be reliable?
[20:32] mikko lechon: well, for example it's guaranteed that you will only receive full messages
[20:32] mikko sending larger messages over udp means that all packets must arrive
[20:32] mikko if you lose a packet in the middle you need retransmission
[20:33] lechon handling it the naive way would just be inefficient
[20:33] lechon yet probably acceptable for applications electing to use udp (where perhaps a single dropped packet renders the entire message useless)
[20:36] mikko there hasn't been that much talk about UDP to be fair
[20:36] mikko i remember there has been discussion about SCTP
[20:36] mikko and a few others
[20:38] mikko i remember UDT being mentioned at some point
[20:39] lechon i was able to find these two on the mailing list archive:
[20:40] mikko yes, you can use openpgm over udp
[20:40] mikko but that's not strictly UDP semantics
[20:41] lechon i'm trying to stream over the internet and can cope with lost packets so unreliable would be preferable
[20:53] ngerakines Sorry, got pulled into a meeting
[20:54] ngerakines Is there any further documentation on handling connection timeouts?
[20:55] mikko ngerakines: no, not really
[20:55] ngerakines bummer
[20:55] mikko ngerakines: what sort of situation?
[20:55] mikko ngerakines: in most cases you shouldn't care about connection timeouts etc
[20:55] mikko as zeromq takes care of reconnecting under the hood
[20:56] ngerakines I've got a small client executable that creates a request to a daemon that may or may not be up, but when it isn't, the app hangs until a connection can be established
[20:56] mikko you can use zmq_poll
[20:56] ngerakines with my understanding of things, the next thing to do would be use a while loop and zmq_poll ... yeah
[20:57] mikko or you can do non blocking send
[20:57] ngerakines reference?
[20:57] mikko depends on what you want to do if the daemon is not up
[20:57] mikko wait or just forget about it
[20:59] ngerakines just forget about it is preferable
[21:00] mikko ok, then you can use a non-blocking send
[21:00] mikko which language are you using?
[21:00] ngerakines this executable is called hundreds of thousands to millions of times a day
[21:00] ngerakines c++
[21:00] ngerakines so when it hangs, it can cause system/resource issues
[21:00] mikko pass ZMQ_NOBLOCK as second arg to ->send ()
[21:00] ngerakines ok
[21:00] ngerakines thanks much!
[21:00] mikko it will return false if the message was not sent
[21:00] mikko and errno will be set to EAGAIN
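mikko's suggestion, sketched in Python with pyzmq rather than the questioner's C++ binding (same ZMQ_NOBLOCK semantics; pyzmq surfaces EAGAIN as a zmq.Again exception). Note the sketch uses a socket with no peer at all to force the immediate failure; once connect() has created a pipe, libzmq will happily queue messages up to the HWM even while the daemon is down.

```python
# Non-blocking send sketch: a PUSH socket with no connected peer cannot
# queue the message, so send(..., zmq.NOBLOCK) raises zmq.Again
# (errno EAGAIN) instead of hanging.
import zmq

ctx = zmq.Context()
push = ctx.socket(zmq.PUSH)       # deliberately not connected anywhere
push.setsockopt(zmq.LINGER, 0)    # don't block on shutdown either

try:
    push.send(b"hello", zmq.NOBLOCK)
    sent = True
except zmq.Again:                 # the "just forget about it" path
    sent = False

push.close()
ctx.term()
```

In the C++ binding the equivalent is checking the return value of `socket.send(msg, ZMQ_NOBLOCK)` and inspecting errno for EAGAIN, as mikko describes.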
[21:01] ngerakines great, i'll readup on it as well
[22:03] lechon mikko, from browsing the source i get the feeling that udp versions of tcp_connecter and tcp_listener would need to be written
[22:09] mikko and possibly tcp_socket
[22:09] mikko not sure if that is identical
[22:10] mikko the problematic thing with udp is that you might connect to let's say 10 endpoints
[22:10] mikko and if 8 of them go down how do you know about it?
[22:11] mikko somehow i think in the context of zeromq tcp makes a lot more sense
[22:11] lechon hmm
[22:11] mikko like for example PUSH socket will load-balance between connections
[22:12] mikko and stops dispatching messages to peers that fail
[22:12] mikko with udp this semantic doesn't really work
[22:12] mikko unless you do explicit ACKs from the consumers
[22:15] lechon i see what you mean
[22:15] cremes i think for udp you would just say this is transport-specific behavior
[22:15] cremes delivery is best effort
[22:16] cremes if you start adding acks, you are duplicating tcp (and probably poorly)
[22:16] mikko i don't see whether there is that much merit to udp in message oriented communications
[22:16] lechon yeah. when you elect to use udp you probably have some out-of-channel way to determine when to stop sending to a particular host
[22:16] lechon its not zeromq's concern
[22:16] cremes right
[22:17] cremes as for keeping message delivery atomic, i think that would be transport-specific too
[22:17] cremes if a message part gets lost in delivery, drop the whole message
[22:17] lechon yep
[22:18] cremes udp packets can be up to 64k in length, right?
[22:19] lechon yes
[22:19] lechon including header
[22:19] cremes perhaps the udp transport could just coalesce all parts into one packet before sending, kind of like the nagle algo
[22:20] lechon what do you mean "all parts"?
[22:20] cremes and then set a max of 64k (minus headers) for messages using that transport
[22:20] cremes are you aware of the RCV_MORE and SND_MORE flags?
[22:20] mikko cremes: assuming the total size is < 64k
[22:20] lechon no :/
[22:20] cremes mikko: yes
[22:21] cremes lechon: check those out; they let you logically split up a message into parts for 0mq to deliver as an atomic whole
[22:21] cremes i.e. message-oriented streaming
[22:21] mikko the thing i am wondering is whether you actually need zeromq for udp? most of the functionality will be specifc to udp
[22:22] cremes 0mq still provides some neat abstractions for it
[22:22] mikko effectively you just need to frame the messages and you are about at the same point as you are with zeromq + udp
[22:22] lechon mikko, the zeromq interface is nice :]
[22:22] cremes though it's less useful for udp than other protocols
[22:24] lechon yes, coalescing all of the parts of those split messages would make sense
[22:26] lechon the receiving end would need to drop the entire message if all of the parts couldn't fit into one udp packet and one got lost
[22:28] lechon when sending multipart messages like that does 0mq have a protocol for communicating the number of parts the receiver should expect?
[22:30] mikko looks like the udp transport earlier was epgm
[22:30] mikko looking at changelog
[22:30] mikko so there never was a real UDP transport
[22:33] lechon looks like it is used in resolve_nic_name for solaris/aix/hpux :P
[22:44] lechon or is the number of parts usually encoded in the tcp packets?
[22:46] mikko lechon: sorry?
[22:49] lechon i was referring to my previous question about how multipart messages are delivered with the existing transports
[22:50] traviscline ngerakines: general irc etiquette is to just ask, don't ask to ask
[22:50] traviscline ngerakines: hey sorry, accidentally hit up-enter
[22:50] ngerakines np
[22:53] mikko lechon: the number of parts doesn't need to be known beforehand
[22:53] mikko on the sender side the ZMQ_SNDMORE flag indicates that more parts of the message will follow
[22:54] mikko a send without ZMQ_SNDMORE after one or more ZMQ_SNDMORE sends terminates the multipart message
[22:54] mikko on the receiver's side there is ZMQ_RCVMORE, which indicates whether more parts are coming
[22:56] lechon it is up to the user to make sure that the receiver calls RCVMORE the same number of times that the sender calls SNDMORE?
[22:56] mikko well, if the user wants to be aware of multipart messages
[22:56] mikko you could just receive a message at a time without having to acknowledge that it's actually one multipart message
[22:57] lechon that should work fine with udp and coalesced packets
[22:58] cremes lechon: yes, you loop on receiving msg parts until getsockopt(RCVMORE) becomes false
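The SNDMORE/RCVMORE mechanics described above, sketched with pyzmq (the PAIR sockets over inproc and the part names are just for the demo): the part count is never transmitted; the receiver simply loops while the RCVMORE option stays true.

```python
# Multipart send/receive: no part count on the wire, the receiver
# loops until getsockopt(ZMQ_RCVMORE) reports no more parts.
import zmq

ctx = zmq.Context()
a = ctx.socket(zmq.PAIR)
b = ctx.socket(zmq.PAIR)
a.bind("inproc://parts")
b.connect("inproc://parts")

a.send(b"header", zmq.SNDMORE)
a.send(b"body", zmq.SNDMORE)
a.send(b"trailer")                 # no SNDMORE: terminates the message

parts = [b.recv()]
while b.getsockopt(zmq.RCVMORE):   # more parts of the same message?
    parts.append(b.recv())

a.close()
b.close()
ctx.term()
```

All parts are delivered atomically or not at all, which is the property the udp discussion above wants to preserve by coalescing the parts into a single packet.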
[22:59] mikko lechon: how do you maintain low latency in cases where you dont have constant throughput?
[22:59] mikko lets say user pushes 2K to the buffer. would there be a timer that waits for more?
[23:00] mikko well, a timer that triggers after certain period if no more messages come?
[23:00] lechon hmm, thats an interesting problem
[23:00] lechon it would have to be a user defined timer, and that might (?) be a messy interface change
[23:01] mikko probably not messy
[23:01] mikko sockopt that says the timeout
[23:01] mikko but any kind of timeout would probably introduce latency in environments where you are not constantly pushing messages
[23:02] lechon for asynchronous sends it would be kind of easy. messages could be coalesced on the queue.
[23:04] lechon in many cases the user would probably not want additional latency... udp was selected for a reason
[23:04] mikko all sends are asynchronous in a way when dealing with zeromq
[23:04] mikko you are pushing the message to io thread rather than actually dealing with a socket
[23:05] lechon so sock:send(msg) just queues msg up and returns immediately?
[23:05] mikko lechon: there are different socket options affecting the behavior
[23:05] mikko and there are different behaviors with different socket types
[23:05] mikko in some cases it might block
[23:06] lechon ok. sorry, i haven't really used 0mq yet
[23:06] mikko what is the use-case you are looking at udp for?
[23:07] mikko also, it might possibly be beneficial to look into UDT
[23:07] lechon given that, it might make sense to only attempt packet coalescing when in "async mode" and things can be grouped on the internal queue
[23:07] mikko as it seems to provide some semantics that are common for zeromq sockets
[23:07] lechon reliable, yuck :P
[23:08] lechon my use-case is streaming video over internet
[23:08] mikko a certain amount of reliability on transport layer makes sense with message oriented approach in my opinion
[23:08] mikko as you are not really dealing with packets or streams but rather with a concept of message
[23:11] lechon streams are just a sequence of messages, but i get your point
[23:11] mikko as in i assume if you stream video you are not actually dealing with messages but rather packets. and the next packet doesnt really depend on the previous packet
[23:12] mikko so in case of losing packet here or there the sound might jump a bit or so
[23:12] mikko but in case of a concept of message losing a packet means that your whole message is void
[23:12] mikko brb
[23:13] cremes i would only suggest msg part coalescing into 1 udp packet when someone is sending multipart messages
[23:14] cremes otherwise, just send them asap
[23:14] cremes i don't see why you would ever want a timer for sending udp
[23:15] lechon how do you know they are finished sending more?
[23:15] cremes lechon: how does who know? the receiver?
[23:16] cremes the sender knows because he doesn't pass the SNDMORE flag to send
[23:16] cremes the receiver knows where the multipart msg ends because getsockopt(RCVMORE) returns false
[23:17] cremes lechon: you should really read the docs; a lot of this will get cleared up once you understand 0mq a bit better
[23:17] lechon there might be a large delay between the last send(SNDMORE) and the following send()
[23:18] cremes in that case, the i/o thread should keep the message parts in the queue until the last part is sent
[23:18] cremes 0mq is a message queue after all!
[23:18] lechon heh right, but the timeout could be used to expedite that, so things dont hang round in the queue for too long
[23:19] cremes then that breaks my suggestion for all msg parts to be coalesced into a single packet for udp transport
[23:19] cremes it also makes it harder on the receiving application; now it has to deal with fragmented messages
[23:20] cremes recall that udp doesn't guarantee order, so parts could also show up in a random order
[23:20] cremes i think forcing msg part coalescing to one packet for udp is a pretty good idea otherwise you break a lot of 0mq guarantees that other transports get
[23:21] cremes in that case, no timer is desired
[23:21] lechon even if you try to coalesce all parts into a single packet, you might not be able to... the sum of all the packets may be > 64k
[23:21] lechon all the parts*
[23:22] cremes true; so the udp transport would need to specify that 64k (minus headers) is the max message size *for that transport*
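The coalescing idea under discussion could be sketched like this (pure Python; the 4-byte length prefix, the function names, and the size cap are assumptions of this sketch, not anything 0mq actually implements):

```python
# Max UDP payload over IPv4: 65535 minus 8-byte UDP and 20-byte IP headers.
MAX_DATAGRAM = 65507

def coalesce_parts(parts, max_size=MAX_DATAGRAM):
    """Pack every part of one multipart message into a single datagram.

    Hypothetical framing: each part is prefixed with a 4-byte big-endian
    length so the receiver can split the datagram back into parts.
    Raises ValueError when the framed message cannot fit in one datagram,
    matching the "64k minus headers is the max message size" constraint.
    """
    framed = b"".join(len(p).to_bytes(4, "big") + p for p in parts)
    if len(framed) > max_size:
        raise ValueError("multipart message exceeds one UDP datagram")
    return framed

def split_parts(datagram):
    """Inverse of coalesce_parts: recover the original list of parts."""
    parts, i = [], 0
    while i < len(datagram):
        n = int.from_bytes(datagram[i:i + 4], "big")
        parts.append(datagram[i + 4:i + 4 + n])
        i += 4 + n
    return parts
```

Because the whole multipart message travels in one datagram, a dropped packet voids the whole message rather than leaving the receiver with a fragment, which is the guarantee cremes is arguing for.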
[23:22] lechon ok, sure
[23:22] cremes we've now come full circle; udp isn't a good choice for transport in 0mq
[23:23] cremes it will likely gain other "special" cases which will make it less suitable
[23:23] mikko 23:14 < cremes> i don't see why you would ever want a timer for sending udp
[23:23] mikko there might be processing in creating the parts
[23:23] cremes one of the great things about 0mq is being able to change transports without modifying your code
[23:23] mikko 1. send part with ZMQ_SNDMORE 2. process for 10 seconds 3. send the last part
[23:23] cremes mikko: tell me how the receiving end should handle missing msg parts
[23:24] mikko cremes: there are no missing parts there
[23:24] mikko cremes: but your problem is the sender
[23:24] lechon if udp drops one part
[23:24] cremes ok, then we're talking about different things
[23:24] mikko that would destroy the sequence
[23:24] cremes why do you want a timer?
[23:24] mikko cremes: if you want to coalesce into packets you need a timer
[23:24] cremes explain why
[23:25] mikko cremes: let's say i push 2KB to zmq_socket
[23:25] mikko there's 62K left in the packet
[23:25] mikko right?
[23:25] cremes with you so far
[23:25] mikko now let's say i do processing for 10 seconds
[23:25] mikko and send 100K after that
[23:25] mikko is the 2KB pending until packet boundary is full?
[23:25] cremes is the 2kb its own message part or is it a piece of a multipart message?
[23:26] mikko either way
[23:26] lechon it should be part of a multipart for the argument to make sense i think.
[23:26] cremes if it is NOT a message part, it should be sent immediately; no reason to coalesce
[23:26] mikko so let's say it's part of multipart message
[23:27] mikko when does the 2K message leave the buffer?
[23:27] cremes then your example doesn't work; the second/last part (100k) exceeds the max size of a single udp packet
[23:27] mikko cremes: my message might be larger than udp packet
[23:27] cremes and i think it's a really bad idea to split 0mq messages over multiple udp packets
[23:28] cremes mikko: if it's too big, pick a different transport; udp is not a good choice
[23:28] mikko this is why i don't see udp as zeromq transport
[23:28] lechon what if the second message was 50K?
[23:28] cremes we agree then
[23:28] mikko it's still problematic
[23:28] mikko because if my second part is 20K
[23:29] mikko when would my first 2K leave the buffer?
[23:29] mikko after the message sequence is complete?
[23:29] cremes yes
[23:29] lechon the idea is to never send multipart parts separately
[23:29] cremes udp transport would be limited to 64k packets; no other transport has a limitation like that
[23:30] mikko cremes: hence im wondering if it's aligned with the existing transports
[23:30] mikko UDT and SCTP seem to be more suitable imho but maybe im not seeing the benefits
[23:31] lechon why not allow all of the udp issues to leak to the user?
[23:32] lechon if things are out of order, users problem.
[23:32] lechon if a piece of the packet drops, users problem
[23:32] lechon etc.
[23:32] mikko lechon: because that's one of the things zeromq2 does for you, abstracts all that away from the user
[23:32] mikko well, not in all cases
[23:33] mikko but the idea is not to care about the network transport
[23:33] cremes exactly
[23:33] mikko rather think about in terms of messages
[23:34] lechon "unreliable messages" could be a way to think about it, but i see how that would not allow interchangeable transports (unless apps were written assuming unreliability)
[23:36] mikko maybe there is some merit to fire-and-forget type messages
[23:36] lechon another option could be to disallow multipart messages with udp
[23:37] mikko your message could still be very large
[23:38] mikko so the limitation of 64K per message + no multipart would require user to worry about transport type quite a lot
[23:38] lechon without multipart, would 64K be that big of a problem?
[23:38] lechon if one packet is dropped then the whole message is dropped
[23:39] mikko hmm
[23:40] mikko i don't know, somehow i don't see it fitting the model
[23:40] mikko but maybe have a chat with sustrik_ as well
[23:40] mikko i need to sleep in any case
[23:40] mikko good night
[23:43] lechon night
[23:56] mikko couldn't sleep
[23:56] mikko lechon: you could also send your ideas to mailing-list
[23:56] mikko for wider feedback