Friday January 14, 2011

[Time] Name Message
[01:51] jugg yrashk, in 2.1, you can migrate a socket between threads, but you can't share the socket simultaneously in multiple threads.
[01:52] jugg and no need to reinitialize.
[06:20] potatodemon Hey Y'all. I want to have a pub/sub queue where there is no data persistence: if someone is not subscribed to the queue, the data is not kept around. Is this easy to do with 0mq?
[06:22] potatodemon Oh this looks like it is the default behavior. word up
[08:28] sustrik yes, it's default
[08:28] sustrik yrashk: no need to re-initialise
[08:29] sustrik ah, jugg answered already :)
[09:30] yrashk jugg: thanks! already moved to two push sockets approach :)
[11:07] CIA-21 zeromq2: Martin Sustrik master * r8eae7d8 / (5 files):
[11:07] CIA-21 zeromq2: 'message distribution mechanism' separated from XPUB socket
[11:07] CIA-21 zeromq2: Signed-off-by: Martin Sustrik <> -
[11:15] mikko good morning
[11:17] sustrik morning
[11:26] CIA-21 zeromq2: Martin Sustrik master * r58c9830 / (src/xsub.cpp src/xsub.hpp):
[11:26] CIA-21 zeromq2: XSUB socket has a subscription distributor
[11:26] CIA-21 zeromq2: Signed-off-by: Martin Sustrik <> -
[11:37] CIA-21 zeromq2: Martin Sustrik master * ra348d94 / (src/xpub.cpp src/xpub.hpp):
[11:37] CIA-21 zeromq2: Fair queueing of subscriptions added to XPUB socket
[11:37] CIA-21 zeromq2: Signed-off-by: Martin Sustrik <> -
[11:40] CIA-21 zeromq2: Martin Sustrik master * r59fa0c9 / AUTHORS :
[11:40] CIA-21 zeromq2: Gerard Toonstra added to the authors file
[11:40] CIA-21 zeromq2: Signed-off-by: Martin Sustrik <> -
[11:50] guido_g howdy
[11:50] guido_g any idea what is the cause of: Assertion failed: pending_bytes == 0 (pgm_receiver.cpp:141)
[11:51] guido_g killing my subscriber randomly
[11:52] guido_g looks like a heisenbug
[12:02] sustrik guido_g: the assert shouldn't be there
[12:02] sustrik imo
[12:02] sustrik please, do report the problem on the mailing list so that steven can get a look at it
[12:17] guido_g ok
[12:30] benoitc mmm how would you authenticate consumers in a PUB/SUB schema ?
[12:31] guido_g over a different connection
[12:32] benoitc i'm not sure i follow, if i set up a pub server
[12:32] benoitc how can i make sure only authenticated consumers can have access to it ?
[12:32] benoitc something in the message ?
[12:33] guido_g ømq does not provide authentication
[12:33] guido_g so you've to do it yourself
[12:33] benoitc i know
[12:33] benoitc i'm looking for ideas
[12:33] guido_g simple solution would be to use a req/rep
[12:34] benoitc right, it would mean a dedicated connection in this case then
[12:34] guido_g for authentication
[12:35] benoitc mmm ?
[12:35] benoitc i don't follow
[12:35] jugg it'd be interesting if pub sockets could accept connections based on connecting sub socket identity.
[12:36] guido_g i doubt that
[12:36] jugg or reject them...
[12:36] guido_g see where amqp went with all these "nice features"
[12:37] jugg socket identity itself isn't a nice feature?
[12:37] guido_g <jugg> it'd be interesting if pub sockets could accept connections based on connecting sub socket identity.
[12:38] guido_g socket-ids are cheap
[12:38] guido_g authentication is not
[12:38] jugg thanks, I didn't realize I wrote that... :)
[12:38] guido_g just to give the context for my point
[12:39] benoitc well it could be just a filter on socket-id
[12:39] benoitc whatever you do with that
[12:39] jugg so, you'd reject subscription forwarding then? It isn't much different.
[12:40] guido_g yes
[12:40] jugg yes you'd reject it, or yes it is different?
[12:40] guido_g a)
[12:41] jugg anyway... the idea is still interesting, likable or not. :)
[12:43] jugg probably could just add 'subscribe' socket option to a pub socket, and only identities that match the pub subscriptions can connect. To greatly overload the concept. :)
[12:44] guido_g how would this work w/ pgm?
[12:44] jugg I'll let someone else solve that :)
[12:45] guido_g i knew that...
[12:45] benoitc i guess i could encrypt messages in the pubserver
[12:45] guido_g sure
[12:45] benoitc so only authorized subscribers could read them
[12:46] guido_g decode
[12:46] benoitc yes
[12:46] guido_g the message itself would still be readable by unauthorized clients
[12:46] benoitc right
[12:56] jugg if it is an issue of sensitive data, then you need to encrypt the connection anyway. I assumed you were solving an issue of limiting connections. So, however you encrypt the connection could also encapsulate the authentication.
[12:57] benoitc well not in a pubsub, i can't say don't send to this subscriber
[12:58] benoitc what i can do on the other hand is to let trusted subscribers decode
[14:21] Steve-o so is anyone trying to write new transports for zmq?
[14:24] mikko Steve-o: sustrik maybe?
[14:24] mikko :)
[14:24] sustrik Steve-o: why so?
[14:25] sustrik Steve-o: btw, the problem reported by guido_g
[14:25] Steve-o just wondering
[14:25] sustrik what the semantics pending_bytes?
[14:25] sustrik of
[14:26] Steve-o zmq batches zmq messages into PGM packets
[14:26] Steve-o the pending_bytes refers to data remaining in the batch not sent to the application
[14:26] Steve-o I believe
[14:27] sustrik as for the transport it would be nice to make adding new ones but atm the interface between the transport and the rest of 0mq is kind of fuzzy
[14:27] sustrik adding new ones easy
[14:30] sustrik guido_g: is the problem reproducible?
[14:30] Steve-o something like an HTTP transport is quite complicated as you need timers and an odd twist of push/pull characteristics unless you throw threads at it
[14:31] Steve-o I haven't seen guido's problem otherwise I would have logged it
[14:34] sustrik it's on the mailing list
[14:34] sustrik you've already replied
[14:35] mikko Steve-o: websocket transport might be a lot easier
[14:36] sustrik afaiu the problem is that if data cannot be sent to the application due to HWM or somesuch (pending_bytes>0)
[14:36] sustrik then pgm_receiver should unregister itself from polling on incoming data (reset_pollin)
[14:36] sustrik if it does so, in_event() should not be invoked
[14:36] sustrik and thus the assert doesn't happen
[14:37] sustrik there's a bug somewhere...
[14:38] sustrik mikko, Steve-o: HTTP, XMPP, SOAP and such are pretty complex, however, protocols like SCTP or UDP seem to make much better fit
[14:39] sustrik UDT i meant
[14:44] Steve-o HTTP is interesting as it's really only the connection process that is different, the rest follows the TCP transport
[14:46] guido_g re
[14:46] guido_g sustrik: as I said, sometimes it happens sometimes not
[14:47] guido_g so there is no way to trigger that on demand
[14:50] guido_g msut be something in the packet because three independent subscribers crashed simultaneously
[14:50] guido_g *must
[14:52] Steve-o I guess you can tell if you had a dummy client that subs on the same transport but does no processing
[14:53] guido_g is in the works atm
[14:53] Steve-o Martin's implication is that the leading messages cause high CPU usage to cause the HWM to be hit
[14:57] Steve-o I presume then it would be straightforward to reproduce with a sleeping client and reasonably expedient publisher
[14:59] guido_g spoc works fine so far
[14:59] guido_g spoc := Simplest POssible Client
[15:01] guido_g other client still dies
[15:03] guido_g what does this error mean?
[15:03] Steve-o as Martin says, it is due to HWM on receiving side to app'
[15:04] Steve-o i.e. slow consumer causes push from PGM to the app' to fail and leave unprocessed data
[15:04] guido_g pardon?
[15:05] Steve-o data is queuing up inside ZMQ for your app'
[15:05] guido_g ok
[15:05] Steve-o when it hits a certain level it pushes back to the transport
[15:05] guido_g also ok and understood
[15:06] Steve-o the transport, PGM, operates in batches of messages, one packet contains one batch
[15:06] guido_g ok
[15:06] Steve-o the queue limit is being reached before a packet batch of messages has completed
[15:07] Steve-o leaving unprocessed bytes (pending_bytes > 0)
[15:07] guido_g now it starts to make sense
[15:07] Steve-o a subsequent event indicating further new incoming data occurs
[15:08] Steve-o then the assertion is hit
[15:08] guido_g so one way to circumvent this problem would be a high HWM on the subscriber?
[15:09] Steve-o well the transport needs to unhook from future in_events until the queue hits the LWM I guess
[15:10] guido_g iow, no easy way to work around that bug
[15:11] Steve-o higher watermark would be a workaround
[15:11] guido_g i'll give it a try
[15:12] guido_g does rate-limiting work on the receiver side?
[15:12] Steve-o send side
[15:13] Steve-o just looking at zmq receiver, it already does unhook some events
[15:14] Steve-o line 222-231,
[15:15] Steve-o maybe then it's the timer event
[15:16] sustrik hm
[15:16] sustrik what i see is:
[15:16] sustrik pending_bytes = received - processed;
[15:16] sustrik ...
[15:16] sustrik reset_pollin (pipe_handle);
[15:16] sustrik reset_pollin (socket_handle);
[15:16] Steve-o slow consumer with waiting packets
[15:16] sustrik and the whole thing is under
[15:16] sustrik if (processed < received) {
[15:17] Steve-o but timer events call in_event()
[15:17] sustrik it looks like there's no guarantee that reset_pollin will be triggered iff pending_bytes>0
[15:17] sustrik the code is quite complex though...
[15:18] Steve-o therefore the timer must skip if the pollin is in reset state
[15:18] sustrik ah. maybe
[15:18] sustrik that should be easy to test
[15:18] Steve-o or simply add a cancel_timer with reset_pollin()
[15:19] sustrik or just add a bool variable 'pollin_active'
[15:19] sustrik and return immediately from in_event if it's false
[15:20] Steve-o canceling the timer would be cleaner
[15:20] sustrik sure
[15:21] Steve-o just copy the section in unplug()
[15:23] Steve-o ++229: if (has_rx_timer) { cancel_timer (rx_timer_id); has_rx_timer = false; }
[15:24] Steve-o although ideally the timer should live until the reset state clears
[15:25] Steve-o if there is no further incoming data but packets remain in the window they will not get flushed, depending on the complex logic determining when in_event fires
[15:27] Steve-o as in, after the queue is cleared
[15:38] Steve-o but it's certainly a condition worthy of note, as without the assertion you have an incredibly high chance of data loss
[15:47] guido_g ok, a very high HWM on the subscriber side does the trick so far
[15:54] sustrik Steve-o: however, I assume the timer should be cancelled nonetheless
[15:54] sustrik right?
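
The failure sequence and the proposed timer cancellation can be sketched with a toy model. This is not the actual pgm_receiver.cpp code, and it assumes the timer really is the culprit, which the log leaves unconfirmed. Only pending_bytes, in_event and has_rx_timer are names from the discussion; toy_receiver_t, cancel_timer_on_stall, pollin_active as a member, arm_timer, timer_event and assert_would_fire are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Toy model of a batching receiver. Not libzmq code: apart from
// pending_bytes, in_event and has_rx_timer, all names are made up.
struct toy_receiver_t
{
    bool cancel_timer_on_stall;   // apply the proposed fix?
    bool pollin_active;           // false after the stand-in for reset_pollin ()
    bool has_rx_timer;
    std::size_t pending_bytes;
    bool assert_would_fire;       // set where 'pending_bytes == 0' would fail

    explicit toy_receiver_t (bool cancel_timer_on_stall_) :
        cancel_timer_on_stall (cancel_timer_on_stall_),
        pollin_active (true),
        has_rx_timer (false),
        pending_bytes (0),
        assert_would_fire (false)
    {
    }

    void arm_timer () { has_rx_timer = true; }

    // One batch of 'received' bytes arrives; the application queue takes
    // only 'accepted' of them before its high watermark is hit.
    void in_event (std::size_t received, std::size_t accepted)
    {
        assert (accepted <= received);
        if (pending_bytes != 0) {
            // The real code asserts here; the model just records it.
            assert_would_fire = true;
            return;
        }
        if (accepted < received) {
            pending_bytes = received - accepted;
            pollin_active = false;          // stand-in for reset_pollin ()
            if (cancel_timer_on_stall)
                has_rx_timer = false;       // stand-in for cancel_timer ()
        }
    }

    // Timer expiry re-enters in_event regardless of the pollin state.
    void timer_event (std::size_t received, std::size_t accepted)
    {
        if (!has_rx_timer)
            return;
        has_rx_timer = false;
        in_event (received, accepted);
    }
};
```

With cancel_timer_on_stall set to false, a stalled batch followed by a timer expiry reaches in_event while pending_bytes is still non-zero. With it set to true, the timer is gone and in_event is never re-entered. As noted in the log, cancelling the timer this way leaves open how the remaining data gets flushed once the queue drains.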
[17:36] mikko sustrik: is PUSH socket supposed to block when HWM is reached even if ZMQ_NOBLOCK is used?
[18:39] sustrik mikko: no
[18:40] sustrik it should return EAGAIN
[18:53] mikko sustrik: ok, i wonder why i'm seeing this behavior
[18:53] mikko let me recheck the code
[18:55] mikko ah, found it
[18:55] sustrik what's going on?
[18:57] mikko launched on debugger and noticed that it's blocking on lingering messages rather than send ()
[18:57] mikko i'm adding a test case for hwm
[18:58] mikko hmm, looks like it behaves differently depending on whether i bind or connect
[19:04] sustrik yes, they are different
[19:04] sustrik when you bind there's no pipe there
[19:04] sustrik so send blocks immediately
[19:05] sustrik with connect, the pipe is created straight away, so you can send messages immediately
[19:05] sustrik i mean, with bind the pipes are created as individual peers connect
[19:08] mikko sustrik: is that the same with inproc?
[19:09] sustrik yes, the same principle
[19:10] mikko is there any difference in HWM handling between transports / socket types (apart from inproc)?
[19:10] sustrik it's an inherent issue, the HWM is per peer
[19:11] sustrik so it doesn't even make sense to talk about HWM until there is a peer connected
[19:11] sustrik it's the same for all transports
[19:12] sustrik although... pgm is kind of different
[19:12] sustrik on the pub side it treats all the subs as a single "connection"
[19:12] sustrik (it's multicast so that's the only way to do it)
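
The bind/connect difference described above can be illustrated with a small model of per-peer pipes. It is a sketch under simplifying assumptions, not the libzmq implementation: toy_pipe_t, toy_push_t, peer_connected and try_send are hypothetical names, a message is just a counter, and picking the first pipe with room stands in for whatever distribution the real socket does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One outbound pipe per peer, each with its own high watermark.
struct toy_pipe_t
{
    std::size_t hwm;
    std::size_t queued;
};

// Toy sending socket. Hypothetical model, not libzmq code.
struct toy_push_t
{
    std::size_t hwm;
    std::vector<toy_pipe_t> pipes;

    explicit toy_push_t (std::size_t hwm_) : hwm (hwm_) { assert (hwm_ > 0); }

    // connect: the pipe exists straight away, before the peer is reachable.
    void connect ()
    {
        toy_pipe_t p = {hwm, 0};
        pipes.push_back (p);
    }

    // bind: no pipe yet; one is created as each peer connects.
    void bind () {}
    void peer_connected ()
    {
        toy_pipe_t p = {hwm, 0};
        pipes.push_back (p);
    }

    // Non-blocking send. Returns false where a real non-blocking send
    // would report EAGAIN (and a blocking send would block).
    bool try_send ()
    {
        for (std::size_t i = 0; i != pipes.size (); ++i) {
            if (pipes [i].queued < pipes [i].hwm) {
                ++pipes [i].queued;
                return true;
            }
        }
        return false;   // no pipe at all, or every pipe is at its HWM
    }
};
```

In this model a bound socket refuses the very first message until peer_connected is called, while a connected socket accepts messages up to its HWM immediately, which matches the behaviour mikko observed.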
[19:28] mikko sustrik: does the socket come out from exceptional state after all pending messages have been consumed?
[22:29] sustrik mikko: no, it comes out after LWM is reached
[22:43] mikko sustrik: noticed
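
The HWM/LWM behaviour sustrik describes is a hysteresis: once the high watermark is hit, the pipe stays unwritable until the queue has drained down to the low watermark, not merely below the high one. A minimal sketch follows, with hypothetical names (toy_watermark_pipe_t, try_write, read) and caller-supplied thresholds; how libzmq picks its actual LWM is not covered in the log and is not modelled here.

```cpp
#include <cassert>
#include <cstddef>

// Toy queue with high/low watermark hysteresis. Not libzmq code.
struct toy_watermark_pipe_t
{
    std::size_t hwm;
    std::size_t lwm;
    std::size_t queued;
    bool writable;

    toy_watermark_pipe_t (std::size_t hwm_, std::size_t lwm_) :
        hwm (hwm_), lwm (lwm_), queued (0), writable (true)
    {
        assert (lwm_ < hwm_);
    }

    // Returns false while the pipe is in the "exceptional" (full) state.
    bool try_write ()
    {
        if (!writable)
            return false;
        ++queued;
        if (queued >= hwm)
            writable = false;   // high watermark reached
        return true;
    }

    // Consumer takes one message; writing resumes only at the LWM.
    void read ()
    {
        if (queued == 0)
            return;
        --queued;
        if (!writable && queued <= lwm)
            writable = true;
    }
};
```

With hwm = 4 and lwm = 2, the fifth write fails, one read is not enough to unblock the writer, and the second read (queue length 2) makes the pipe writable again.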
[22:54] mikko sustrik:
[22:54] mikko shouldn't the SUB recv the second message?
[22:54] mikko or am i being silly again?