ZeroMq IRC Log

Friday January 14, 2011

[Time] Name	Message
[01:51] jugg	yrashk, in 2.1, you can migrate a socket between threads, but you can't share the socket simultaneously in multiple threads.
[01:52] jugg	and no need to reinitialize.
[06:20] potatodemon	Hey Y'all. I want to have a pub/sub queue. Where these is no data persistance, if someone is not subscribed to the queue the data is not kept around. Is this easy to do with 0mq?
[06:22] potatodemon	Oh this looks like it is the default behavior. word up
[08:28] sustrik	yes, it's default
[08:28] sustrik	yeashk: no need to re-initialise
[08:29] sustrik	ah, jugg answered already :)
[09:30] yrashk	jugg: thanks! already moved to two push sockets approach :)
[11:07] CIA-21	zeromq2: 03Martin Sustrik 07master * r8eae7d8 10/ (5 files):
[11:07] CIA-21	zeromq2: 'message distribution mechanism' separated from XPUB socket
[11:07] CIA-21	zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/eC63Tc
[11:15] mikko	good morning
[11:17] sustrik	morning
[11:26] CIA-21	zeromq2: 03Martin Sustrik 07master * r58c9830 10/ (src/xsub.cpp src/xsub.hpp):
[11:26] CIA-21	zeromq2: XSUB socket has a subscription distributor
[11:26] CIA-21	zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/fifGK7
[11:37] CIA-21	zeromq2: 03Martin Sustrik 07master * ra348d94 10/ (src/xpub.cpp src/xpub.hpp):
[11:37] CIA-21	zeromq2: Fair queueing of subscriptions added to XPUB socket
[11:37] CIA-21	zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/gZWsdO
[11:40] CIA-21	zeromq2: 03Martin Sustrik 07master * r59fa0c9 10/ AUTHORS :
[11:40] CIA-21	zeromq2: Gerard Toonstra added to the authors file
[11:40] CIA-21	zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/fNsKW7
[11:50] guido_g	howdy
[11:50] guido_g	any idea what is the cause of: Assertion failed: pending_bytes == 0 (pgm_receiver.cpp:141)
[11:51] guido_g	killing my subscriber randomly
[11:52] guido_g	looks like a heisen bug
[12:02] sustrik	guido_g: the assert shouldn't be there
[12:02] sustrik	imo
[12:02] sustrik	please, do report the problem on the mailing list so that steven can get a look at it
[12:17] guido_g	ok
[12:30] benoitc	mmm how would you authenticate consumers in a PUB/SUB schema ?
[12:31] guido_g	over a different connection
[12:32] benoitc	i'm not sur eto follow, if i setup a pub server
[12:32] benoitc	how can i make sure only authentciated consumer can have access to it ?
[12:32] benoitc	something in the message ?
[12:33] guido_g	Ã¸mq does not provide authentication
[12:33] guido_g	so you've to do it yourself
[12:33] benoitc	i know
[12:33] benoitc	i'm looking for ideas
[12:33] guido_g	simple solution would be to use a req/rep
[12:34] benoitc	right, it would mean a dedicaced connection in this case then
[12:34] guido_g	for authentication
[12:35] benoitc	mmm ?
[12:35] benoitc	i don't follow
[12:35] jugg	it'd be interesting if pub sockets could accept connections based on connecting sub socket identity.
[12:36] guido_g	i doubt that
[12:36] jugg	or reject them...
[12:36] guido_g	see where amqp went with all these "nice features"
[12:37] jugg	socket identity itself isn't a nice feature?
[12:37] guido_g	<jugg> it'd be interesting if pub sockets could accept connections based on connecting sub socket identity.
[12:38] guido_g	socket-ids are cheap
[12:38] guido_g	authentication is not
[12:38] jugg	thanks, I didn't realize I wrote that... :)
[12:38] guido_g	just to give the context for my point
[12:39] benoitc	well it could be just a filter on socket-id
[12:39] benoitc	whatever you do with that
[12:39] jugg	so, you'd reject subscription forwarding then? It isn't much different.
[12:40] guido_g	yes
[12:40] jugg	yes you'd reject it, or yes it is different?
[12:40] guido_g	a)
[12:41] jugg	anyway... the idea is still interesting, likable or not. :)
[12:43] jugg	probably could just add 'subscribe' socket option to a pub socket, and only identities that match the pub subscriptions can connect. Too greatly overload the concept. :)
[12:44] guido_g	how would this work w/ pgm?
[12:44] jugg	I'll let someone else solve that :)
[12:45] guido_g	i knew that...
[12:45] benoitc	i guess i could encrypt messages in the pubserver
[12:45] guido_g	sure
[12:45] benoitc	so only authorized suscribers could read them
[12:46] guido_g	decode
[12:46] benoitc	yes
[12:46] guido_g	the message itself would still be readable by not autorized clients
[12:46] benoitc	right
[12:56] jugg	if it is an issue of sensitive data, then you need to encrypt the connection anyway. I assumed you were solving an issue of limiting connections. So, however you encrypt the connection could also encapsulate the authentication.
[12:57] benoitc	well not in a pubsub, i can't say don't send to this suscriber
[12:58] benoitc	what i can do on the other hand is to let trusted suscribers to decode
[14:21] Steve-o	so is anyone trying to write new transports for zmq?
[14:24] mikko	Steve-o: sustrik maybe?
[14:24] mikko	:)
[14:24] sustrik	Steve-o: why so?
[14:25] sustrik	Steve-o: btw, the problem reported by guido_g
[14:25] Steve-o	just wondering
[14:25] sustrik	what the semantics pending_bytes?
[14:25] sustrik	of
[14:26] Steve-o	zmq batches zmq messages into PGM packets
[14:26] Steve-o	the pending_bytes refers to data remaining in the batch not sent to the application
[14:26] Steve-o	I believe
[14:27] sustrik	as for the transport it would be nice to make adding new ones but atm the interface between the trasnport and the rest of 0mq is kind of fuzzy
[14:27] sustrik	adding new ones easy
[14:30] sustrik	guido_g: is the problem reproducible?
[14:30] Steve-o	something like a HTTP transport is quite complicated as you need timers and odd twist of push/pull characteristics unless you throw threads at it
[14:31] Steve-o	I haven't seen guido's problem otherwise I would have logged it
[14:34] sustrik	it's on the mailing list
[14:34] sustrik	you've already replied
[14:35] mikko	Steve-o: websocket transport might be a lot easier
[14:36] sustrik	afaiu the problem is that if data cannot be sent to the application due to HWM or somesuch (pending_bytes>0)
[14:36] sustrik	then pgm_receiver should unregister itself from polling on incoming data (reset_pollin)
[14:36] sustrik	if it does so, in_event() should not be invoked
[14:36] sustrik	and thus the assert doesn't happen
[14:37] sustrik	there's a bug somewhere...
[14:38] sustrik	mikko, Steve-o: HTTP, XMPP, SOAP and such are pretty complex, however, protocols like SCTP or UDP seem to make much better fit
[14:39] sustrik	UDT i meant
[14:44] Steve-o	HTTP is interesting as its really only the connection process that is different, the rest follows the TCP transport
[14:46] guido_g	re
[14:46] guido_g	sustrik: as I said, sometimes it happens sometimes not
[14:47] guido_g	so there is no way to trigger that on demand
[14:50] guido_g	msut be something in the packet because three independent subscribers crashed simultaniously
[14:50] guido_g	*must
[14:52] Steve-o	I guess you can tell if you had a dummy client that subs on the same transport but does no processing
[14:53] guido_g	is in the working atm
[14:53] Steve-o	Martin's implication is that the leading messages cause high CPU usage to cause the HWM to be hit
[14:57] Steve-o	I presume then it would be straight forward to reproduce with a sleeping client and reasonably expedient publisher
[14:59] guido_g	spoc works fine so far
[14:59] guido_g	spoc := Simplest POssibile Client
[15:01] guido_g	other client still dies
[15:03] guido_g	what does this error mean?
[15:03] Steve-o	as Martin says, it is due to HWM on receiving side to app'
[15:04] Steve-o	i.e. slow consumer causes push from PGM to the app' the fail and leave unprocessed data
[15:04] guido_g	pardon?
[15:05] Steve-o	data is queuing up inside ZMQ for you app'
[15:05] guido_g	ok
[15:05] Steve-o	when it hits a certain level it pushes back to the transport
[15:05] guido_g	also ok and understood
[15:06] Steve-o	the transport, PGM, operates in batches of messages, one packet contains one batch
[15:06] guido_g	ok
[15:06] Steve-o	the queue limit is being reached before a packet batch of messages has completed
[15:07] Steve-o	leaving unprocessed bytes (pending_bytes > 0)
[15:07] guido_g	now it starts to make sense
[15:07] Steve-o	a subsequent event indicating further new incoming data occurs
[15:08] Steve-o	then the assertion is hit
[15:08] guido_g	so one way to circumvent this problem would be a high HWM on the subscriber?
[15:09] Steve-o	well the transport needs to unhook from future in_events until the queue hits the LWM I guess
[15:10] guido_g	iow, no easy way work around that bug
[15:11] Steve-o	higher watermark would be a workaround
[15:11] guido_g	i'll give it a try
[15:12] guido_g	does rate-limiting work on the receiver side?
[15:12] Steve-o	send side
[15:13] Steve-o	just looking at zmq receiver, it already does unhook some events
[15:14] Steve-o	line 222-231,
[15:15] Steve-o	maybe then its the timer event
[15:16] sustrik	hm
[15:16] sustrik	what is see is:
[15:16] sustrik	pending_bytes = received - processed;
[15:16] sustrik	...
[15:16] sustrik	reset_pollin (pipe_handle);
[15:16] sustrik	reset_pollin (socket_handle);
[15:16] Steve-o	slow consumer with waiting packets
[15:16] sustrik	and the whole thing is under
[15:16] sustrik	if (processed < received) {
[15:17] Steve-o	but timer events call in_event()
[15:17] sustrik	it looks like there's no guarantee that reset_pollin will be triggered iff pending_bytes>0
[15:17] sustrik	the code is quite complex though...
[15:18] Steve-o	therefore the timer must skip if the pollin is in reset state
[15:18] sustrik	ah. maybe
[15:18] sustrik	that should be easy to test
[15:18] Steve-o	or simply add a cancel_timer with reset_pollin()
[15:19] sustrik	or just add a bool variable 'pollin_active'
[15:19] sustrik	and return immediately from in_event if it's false
[15:20] Steve-o	canceling the timer would be cleaner
[15:20] sustrik	sure
[15:21] Steve-o	just copy the section in unplug()
[15:23] Steve-o	++229: if (has_rx_timer) { cancel_timer (rx_timer_id); has_rx_timer = false; }
[15:24] Steve-o	although ideally the timer should live until the reset state clears
[15:25] Steve-o	if there is no further incoming data but packets remain in the window they will not get flushed., depending the complex logic determing when in_event fires
[15:27] Steve-o	as in, after the queue is cleared
[15:38] Steve-o	but it's certainly a condition worthy of note, as without the assertion you have an incredibly high chance of data loss
[15:47] guido_g	ok, a very high HWM on the subscriber side does the trick so far
[15:54] sustrik	Steve-o: however, I assume the timer should be cancelled nontheless
[15:54] sustrik	right?
[17:36] mikko	sustrik: is PUSH socket supposed to block when HWM is reached even if ZMQ_NOBLOCK is used?
[18:39] sustrik	mikko: no
[18:40] sustrik	it should return EAGAIN
[18:53] mikko	sustrik: ok, i wonder why i'm seeing this behavior
[18:53] mikko	let me recheck the code
[18:55] mikko	ah, found it
[18:55] sustrik	what's going on?
[18:57] mikko	launched on debugger and noticed that it's blocking on lingering messages rather than send ()
[18:57] mikko	i'm adding a test case for hwm
[18:58] mikko	hmm, looks like it depends differently depending on whether i bind or connect
[19:04] sustrik	yes, they are different
[19:04] sustrik	when you bind there's no pipe there
[19:04] sustrik	so send blocks immediately
[19:05] sustrik	with connect, the pipe is created straight away, so you can send messages immediately
[19:05] sustrik	i mean, with bind the pipes are created as individual peers connect
[19:08] mikko	sustrik: is that the same with inproc?
[19:09] sustrik	yes, the same principle
[19:10] mikko	is there any difference between HWM handling between transports / socket types (apart from inproc)
[19:10] sustrik	it's an inherent issue, the HWM is per peer
[19:11] sustrik	so it doesn't even make sense to talk about HWM until there is no peer connected
[19:11] sustrik	it's the same for all transports
[19:12] sustrik	although... pgm is kind of different
[19:12] sustrik	on the pub side it treats all the subs as a single "connection"
[19:12] sustrik	(it's multicast so that's the only way to do it)
[19:28] mikko	sustrik: does the socket come out from exceptional state after all pending messages have been consumed?
[22:29] sustrik	mikko: no, it comes out after LWM is reached
[22:43] mikko	sustrik: noticed
[22:54] mikko	sustrik: https://gist.github.com/cd0cd7d8407c30456c23
[22:54] mikko	shouldn't the SUB recv the second message?
[22:54] mikko	or am i being silly again?