ZeroMQ IRC Log

Sunday June 19, 2011

[Time] Name	Message
[07:27] CIA-32	libzmq: Martin Sustrik master * r4b60023 / (5 files in 2 dirs): Merge branch 'master' of github.com:zeromq/libzmq - http://bit.ly/iRCaAT
[07:27] CIA-32	libzmq: Martin Sustrik master * r5b77a41 / (perf/inproc_thr.cpp perf/local_thr.cpp perf/remote_thr.cpp): Throughput tests fixed. ...
[07:34] CIA-32	libzmq: Martin Sustrik master * r6052709 / src/tcp_connecter.cpp : ENETDOWN is a legal error from TCP connect ...
[09:18] CIA-32	libzmq: Martin Sustrik master * r00dc024 / src/pipe.cpp : Race condition in pipe_t fixed. ...
[10:26] fredix	hi
[10:27] fredix	I can't find an example with push/pull and load balancing
[10:37] mikko	fredix: the load balancing happens automatically between connected peers
[10:37] mikko	round-robin
[10:41] fredix	I try https://github.com/imatix/zguide/blob/master/examples/C++/taskwork.cpp with taskvent
[10:41] fredix	and 2 taskwork receive the message
[10:41] mikko	fredix: have you read the guide?
[10:41] mikko	there are a lot of load-balancing examples
[10:41] fredix	yes
[10:50] CIA-32	libzmq: Martin Sustrik master * r9f4d376 / src/session.cpp : Session termination error fixed ...
[10:54] fredix	ok
[10:54] fredix	taskvent load-balances between workers
[10:56] fredix	but it's really strange that pub/sub cannot load balance between subscribers on the same filter
[11:37] sustrik	why should it?
[11:38] sustrik	pub/sub is for data distribution
[11:38] sustrik	not for load balancing
[11:38] sustrik	you can place a node in the middle that would subscribe for messages from upstream and load balance them to the downstream
[11:38] sustrik	if that's what you want
[12:56] fredix	sustrik, if you see my architecture maybe you will understand what I want : http://www.nodecast.net/images/architecture.png
[12:57] sustrik	hm, not really
[12:57] sustrik	anyway
[12:57] fredix	sustrik, :)
[12:57] sustrik	do you want to distribute or load-balance?
[12:57] fredix	sustrik, the dispatcher sends a message on each channel
[12:58] sustrik	that's message distribution, ie. PUB/SUB
[12:58] fredix	sustrik, I can have many workers listening on a channel, but only one receives a message
[12:59] sustrik	so you have two-layered architecture
[12:59] fredix	sustrik, but with zeromq pub/sub all workers receive the same message
[12:59] sustrik	dispatcher sends messages to load-balancers
[12:59] sustrik	load-balancers load-balance them among workers
[13:00] fredix	mmm
[13:01] sustrik	each load-balancer instance represents what you call a "channel"
[13:03] fredix	can the dispatcher and load balancer be in the same process ?
[13:04] sustrik	sure
[13:04] sustrik	that's what inproc transport is for
[13:04] fredix	mmm
[13:04] fredix	really powerful
[13:04] fredix	but complicated :)
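
The two-layer design sustrik describes above (the dispatcher distributes each message to every "channel", and each channel load-balances round-robin among its own workers) can be modelled in a few lines. This is a toy sketch in plain Python, not ZeroMQ code: `Channel`, `dispatch`, and the worker and channel names are hypothetical, and in-memory lists stand in for sockets.

```python
from itertools import cycle


class Channel:
    """Toy stand-in for one load-balancer node (one "channel").

    Hypothetical helper, not a ZeroMQ class: it hands each incoming
    message to exactly one of its workers, in round-robin order.
    """

    def __init__(self, workers):
        self.workers = list(workers)
        self._next_worker = cycle(self.workers)

    def deliver(self, message, inboxes):
        worker = next(self._next_worker)
        inboxes[worker].append(message)
        return worker


def dispatch(message, channels, inboxes):
    """Distribution step: every channel gets its own copy of the message."""
    return {name: ch.deliver(message, inboxes) for name, ch in channels.items()}


# Channel "a" has two workers, channel "b" has one (names are made up).
channels = {"a": Channel(["a1", "a2"]), "b": Channel(["b1"])}
inboxes = {w: [] for ch in channels.values() for w in ch.workers}

for msg in ["m1", "m2", "m3"]:
    dispatch(msg, channels, inboxes)

for worker, received in inboxes.items():
    print(worker, received)
```

Every channel sees all three messages, but within channel "a" each message reaches only one of the two workers. That is the behaviour fredix asks for: distribution across channels, load balancing within a channel.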
[14:40] CIA-32	libzmq: Martin Sustrik master * red680a3 / doc/zmq_socket.txt : Documentation for XPUB and XSUB socket added ...
[17:09] CIA-32	libzmq: Martin Sustrik master * r082f8e1 / src/mailbox.cpp : Mailbox timeouts fixed on Windows ...
[18:42] pieterh	sustrik: ping
[18:47] sustrik	pieterh: pong
[18:47] pieterh	hi martin, random question about tcp transport
[18:47] sustrik	yes?
[18:47] pieterh	I don't see any writev's
[18:47] sustrik	there are none
[18:47] pieterh	how are you batching writes...?
[18:48] sustrik	small messages are copied to a buffer and sent in a single go
[18:48] pieterh	ah, that makes sense
[18:48] sustrik	large messages are sent in place
[18:48] pieterh	any reason for not doing writev?
[18:48] sustrik	too much overhead
[18:48] sustrik	it's cheaper to copy small messages to a buffer
[18:48] sustrik	than constructing iovecs
[18:49] pieterh	I recall from openamq, for certain message flows it was much faster
[18:49] pieterh	but indeed, copying to a single buffer was even faster
[18:49] sustrik	we've measured it thoroughly back in 2008
[18:50] sustrik	writev seemed to give no improvement
[18:50] pieterh	do you have any documentation on the different optimizations that you studied at the time?
[18:50] pieterh	writev is definitely faster than multiple writes, for small messages
[18:50] sustrik	some of them, in no way all of them
[18:51] sustrik	we've tested several options a day
[18:51] sustrik	for several months
[18:52] pieterh	ok, np, was just curious (am making TCP VTX driver now)
[18:52] sustrik	the rule of thumb is
[18:52] sustrik	copy small messages, process large messages in place
[18:52] sustrik	that's the recurring pattern in HPC
[18:52] pieterh	right
[18:53] pieterh	clearly, given cycles to copy vs. cycles to process
[18:53] sustrik	exactly
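
The rule of thumb above (copy small messages into one buffer and send it in a single go, send large messages in place) can be sketched as follows. This is a minimal illustration, not libzmq's actual encoder: the function name `plan_writes`, the `small_limit` and `buffer_size` thresholds, and the absence of any wire framing are all assumptions made for the example.

```python
def plan_writes(messages, small_limit=256, buffer_size=8192):
    """Return the sequence of chunks that would each be passed to one send().

    Hypothetical sketch: messages of at most `small_limit` bytes are copied
    into a shared batch buffer; larger ones are referenced in place through
    a memoryview, so their payload is never copied.
    Assumes small_limit <= buffer_size.
    """
    writes = []
    batch = bytearray()

    def flush():
        if batch:
            writes.append(bytes(batch))
            batch.clear()

    for msg in messages:
        if len(msg) <= small_limit:
            # Small message: copying is cheap, so batch it.
            if len(batch) + len(msg) > buffer_size:
                flush()
            batch.extend(msg)
        else:
            # Large message: flush pending small ones first to keep
            # ordering, then send the payload without copying it.
            flush()
            writes.append(memoryview(msg))
    flush()
    return writes


messages = [b"a" * 10, b"b" * 20, b"X" * 1000, b"c" * 5]
writes = plan_writes(messages, small_limit=64, buffer_size=128)
print([len(w) for w in writes])
```

Here the two leading small messages are coalesced into one 30-byte write, the 1000-byte message goes out as its own zero-copy write, and the trailing small message forms a third write.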
[18:53] pieterh	regarding that 'separation' thread on email
[18:54] pieterh	I actually do have a kind of API between transport and socket patterns, emerging
[18:54] sustrik	goodo
[18:54] sustrik	i would like to have it there
[18:54] sustrik	but it's terribly complex atm
[18:54] pieterh	it's nothing at all like the 0MQ API, but it might form a basis for a future framework
[18:54] pieterh	it's complex, yes
[18:55] pieterh	it'll get simpler as I make more drivers
[18:55] sustrik	TCP is the most problematic one
[18:55] pieterh	why do you think that?
[18:55] sustrik	async connects, disconnects, reconnects
[18:56] pieterh	ah, like that
[18:56] pieterh	yes, UDP is very similar since I have a peering layer on top
[18:56] sustrik	too much async stuff happening at once
[18:56] pieterh	it's helpful to use a reactor model but I've not measured that performance yet
[18:57] sustrik	when identities get into the mix it becomes insanely complex
[18:57] pieterh	indeed
[18:57] sustrik	reactor = event driven?
[18:57] pieterh	yes
[18:57] sustrik	yes, i use that internally
[18:57] pieterh	it lets one handle the async pieces neatly
[18:58] pieterh	it's still complex, IMO this will take two years at least to get into shape
[18:58] sustrik	presumably, no idea
[18:58] pieterh	with sufficient performance to be useful
[18:58] pieterh	if performance isn't a criterion, much faster of course
[18:59] sustrik	yes, without performance constraints it's a piece of cake
[18:59] pieterh	yes
[18:59] pieterh	well, I'm still doing it properly, just not optimized in any way
[18:59] pieterh	async connects etc.
[18:59] sustrik	great
[19:00] sustrik	the API would be the most important outcome
[19:00] sustrik	if you manage to get it done
[19:00] pieterh	I
[19:00] pieterh	I've abstracted the routing, that's simplest
[19:00] pieterh	not done exception handling yet
[19:01] sustrik	i would say the connection creation/teardown is the most scary stuff
[19:01] pieterh	hmm, that's ok, so far
[19:01] pieterh	I use a peering concept
[19:02] sustrik	the problems look like this for example:
[19:02] sustrik	imagine there's a TCP listener object listening on a port
[19:02] pieterh	sure
[19:02] sustrik	it gets a new connection
[19:02] sustrik	it creates a connection object
[19:02] pieterh	right
[19:03] sustrik	that runs in async way, maybe in a different thread
[19:03] sustrik	the connection object reads the identity
[19:03] pieterh	ah, I'm cheating :-)
[19:03] sustrik	finds the associated session
[19:03] sustrik	(if one exists, if it does not it creates one)
[19:03] pieterh	right
[19:04] sustrik	the session may be running in a 3rd thread
[19:04] sustrik	so it has to migrate itself to a different thread
[19:04] pieterh	oh, I'm definitely cheating :-)
[19:04] sustrik	etc.
[19:04] pieterh	since 90% of use cases only need one thread for I/O
[19:04] sustrik	now imagine all the possible combinations that may happen during this process
[19:04] pieterh	and I can create multiple I/O threads by creating multiple driver instances
[19:04] pieterh	I do all this in a single thread
[19:05] sustrik	the problem is that you don't know in advance in which thread you are going to run
[19:05] sustrik	you have to read identity first
[19:05] pieterh	like I said, I'm cheating, I'm using exactly one thread
[19:05] pieterh	per driver instance
[19:05] sustrik	ok
[19:05] sustrik	it'll get more complex once you start making it multi-threaded
[19:06] pieterh	that would be, if you want multiple threads, you create multiple instances of a driver
[19:06] pieterh	tcp1://, tcp2:// etc.
[19:06] pieterh	it's only needed for extreme use cases
[19:06] pieterh	I didn't see any reason to make multithreaded drivers
[19:06] sustrik	sure, the problem is that the session persists between subsequent tcp connections
[19:06] pieterh	indeed
[19:06] sustrik	thus you have to migrate the driver
[19:06] pieterh	so more cheating: no identities
[19:06] sustrik	to the right thread
[19:07] pieterh	this is a good opportunity to experiment with "no identities" as a simplification
[19:07] sustrik	i've asked for that
[19:07] sustrik	everyone was like "no way"
[19:07] pieterh	no durable sockets, in any case
[19:07] pieterh	well, I like the simplification
[19:08] sustrik	so do i
[19:08] sustrik	but
[19:08] sustrik	backward compatibility
[19:08] pieterh	but like any change it has to be justified with pros and cons
[19:08] sustrik	not pissing off users
[19:08] sustrik	etc.
[19:08] pieterh	it's just a matter of demonstrating why it's worth it
[19:08] pieterh	let me give you an example...
[19:09] sustrik	tell that to someone using identities in production :|
[19:09] pieterh	"here is a secure SASL TCP transport for #zeromq"
[19:09] pieterh	"hey, cool!"
[19:09] pieterh	"btw, it doesn't do explicit identities"
[19:09] pieterh	"who the heck cares, GIMME!"
[19:09] pieterh	eventually people will migrate use cases away from explicit identities
[19:10] pieterh	and then we can deprecate that, and then eventually remove it
[19:10] pieterh	durable sockets break driver single-threadedness
[19:10] pieterh	therefore VTX won't support them
[19:12] pieterh	We do need to gather use cases for explicit identities and see how otherwise to solve them
[19:12] pieterh	IMO it's like SWAP, it should be done at the device/broker level
[19:12] pieterh	we already do persistence decently at a higher level
[19:26] sustrik	sorry, got disconnected
[19:26] sustrik	the problem is that the infrastructure is generic
[19:26] sustrik	so even with identity-less SASL transport
[19:26] sustrik	you would have to maintain identities because of TCP transport
[19:26] sustrik	btw, subscription forwarding looks to be working ok
[19:26] sustrik	i would say it's time to start thinking about polishing 3.0 and releasing it
[19:37] jurica	what happens when a worker dies that still has tasks in its queue? are those tasks lost?
[19:38] sustrik	yes